Faster HashMap implementation (original) (raw)

Robert DiFalco rdifalco at tripwire.com
Sat Jun 13 16:45:55 UTC 2009

Previous message: Faster HashMap implementation
Next message: Faster HashMap implementation
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Sorry for inserting myself into this thread but this has been an interesting conversation filled with a lot of trade-offs and optimization techniques for edge cases. It occurred to me that one of the problems is that the traditional HashMap implementation relies on inheritance rather than composition. For example, if the hash map representation where abstracted out from the interface then you could by default construct a HashMap using the existing approach but alternatively construct it with another approach. If you further abstracted the hashing strategy into its own class then you would have even more flexibility composing a HashMap. Since maps are so often resized you could even have a dynamic map that starts out with open addressing but keeps count of collisions and on resize changes to a more optimal scheme for the current data.

This allows you to have a decent default HashMap but allows clients to have an optimal hash map if they know they will allows have Long or String or whatever keys versus doubles, floats, etc.

With the hashing strategy, an identity map becomes as easy as mixing in an identity hashing strategy during composition.

Anyway, just a thought, I'm sure it's been considered before but I just thought I'd throw it out there.

-----Original Message----- From: core-libs-dev-bounces at openjdk.java.net [mailto:core-libs-dev-bounces at openjdk.java.net] On Behalf Of Doug Lea Sent: Saturday, June 13, 2009 5:31 AM To: Ismael Juma Cc: core-libs-dev at openjdk.java.net Subject: Re: Faster HashMap implementation

Ismael Juma wrote:

Out of curiosity, do you know if any tests have been done with an open addressing scheme similar to Python dictionaries[1] in Java? Near the top, there's a comment explaining how it works and the motivation.

More or less. This is roughly similar to the schemes I mentioned a few days ago, of first direct-addressing. The python approach (as noted in its comments) has most trouble with cases that occur too often in Java to ignore: Poor entropy in low bits (the hashCodes for commonly used Floats and Doubles are especially problematic (Aside: you wouldn't think these would ever be used keys, but they are)) and much-greater-than-expected duplicate codes (mainly the particular value zero). However, these do seem amenable to one of the hybrid approaches I mentioned; I'll try to get a full version together soon.

-Doug

Previous message: Faster HashMap implementation
Next message: Faster HashMap implementation
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the core-libs-dev mailing list