Faster HashMap implementation
Doug Lea dl at cs.oswego.edu
Mon Jun 8 16:07:05 UTC 2009
- Previous message: Faster HashMap implementation
- Next message: Faster HashMap implementation
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
A few notes on your implementation...
Alex Yakovlev wrote:
> The main trick is storing all data in 2 arrays, without any HashEntry objects. One is an array of Objects holding the keys and values; the other is a more complex array of indices.
Notice that footprint comparisons with the current HashMap are load-dependent. Let C be the capacity and N the number of items, and assume 32-bit pointers. Yours requires about 4C words of storage versus 1C + 5N for HashMap:
- 4C = key + value + main part of index + overflow part of index (a little less with loadFactor < 1; see below)
- 5N = 4 fields per entry plus 1 word of Object header
There is a crossover point past which yours uses less memory, which will often be reached when a map is grown using the usual resizing thresholds. Plus, GC should be faster in yours. However, statistically, most HashMaps have very few elements. I once instrumented some of the "reference workload" usages and found that over half of the HashMaps/Hashtables held at most 3 elements during their lifetimes. So on average, using your version will likely increase application footprints. It seems possible to deal with this, though. For example, you might consider separately allocating the back half of the index array only when needed.
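The crossover arithmetic above can be made concrete with a small worked sketch (the class and method names here are illustrative, not from either implementation; word counts follow the 4C vs. 1C + 5N accounting with 32-bit pointers):

```java
// Rough footprint estimates, in 4-byte words, following the accounting
// in the discussion above. Helper names are illustrative only.
public class FootprintEstimate {
    // Two-array scheme: keys+values (2C refs) plus index arrays (~2C ints).
    static long twoArrayWords(long capacity) {
        return 4 * capacity;
    }

    // Classic HashMap: table of C refs plus 5 words per entry
    // (key, value, hash, next, and one word of Object header).
    static long hashMapWords(long capacity, long size) {
        return capacity + 5 * size;
    }

    public static void main(String[] args) {
        long c = 16;
        for (long n = 0; n <= c; n++) {
            System.out.printf("N=%2d  two-array=%d  HashMap=%d%n",
                    n, twoArrayWords(c), hashMapWords(c, n));
        }
        // Crossover: 4C < C + 5N  <=>  N > 3C/5, i.e. about 60% load.
    }
}
```

Solving 4C < C + 5N gives N > 3C/5, which is why near-threshold (well-loaded) maps favor the two-array scheme while sparsely filled ones do not.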
The unloaded overhead is worse for your Linked version, although that is probably less of a concern.
It is very difficult to empirically compare footprints in the most interesting cases, of using a bunch of HashMaps of various sizes, in part because GC may or may not actually collect all collectable garbage.
> Now I just force the load factor to 1 if the argument is greater than 1.
The effects of loadFactors are pretty much incommensurate across versions. Because you trade element-space for index-space, varying the loadFactor has little bearing on total space (but still a little, since an index costs half as much as a key+value pair). While it remains approximately true in yours that a loadFactor < 1 bounds the average number of traversals to find an existing key near the resize threshold, probably some more work/thought is needed to make a better story about how load factors are interpreted internally.
> When looking for a missing key, my implementation can be twice as fast in some situations.
Right. In the tests I ran, yours is generally faster for failed lookups. It is also sometimes faster for successful lookups for some key types, distributions, and/or machines I test on. (For some reason, more often better on AMDs than Intels or Sparcs).
It strikes me that there might be a bigger win available here by using some hybrid strategy based on IdentityHashMap-like single-indirect (i.e., usually at most a single cache miss) for collision-free accesses but using a variant of your clever packed-indices under collisions. I might try this out someday.
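The IdentityHashMap-style layout mentioned above can be sketched as follows. This is only an illustration of the single-indirect idea (one array access, hence usually at most one cache miss, for a collision-free hit), not the proposed hybrid; all names are hypothetical, and for brevity there is no resizing, so the table is assumed never full:

```java
// Sketch of IdentityHashMap-style storage: keys at even slots, values at
// odd slots of one Object[], probed linearly. A collision-free lookup
// touches a single array region. Illustrative only: no resizing, no
// null-key support, and the table must never fill up completely.
public class SingleIndirect<K, V> {
    private final Object[] table;  // table[2i] = key, table[2i+1] = value
    private final int mask;        // capacity in pairs, a power of two

    SingleIndirect(int capacityPairs) {
        table = new Object[capacityPairs * 2];
        mask = capacityPairs - 1;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int i = (key.hashCode() & mask) * 2;
        for (;;) {                        // linear probe on collision
            Object k = table[i];
            if (k == null) return null;   // empty slot: key is absent
            if (k.equals(key)) return (V) table[i + 1];
            i = (i + 2) & (table.length - 1);  // wrap around the table
        }
    }

    void put(K key, V value) {
        int i = (key.hashCode() & mask) * 2;
        while (table[i] != null && !table[i].equals(key))
            i = (i + 2) & (table.length - 1);
        table[i] = key;
        table[i + 1] = value;
    }
}
```

In the hybrid Doug speculates about, the probe loop would instead hand off to the packed-index scheme once a collision is detected.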
> Iteration over the elements is a sequential array read, which is faster. To clone such a HashMap we only need to clone 2 arrays, which is also much faster.
I don't understand why you don't simply traverse the element array for HashMap iterators? You cannot do this for your linked version, but it would not hurt to make completely different iterator types.
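Traversing the element array directly, as suggested above, might look like the following sketch (the flat key/value layout and class name are assumptions for illustration, with null keys marking empty slots):

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

// Sketch of a sequential iterator over a packed element array laid out
// as [k0, v0, k1, v1, ...], where a null key marks an empty slot.
// Hypothetical layout; shown only to illustrate array-order iteration.
class PackedEntryIterator<K, V> implements Iterator<Map.Entry<K, V>> {
    private final Object[] elements;
    private int next = -2;  // index of the next non-empty key slot

    PackedEntryIterator(Object[] elements) {
        this.elements = elements;
        advance();
    }

    private void advance() {  // step to the next occupied key slot
        do {
            next += 2;
        } while (next < elements.length && elements[next] == null);
    }

    public boolean hasNext() {
        return next < elements.length;
    }

    @SuppressWarnings("unchecked")
    public Map.Entry<K, V> next() {
        if (!hasNext()) throw new NoSuchElementException();
        Map.Entry<K, V> e = new AbstractMap.SimpleEntry<>(
                (K) elements[next], (V) elements[next + 1]);
        advance();
        return e;
    }
}
```

A scan like this is purely sequential memory access, which is what makes it faster than chasing HashEntry links; the linked version would need its own iterator type following insertion order instead.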
> Thus iteration order is more predictable, and I think it's possible not to throw ConcurrentModificationException in most situations.
For non-concurrent collections, the goal of ConcurrentModificationException is to help people find/fix incorrect code; not necessarily to blow up only when proceeding is impossible.
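The fail-fast behavior Doug is defending can be seen in a short self-contained example: the iterator below could in principle keep going after the modification, but it throws anyway, flagging the buggy pattern:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Demonstrates fail-fast iteration on the current HashMap: a structural
// modification mid-iteration makes the iterator throw on its next access,
// even though continuing would be possible.
public class FailFastDemo {
    static boolean modificationDetected() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        try {
            for (String k : m.keySet()) {
                m.remove(k);  // structural modification during iteration
            }
        } catch (ConcurrentModificationException e) {
            return true;      // the fail-fast iterator caught the bug
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + modificationDetected());
    }
}
```

This is why making a replacement map "not throw in most situations" would weaken a deliberate debugging aid rather than just relax an implementation limitation.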
I hope this helps! Despite all the concerns above, this is the first alternative version of HashMap that I can recall seeing that seems promising enough to continue exploring as a possible replacement.
-Doug