Faster HashMap implementation (original) (raw)

Alex Yakovlev alex14n at gmail.com
Wed Jun 3 23:21:51 UTC 2009

Previous message: Cant detect deadlock when thread reenters synchronization block in Object.wait()
Next message: Faster HashMap implementation
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hello,

While playing with scala programming language I've come to a HashMap implementation that appears to be faster than current java.util.HashMap, also with a lower memory footprint.

Alex Miller advised me to post about it in this maillist.

I've put current implementation (both scala and java) and some benchmark results at github: http://github.com/alex14n/CompactHashMap

Current java version implements most of the external HashMap behaviour inluding serialization, but does not throw ConcurrentModificationException and does not implement internal methods that can be requeired for some subclasses.

Main trick is in storing all data in 2 arrays without any HashEntry objects. One is array of Objects with keys and values, another in a complex array of indices.

Index array has 2 parts. First part contains hashcode-to-index mappings, second part is index-to-index mapping for elements with the same hashcode. The rest bits are used for:

'Deleted' flag to mark empty spots after key removal (other bits part of such values is used as deleted-linked-list indices)
'End-of-list' flag allowing not to access next index if this element is last.
Some bits of hashcode used to compare with hashcode of the key we are looking for.

Finding an existing value for a key has approximately the same speed as the currrent HashMap implementation. It is:

Computing element hashcode,
Accessing array to get index/address for this hashcode 3a. Current HashMap access HashEntry object with key/hashcode/value/next properties 4a. In my implementation hashcode and next are packed in index array bits, key and value are stored in adjacent cells of the second array.

When looking for missing key my implementation can be twice faster in some situations, since it's only one random memory access (element index array). If it's empty, or its hashcode bits are not equal to hashcode bits of the key being searched, and it's end-of-list bit is set - it is enough to stop searching. Current HashMap needs 2 random memory reads: (1) array cell with address and (2) HashEntry at that address.

Insertion of a new key does not require any memory allocation (new HashEntry object) since all arrays are pre-allocated except for resize operation. Resize consists of Arrays.copyOf to a twice large array and almost sequental rehashing - reading one array and writing it to either the same index or with 1 in highest bit.

When adding several key/values they are sequentally written to key/values array, and sequental memory read is faster and more cache-friendly.

Having 2 arrays instead of a lot of HashEntry objects is more garbage collector friendly and has significantly lower memory footprint.

Iteration over elements is a sequental array read which is faster. To clone such HashMap we only need to clone 2 arrays with is also much faster.

Elements order is preserved even after resize, only when some key is removed it leaves an empty 'hole' that will be filled after next key insertion. Thus iteration order is more predictable and I think it's possible not to throw ConcurrentModificationException in most situations. Iterator needs only 1 parameter - array index - and even resize does not break iteration order.

But there can be problems. For example if we iterated over some key that was deleted after that and then inserted again into higher array index, it can eventually came up second time into iterator. But if we save highest non-empty index that was at iteration start, and monitor insert/delete operations (possible holes) in indices we've not iterated over yet, we will not need to throw ConcurrentModificationException in most cases.

I've done some benchmarks that proves my assumptions:

Reading existing key-value mapping has the same speed,
Adding new mapping and looking for missing key is significantly faster.

But there can be some problems in my testing strategy or benchmark can be somehow biased, so it's better to retest it with some independent benchmarks.

Hope it will help to make java collections faster since such approach (preallocated arrays instead of ...Entry objects) can be used to speedup other collections. In my scala version both HashMap and HashSet are implemented this way. It also stores primitive types like int in primitive arrays which saves a lot of memory. But on one hand scala has built-in boxed arrays, and on the other hand such array boxing is slow. It also has to store keys and values in different arrays, which gives extra random memory read and thus slower - when keys and values are in adjacent cells of the same array value will usually be pre-fetched when key is read.

Best, Alex Yakovlev -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20090604/9e14142e/attachment.html>

Previous message: Cant detect deadlock when thread reenters synchronization block in Object.wait()
Next message: Faster HashMap implementation
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the core-libs-dev mailing list