An Alternative Hashing Alternative (original) (raw)

Doug Lea dl at cs.oswego.edu
Wed Jun 6 16:51:10 UTC 2012

Previous message: An Alternative Hashing Alternative
Next message: An Alternative Hashing Alternative
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 06/06/12 12:22, Doug Lea wrote:

I just posted the following to the concurrency-interest list. I'll send a follow-up on tie-ins to core-lib issues next.

The main issue is whether adding generic overflow-handling techniques to hash table classes (as used here for ConcurrentHashMap, but variations are applicable with a lot of effort to others) are a better solution than special-casing a better String hash method (i.e., the hash32 changes). My current sense is that they are:

They also address the huge-map problem (although not as well as would be the case if we had 64bit arrays and hash codes, but that is not happening anytime soon).
They work for types other than String (although still only for Comparables), and improve performance when hashes are merely poorly distributed, not hostilely identical.
They avoid the need for a cascading series of related OpenJDK updates.
Normal-case (non-hostile etc) throughput is better.

On the other hand: Worst-case throughput using this approach for hostile Strings cannot be quite as good as using better, randomly-seeded hash functions. However, so long as the worst case is no worse than other ways to hostilely annoy busy servers etc, it is not clear to me that the hash32 approach is worthwhile. I don't have any factual basis for concluding this though, since I don't have full test setup for VU#903934. If anyone else does, I'd be interested in hearing about results.

One more semi-related follow-up while I am at it:

On 06/01/12 16:05, Eamonn McManus wrote:

There's also a performance problem in that accesses start becoming linear once there are more than 1<< 30 entries, but fixing that is substantially harder than just fixing size(), and ConcurrentHashMap already provides a better alternative for such huge maps.

Well, it does now in CHMV8. But before, even though it used segmented arrays, there are still only 32bits of hash code, so keys still collided.

-Doug

Previous message: An Alternative Hashing Alternative
Next message: An Alternative Hashing Alternative
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the core-libs-dev mailing list