An Alternative Hashing Alternative (original) (raw)

Doug Lea dl at cs.oswego.edu
Wed Jun 6 16:51:10 UTC 2012


On 06/06/12 12:22, Doug Lea wrote:

I just posted the following to the concurrency-interest list. I'll send a follow-up on tie-ins to core-lib issues next.

The main issue is whether adding generic overflow-handling techniques to hash table classes (as used here for ConcurrentHashMap, but variations are applicable with a lot of effort to others) are a better solution than special-casing a better String hash method (i.e., the hash32 changes). My current sense is that they are:

On the other hand: Worst-case throughput using this approach for hostile Strings cannot be quite as good as using better, randomly-seeded hash functions. However, so long as the worst case is no worse than other ways to hostilely annoy busy servers etc, it is not clear to me that the hash32 approach is worthwhile. I don't have any factual basis for concluding this though, since I don't have full test setup for VU#903934. If anyone else does, I'd be interested in hearing about results.

One more semi-related follow-up while I am at it:

On 06/01/12 16:05, Eamonn McManus wrote:

There's also a performance problem in that accesses start becoming linear once there are more than 1<< 30 entries, but fixing that is substantially harder than just fixing size(), and ConcurrentHashMap already provides a better alternative for such huge maps.

Well, it does now in CHMV8. But before, even though it used segmented arrays, there are still only 32bits of hash code, so keys still collided.

-Doug



More information about the core-libs-dev mailing list