Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps (original) (raw)

Ulf Zibis Ulf.Zibis at gmx.de
Sat May 26 01:43:40 UTC 2012


Am 23.05.2012 07:16, schrieb Mike Duigou:

Webrevs for the Java 7u6 and 8 changes are available for download at [2] and [3] for your review. There are some important differences between the Java 7 and 8 implementations of this enhancement. Most specifically in the Java 8 implementation alternative string hashing is always enabled--no threshold is used for enablement and alternative hashing cannot be disabled. (The Java 8 implementation completely ignores the jdk.map.althashing.threshold system property). The Java 8 implementation is also subject to additional refinement as Java 8 develops.

To me it seems, that computing the murmur hash is more expensive, especially on short strings, than the old hash algorithm. So I think, a developer should have an influence on the hash algorithm to be used by a hash map, especially for strings, see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862.

So I propose a public accessible field in String: public int customHash;

Then the HashMap should have an overrideable hash method: protected int hash(K key) { h = key.hashCode(); h ^= (h >>> 20) ^ (h >>> 12); return h ^ (h >>> 7) ^ (h >>> 4); }

Then we should have a new class: public class MurmurHashMap extends HashMap<String,V> { protected final int hash(String key) { int h = key.customHash; if (h == 0) { // harmless data race on customHash here. h = Hashing.murmur3_32(HASHING_SEED, key); // ensure result is not zero to avoid recalculating h = (h != 0) ? h : 1; key.customHash = h; } return hashMask ^ h; }

In class Hashing we need: public static int murmur3_32(int seed, String key) { int h1 = seed; int count = key.length();

     // body
     for (int off = 0; count >= 2;) {
         int k1 = (key.charAt(off++) & 0xFFFF) | key.charAt(off++) << 16);

// alternative: for (int off = 0; count >= 4;) { int k1 = (key.charAt(off) & 0x0FF) | (key.charAt(off + 1) & 0x0FF) << 8 | (key.charAt(off + 2) & 0x0FF) << 16 | key.charAt(off + 3) << 24; ... } ... }

For other use cases see: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862

-Ulf



More information about the core-libs-dev mailing list