[Python-Dev] String hash function multiplier (original) (raw)

Raymond Hettinger python at rcn.com
Wed Apr 14 12:05:50 EDT 2004


[Raymond] > Does anyone have any issues with changing the hash multiplier for the > string and Unicode hash functions?

[Tim]

Don't touch it unless you can prove major benefits -- it's a remarkable fact of life that the current multiplier hasn't resulted in any real-life (but non-contrived) pathological cases.

Will leave it alone.

Perhaps you think shifts and adds are faster? I wouldn't -- the imul instruction on modern Pentiums is very fast.

On the P4, the documented latency went up from 4 cycles to 14 cycles while shifts and adds went down to 0.5 cycles and 1 cycle respectively. Timings confirm the result.

It looks like the best bet is to try to speedup the code without changing the multiplier. Intel's software optimization cookbook recommends a partial unrolling and elimination of data dependencies so that a second multiply can start 4 cycles after the previous one started. If practice bears out the theory, the timings could show a three or fourfold speedup without changing the multiplier.

(read Knuth).

Of course, I already have :-)

The right thing to compare Python's string hash to is "the standard" Fowler-Noll-Vo string hash

Ditto.

Raymond

################################################################# ################################################################# ################################################################# ##### ##### ##### ################################################################# ################################################################# #################################################################



More information about the Python-Dev mailing list