[Python-3000] Performance Notes - new hash algorithm (original) (raw)
Guido van Rossum guido at python.org
Fri Sep 7 22:53:45 CEST 2007
- Previous message: [Python-3000] Performance Notes - new hash algorithm
- Next message: [Python-3000] Performance Notes - new hash algorithm
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I'd like Tim Peters's input on this before we change it. I seem to recall that there's an aspect of non-randomness to the existing hash function that's important when you hash many closely related strings, e.g. "0001", "0002", "0003", etc., into a dictionary. Though it's been so long that I may misremember this, and perhaps it was related to the dictionary implementation.
In any case we need to see the code as a patch, of course.
On 9/7/07, Gregory P. Smith <greg at krypto.org> wrote:
On 9/4/07, Thomas Hunger <hto at arcor.de> wrote: > > Hello, > > I don't know much about python internals, so the following might be > bogus: > > I replaced unicodehash and stringhash with the hash function from > here: http://www.azillionmonkeys.com/qed/hash.html. > > Then I ran the following micro-benchmark : > > $ time ./python bench.py > > where bech.py is: > > f = dict((line, nr) for nr, line > in enumerate(open('/usr/share/dict/words', > encoding='latin1').readlines())) > > Python3k original hash: real 0m2.210s > new hash: real 0m1.842s > > So maybe this is an interesting hash function? > > Tom
Sounds like a great idea to me. Can you submit it as a patch? We should run some more realistic perf tests and profiles but I imagine the impact will only be good. -gps
Python-3000 mailing list Python-3000 at python.org http://mail.python.org/mailman/listinfo/python-3000 Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
- Previous message: [Python-3000] Performance Notes - new hash algorithm
- Next message: [Python-3000] Performance Notes - new hash algorithm
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]