[Python-Dev] Hashing proposal: change only string-only dicts (original) (raw)
Victor Stinner victor.stinner at haypocalc.com
Tue Jan 17 23:06:47 CET 2012
- Previous message: [Python-Dev] Hashing proposal: change only string-only dicts
- Next message: [Python-Dev] Hashing proposal: change only string-only dicts
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
2012/1/17 "Martin v. Löwis" <martin at v.loewis.de>:
I'd like to propose a different approach to seeding the string hashes: only do so for dictionaries involving only strings, and leave the tphash slot of strings unchanged.
The real problem is in dict (or any structure using an hash table), so if it is possible, it would also prefer to fix the problem directly in dict.
There are several variants of that approach wrt. caching of the hash 1. add an additional field to all string objects, to cache the second hash value. a) variant: in 3.3, drop the extra field, and declare that hashes may change across runs 2. only cache the dict-hash, recomputing the public hash each time 3. on a per-string choice, cache either the dict-hash or the public hash, depending on which one gets computed first, and recompute the other one every time it's needed.
There is a simpler solution:
bucket_index = (hash(str) ^ secret) & DICT_MASK.
Remark: set must also be fixed.
Victor
- Previous message: [Python-Dev] Hashing proposal: change only string-only dicts
- Next message: [Python-Dev] Hashing proposal: change only string-only dicts
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]