[Python-Dev] Status of the fix for the hash collision vulnerability (original) (raw)

Hrvoje Niksic hrvoje.niksic at avl.com
Wed Jan 18 11:15:49 CET 2012


On 01/17/2012 09:29 PM, "Martin v. Löwis" wrote:

I(0) = H& MASK PERTURB(0) = H I(n+1) = (5*I(n) + 1 + PERTURB(n))& MASK PERTURN(n+1) = PERTURB(n)>> 5

So if two objects O1 and O2 have the same hash value H, the sequence of probed indices is the same for any MASK value. It will be a different sequence, yes, but they will still collide on each and every slot. This is the very nature of open addressing.

Open addressing can still deploy a collision resolution mechanism without this property. For example, double hashing uses a different hash function (applied to the key) to calculate PERTURB(0). To defeat it, the attacker would have to produce keys that hash the same using both hash functions.

Double hashing is not a good general solution for Python dicts because it complicates the interface of hash tables that support arbitrary keys. Still, it could be considered for dicts with known key types (built-ins could hardcode the alternative hash function) or for SafeDicts, if they are still considered.

Hrvoje



More information about the Python-Dev mailing list