[Python-Dev] Status of the fix for the hash collision vulnerability (original) (raw)

"Martin v. Löwis" martin at v.loewis.de
Tue Jan 17 21:29:51 CET 2012

Previous message: [Python-Dev] Status of the fix for the hash collision vulnerability
Next message: [Python-Dev] Status of the fix for the hash collision vulnerability
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

I thought that the original problem was that with N insertions in the dictionary, by repeatedly inserting different keys generating the same hash value an attacker could arrange that the cost of finding an open slot is O(N), and thus the cost of N insertions is O(N^2).

If so, frequent resizing could make the attacker's problem much more difficult, as the distribution of secondary probes should change with each resize.

Not sure what you mean by "distribution of secondary probes".

Let H be the initial hash value, and let MASK be the current size of the dictionary. Then I(n), the sequence of dictionary indices being probed, is computed as

I(0) = H & MASK PERTURB(0) = H I(n+1) = (5*I(n) + 1 + PERTURB(n)) & MASK PERTURN(n+1) = PERTURB(n) >> 5

So if two objects O1 and O2 have the same hash value H, the sequence of probed indices is the same for any MASK value. It will be a different sequence, yes, but they will still collide on each and every slot.

This is the very nature of open addressing. If it wouldn't try all indices in the probe sequence, it may not be possible to perform the lookup for a key correctly.

Regards, Martin

Previous message: [Python-Dev] Status of the fix for the hash collision vulnerability
Next message: [Python-Dev] Status of the fix for the hash collision vulnerability
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the Python-Dev mailing list