[Python-Dev] Status of the fix for the hash collision vulnerability

Guido van Rossum guido at python.org
Sun Jan 15 17:10:54 CET 2012


On Sun, Jan 15, 2012 at 6:30 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:

Terry Reedy, 14.01.2012 06:43:
> On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
>
>> It is perfectly okay to break existing users who had anything depending
>> on ordering of internal hash tables. Their code was already broken.
>
> Given that the doc says "Return the hash value of the object", I do not
> think we should be so hard-nosed. The above clearly implies that there is
> such a thing as the Python hash value for an object. And indeed, that has
> been true across many versions. If we had written "Return a hash value for
> the object, which can vary from run to run", the case would be different.

Just a side note, but I don't think hash() is the right place to document this.

You mean we shouldn't document that the hash() of a string will vary per run?
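
For illustration only, here is a minimal sketch of what "vary per run" looks like in practice. It assumes a CPython version that ships hash randomization and honors the PYTHONHASHSEED environment variable (3.3 or later, which postdates this message). The helper name and the sample string are made up for the example.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(seed):
    """Return hash('python-dev') as computed by a new interpreter process.

    Assumption: the interpreter honors PYTHONHASHSEED (CPython 3.3+).
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('python-dev'))"], env=env
    )
    return int(out)


# Same seed: the two runs agree. Different seed: the value changes.
same_a = string_hash_in_fresh_interpreter(1)
same_b = string_hash_in_fresh_interpreter(1)
other = string_hash_in_fresh_interpreter(2)
print(same_a == same_b, same_a == other)
```

Code that depends on a particular string hash, or on the dict ordering derived from it, only sees stable results if the seed is pinned.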

Hashing is a protocol in Python, just like indexing or iteration. Nothing keeps an object from changing its hash value due to modification,

Eh? There's a huge body of cultural awareness that only immutable objects should define a hash, implying that the hash remains constant during the object's lifetime.

and that would even be valid in the face of the usual dict lookup invariants if changes are only applied while the object is not referenced by any dict.

And how would you know it isn't?
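
A minimal sketch of the hazard under discussion, assuming a hypothetical MutableKey class (not taken from any real code base) whose hash is derived from mutable state. The object is modified while a dict still references it.

```python
class MutableKey:
    """Hypothetical key type whose hash depends on mutable state."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = MutableKey(1)
table = {key: "payload"}
found_before = key in table  # lookup uses hash(1), the stored hash

key.value = 2  # modified while still referenced by the dict
found_after = key in table  # lookup now uses hash(2) and probes elsewhere

# The entry is still physically in the dict, just unreachable by lookup.
still_stored = any(k is key for k in table)
print(found_before, found_after, still_stored)
```

Nothing in the object itself records that a dict is holding it, so the class cannot tell when a modification is safe.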

So the guarantees do not depend on the function hash() and may be even weaker than your above statement.

There are no actual guarantees for hash(), but lots of rules for well-behaved hashes.

--
--Guido van Rossum (python.org/~guido)


