[Python-Dev] Hashing proposal: change only string-only dicts (original) (raw)

"Martin v. Löwis" martin at v.loewis.de
Wed Jan 18 18:52:23 CET 2012


Am 18.01.2012 13:30, schrieb Barry Warsaw:

On Jan 18, 2012, at 08:19 AM, Martin v. Löwis wrote:

My concern is not about breaking doctests: this proposal will also break them. My concern is about applications that assume that hash(s) is stable across runs, and we do have reports that it will break applications. I am a proponent of doctests, and thus use them heavily. I can tell you that the issue of dict hashing (non-)order has been well known for years and I have convenience functions in my own doctests to sort and print dict elements.

Indeed. So that breakage may actually be less than people expect.

As for cases that still rely on dict order: none of the proposed solutions preserve full compatibility in dict order. The only solution (not actually proposed so far) is to add an AVL tree into the hash table, to track keys that collide on hash values (rather than hash slots). Such a tree would be only used if there is an actual collision, which, in practical dict usage, never occurs.

I've been seriously considering implementing a balanced tree inside the dict (again for string-only dicts, as ordering can't be guaranteed otherwise). However, this would be a lot of code for a security fix. It would solve the issue for good, though.

Regards, Martin



More information about the Python-Dev mailing list