[Python-Dev] Hash collision security issue (now public) (original) (raw)

martin at v.loewis.de martin at v.loewis.de
Sun Jan 1 04:36:37 CET 2012


(Well, technically, you could use trees or some other O log n data structure as a fallback once you have too many collisions, for some value of "too many". Seems a bit wasteful for the purpose, though.)

I don't think that would be wasteful. You wouldn't just use the tree for the case of too many collisions, but for any collision. You might special-case the case of a single key, i.e. start using the tree only if there is a collision.

The issue is not the effort, but the need to support ordering if you want to use trees. So you might restrict this to dicts that have only str keys (which in practice should never have any collision, unless it's a deliberate setup).

I'd use the tagged-pointer trick to determine whether a key is an object pointer (tag 0) or an AVL tree (tag 1). So in the common case of interned strings, the comparison for pointer equality (which is the normal case if the keys are interned) will succeed quickly; if pointer comparison fails, check the tag bit.

Regards, Martin



More information about the Python-Dev mailing list