Message 151970 - Python tracker

On Wednesday, 25 January 2012 at 19:19 +0000, Dave Malcolm wrote:

Dave Malcolm <dmalcolm@redhat.com> added the comment:

On Wed, 2012-01-25 at 18:05 +0000, Antoine Pitrou wrote:

Antoine Pitrou <pitrou@free.fr> added the comment:

I'm attaching a revised version of the patch that should fix the above issue: hybrid-approach-dmalcolm-2012-01-25-002.patch

It looks like that approach will break any non-builtin type (in either C or Python) which can compare equal to bytes or str objects. If that's the case, then I think the likelihood of acceptance is close to zero.

How?

This kind of type, for example:

class C:
    def __hash__(self):
        return hash(self._real_str)

    def __eq__(self, other):
        if isinstance(other, C):
            other = other._real_str
        return self._real_str == other

If I'm not mistaken, looking up C("abc") will stop matching "abc" when there are too many collisions in one of your dicts.
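
To make the concern concrete, here is a minimal sketch of the behaviour such a type relies on today with an ordinary dict. The `__init__` method and the variable names are assumptions added for illustration; the message only shows `__hash__` and `__eq__`.

```python
class C:
    """Wrapper that hashes and compares like the str it holds."""

    def __init__(self, real_str):
        # Assumed constructor; not part of the snippet in the message.
        self._real_str = real_str

    def __hash__(self):
        # Same hash as the wrapped string, so both land in the same bucket.
        return hash(self._real_str)

    def __eq__(self, other):
        if isinstance(other, C):
            other = other._real_str
        return self._real_str == other


# A dict keyed by a plain str can be queried with a C instance...
d = {"abc": 1}
found_via_wrapper = d[C("abc")]

# ...and a dict keyed by a C instance can be queried with a plain str.
d2 = {C("abc"): 2}
found_via_str = d2["abc"]
```

Both lookups succeed because the dict only requires equal hashes and a true `==` result. Any scheme that switches to a different hash function for str keys past some collision threshold would have to preserve this for types like `C`.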

Also, the level of complication is far higher than in any other of the proposed approaches so far (I mean those with patches), which isn't really a good thing.

So would I. I want something I can backport, though.

Well, your approach sounds like it subtly and unpredictably changes the behaviour of dicts when there are too many collisions, so I'm not sure it's a good idea to backport it, either.

If we don't want to backport full hash randomization, I think I much prefer raising a BaseException when there are too many collisions, rather than this kind of (excessively) sophisticated workaround. You are changing a fundamental datatype in a rather complicated way.
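
As a rough illustration of the simpler alternative, the toy table below counts probes during a lookup and raises once a limit is exceeded. Everything in it is hypothetical: `TooManyCollisions`, `LimitedDict`, `Colliding`, the linear probing scheme and the limit of 8 are assumptions for the sketch and do not reflect CPython's dict implementation or any of the proposed patches.

```python
class TooManyCollisions(BaseException):
    """Hypothetical error; derives from BaseException as the message suggests."""


class LimitedDict:
    """Toy fixed-size open-addressing table with a probe limit (no resizing)."""

    def __init__(self, size=64, max_probes=8):
        self._slots = [None] * size
        self._max_probes = max_probes

    def _find(self, key):
        # Linear probing: return the slot holding `key` or the first empty one.
        size = len(self._slots)
        index = hash(key) % size
        for _ in range(self._max_probes):
            entry = self._slots[index]
            if entry is None or entry[0] == key:
                return index
            index = (index + 1) % size
        raise TooManyCollisions(f"more than {self._max_probes} probes for {key!r}")

    def __setitem__(self, key, value):
        self._slots[self._find(key)] = (key, value)

    def __getitem__(self, key):
        entry = self._slots[self._find(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


class Colliding:
    """Key type whose instances all share one hash, to force collisions."""

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 0

    def __eq__(self, other):
        return isinstance(other, Colliding) and self.n == other.n


table = LimitedDict()
for i in range(8):
    table[Colliding(i)] = i      # eight colliding keys still fit

try:
    table[Colliding(8)] = 8      # the ninth needs more than 8 probes
    hit_limit = False
except TooManyCollisions:
    hit_limit = True
```

The point of the sketch is that the table's lookup semantics never change: it either behaves as usual or fails loudly, which is easier to reason about than silently switching strategies.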