[Python-Dev] PyObject_RichCompareBool identity shortcut

Nick Coghlan ncoghlan at gmail.com
Fri Apr 29 01:40:40 CEST 2011


On Fri, Apr 29, 2011 at 9:13 AM, Guido van Rossum <guido at python.org> wrote:

>> I hadn't really thought about it that way before this discussion - it is the identity checking behaviour of the builtin containers that lets us sensibly handle cases like sets of NumPy arrays.
>
> But do they? For non-empty arrays, __eq__ will always return something that is considered true, so any hash collisions will cause false positives. And look at this simple example:
>
> >>> class C(list):
> ...   def __eq__(self, other):
> ...     if isinstance(other, C):
> ...       return [x == y for x, y in zip(self, other)]
> ...
> >>> a = C([1,2,3])
> >>> b = C([2,1,3])
> >>> a == b
> [False, False, True]
> >>> x = [a, a]
> >>> b in x
> True

Hmm, true. And things like count() and index() would still be thoroughly broken for sequences. OK, so scratch that idea - there's simply no sane way to handle such objects without using an identity-based container that ignores equality definitions altogether.
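To make the count() and index() point concrete, here is a rough sketch that reuses the C class from the example above. The identity_count and identity_index helpers are hypothetical names invented for this illustration (nothing like them exists in the standard library); they only show what "ignoring equality definitions altogether" could look like.

```python
class C(list):
    # Element-wise comparison: returns a list, which is truthy when non-empty.
    def __eq__(self, other):
        if isinstance(other, C):
            return [x == y for x, y in zip(self, other)]
        return NotImplemented


def identity_count(seq, obj):
    """Hypothetical helper: count occurrences of obj by identity only."""
    return sum(1 for item in seq if item is obj)


def identity_index(seq, obj):
    """Hypothetical helper: index of the first item that *is* obj."""
    for i, item in enumerate(seq):
        if item is obj:
            return i
    raise ValueError("object not in sequence")


a = C([1, 2, 3])
b = C([2, 1, 3])
x = [a, a]

# The builtin methods trust the truthiness of __eq__, so b is "found"
# even though it was never put in the list.
builtin_count = x.count(b)
builtin_index = x.index(b)

# The identity-based helpers never call __eq__ at all.
ident_count_b = identity_count(x, b)
ident_count_a = identity_count(x, a)
ident_index_a = identity_index(x, a)

print(builtin_count, builtin_index, ident_count_b, ident_count_a, ident_index_a)
```

The builtin methods report b as present twice, while the identity-based helpers only ever find a.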

Pondering the NaN problem further, I think we can relatively easily argue that reflexive behaviour at the object level fits within the scope of IEEE754.

  1. IEEE754 is a value-based system, with a finite number of distinct NaN payloads.
  2. Python is an object-based system. In addition to their payload, NaN objects are further distinguished by their identity (infinite in theory, in practice limited by available memory).
  3. We can still technically be conformant with IEEE754 even if we say that a given NaN object is equivalent to itself, but not to other NaN objects with the same payload.
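The distinction drawn in the three points above is already visible in how the builtin containers behave today, since they check identity before equality. A small sketch, assuming CPython and using only builtins (the variable names are just for illustration):

```python
nan = float("nan")        # one NaN object
other_nan = float("nan")  # a second, distinct NaN object (same kind of value)

# Value level: IEEE 754 comparison, a NaN never equals anything.
value_equal = (nan == nan)

# Object level: containers check "is" before "==", so a NaN object
# is found when it is the very same object, and not otherwise.
same_in_list = nan in [nan]
other_in_list = other_nan in [nan]
lists_equal = [nan] == [nan]

# Sets follow the same rule: the same object is stored once,
# two distinct NaN objects are stored separately.
set_same = len({nan, nan})
set_other = len({nan, other_nan})

print(value_equal, same_in_list, other_in_list, lists_equal, set_same, set_other)
```

So each NaN object is already treated as equivalent to itself by the containers, even though == on the float itself says otherwise.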

Unfortunately, this still doesn't play well with serialisation, which assumes that the identity of float objects doesn't matter:

>>> import pickle
>>> nan = float('nan')
>>> x = [nan, nan]
>>> x[0] is x[1]
True
>>> y = pickle.loads(pickle.dumps(x))
>>> y
[nan, nan]
>>> y[0] is y[1]
False

Contrast that with the handling of lists, where identity is known to be significant:

>>> x = [[]]*2
>>> x[0] is x[1]
True
>>> y = pickle.loads(pickle.dumps(x))
>>> y
[[], []]
>>> y[0] is y[1]
True

I'd say I've definitely come around to being +0 on the idea of making the float() and decimal.Decimal() __eq__ definitions reflexive, but doing so does have implications when it comes to the ability to accurately save and restore application state. It isn't as simple as just adding "if self is other: return True" to the respective __eq__ implementations.
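As a rough illustration of the save/restore concern, the sketch below stands in for a reflexive __eq__ with a plain function, reflexive_eq, which is a hypothetical name made up for this example. It assumes CPython's current pickling of floats, where each float is restored as a separate object.

```python
import pickle


def reflexive_eq(a, b):
    """Hypothetical stand-in for a reflexive float __eq__:
    identity first, then the ordinary value comparison."""
    return a is b or a == b


nan = float("nan")
before = [nan, nan]                         # the same NaN object twice
after = pickle.loads(pickle.dumps(before))  # round trip through pickle

# Before pickling both slots hold one object, so they compare equal.
equal_before = reflexive_eq(before[0], before[1])

# After the round trip the slots hold two distinct NaN objects,
# so the same comparison now fails.
equal_after = reflexive_eq(after[0], after[1])

# The restored state does not even compare equal to the original list.
round_trip_equal = (before == after)

print(equal_before, equal_after, round_trip_equal)
```

In other words, a relationship that held in the original state (the two entries are equal) is silently lost by the round trip, which is exactly why the one-line change is not the whole story.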

Regards,
Nick.

-- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


