[Python-Dev] cmp(x,x)

Michael Chermside mcherm at mcherm.com
Tue May 25 13:07:31 EDT 2004


Raymond writes:

The code in question is in PyRichCompareBool() which always just returns True or False. That routine is called by list.__contains__() and many other functions that expect a yes or no answer.

The regular rich comparison function, PyRichCompare(), is the same as it always was and can still return arrays of bools, complex numbers, or anything at all.

All right, I took a look at exactly where PyObject_RichCompareBool is called. Here is a COMPLETE list of all uses in 2.3.3 (that's what I had lying around):

What I learned by writing this list is that Raymond is right... being able to assume that "x is x" implies "x == x" is very useful for implementors. And if that assumption is made in PyObject_RichCompareBool() and NOT in PyObject_RichCompare(), then it hardly ever needs to conflict with the desire for user-defined comparison functions to return peculiar things (like Numeric arrays, or values that are artificially not equal to themselves, like NaNs or a poorly designed user-implemented Max object).
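To make the distinction concrete, here is a small sketch of how the two entry points differ as seen from Python code. The Vec class is a hypothetical stand-in for a Numeric array (it is not from any library), and the behavior described is what recent CPython 3 releases do, not necessarily what 2.3.3 did.

```python
class Vec:
    """Hypothetical elementwise container, standing in for a Numeric array."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        # A rich comparison may return any object; here, a list of bools.
        return [a == b for a, b in zip(self.items, other.items)]


v = Vec([1, 2])
w = Vec([1, 3])

# The == operator hands back whatever __eq__ returned.
elementwise = v == w

# Containment needs a yes/no answer, so the result is reduced to its
# truth value (a non-empty list is true).
contained = v in [w]

# A NaN is not equal to itself under ==, but containment treats the
# identical object as found without asking.
nan = float("nan")
nan_equal = nan == nan
nan_contained = nan in [nan]

print(elementwise, contained, nan_equal, nan_contained)
```

The operator keeps its freedom to return peculiar things, while the callers that only want True or False are the ones that take the identity shortcut.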

Okay, I know this is too much analysis, but I've started now, so I'm going to go through with it and send off this email. Here's my thought: Raymond has convinced me that we want to make this assumption in the interpreter. I also want simple rules for when a user-defined comparison function (__cmp__, __eq__, or __ne__) is invoked and when it isn't. Maybe we can achieve both.

Here's that list above, grouped by function:

  1. Testing containment in a sequence, plus index(), count(), and remove() on sequences.

  2. Comparing sequences or dicts to each other.

  3. Sorting lists when no user-defined comparison function is given.

  4. Comparisons inside min() and max()

  5. Checking for the sentinel in iteration

  6. Looking up dict keys

  7. characterize() in dictobject (what's this for?)

  8. Checking if a method is the same to tell if it's overloaded.

  9. Matching keyword arguments to code object varnames in EvalCode

  10. Compare a slice bound to 0L.

  11. Extension modules that call PyObject_RichCompareBool instead of PyObject_RichCompare.

This is the list of all places where RichCompareBool is called instead of RichCompare, and thus of all places where a user-defined comparison function might (surprisingly) not be called. Some are not relevant (e.g., #9 compares only strings, a built-in type). For some, I can concoct artificial examples where someone would care (e.g., #5: creating a sentinel for iter() that tries to be so VERY clever that it allows the sentinel object itself to occur in the sequence sometimes without stopping the iteration; or #3: trying to understand the behavior of timsort by creating objects that log comparisons rather than by reading the code). But these feel artificial to me. The big ones seem to be #1 and #2.

I'd even say that it makes "intuitive sense" to me somehow that containment, index(), count() and remove() all act as if identity implied equality (although I can think of only slightly absurd examples where a user might try to alter this behavior). But when comparing sequences or dicts to each other (something people do a LOT), I would intuitively expect that any contained objects with customized comparison methods would have those methods invoked.
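One way to see which of these cases actually reach a user-defined method is to count the calls. The sketch below uses a hypothetical Logged class and eq_calls() helper (both invented for illustration) and reflects what recent CPython 3 releases do; it says nothing about 2.3.3.

```python
class Logged:
    """Hypothetical helper that counts every call to __eq__."""

    calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Logged.calls += 1
        return isinstance(other, Logged) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


def eq_calls(action):
    """Run action() and report how many times Logged.__eq__ was invoked."""
    Logged.calls = 0
    action()
    return Logged.calls


a = Logged(1)
b = Logged(1)  # equal to a, but a distinct object

# Case #1: containment and friends.
same_in = eq_calls(lambda: a in [a])
other_in = eq_calls(lambda: a in [b])
same_index = eq_calls(lambda: [a].index(a))

# Case #2: comparing sequences to each other.
same_list = eq_calls(lambda: [a] == [a])
other_list = eq_calls(lambda: [a] == [b])

# Case #6: dict key lookup.
same_key = eq_calls(lambda: {a: 0}[a])
other_key = eq_calls(lambda: {a: 0}[b])

print(same_in, other_in, same_index, same_list, other_list, same_key, other_key)
```

When both sides are the same object, __eq__ is never consulted, and that includes the sequence-to-sequence comparison where I would intuitively have expected the customized method to run.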

Okay... I've talked myself into a corner now. Raymond has convinced me that my original idea was misguided, and I've looked closely at the problem, but I don't see an "obvious" solution. I'm tending to think it's best to put the test for identity in PyObject_RichCompareBool, but then how do we explain (in simple terms) when user-defined comparison methods are invoked and when they might be skipped?
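For what it's worth, the rule itself is short when written down. Here is a rough pure-Python model of it, assuming the identity test lives only in the truth-valued function. The names rich_compare and rich_compare_bool are illustrative stand-ins for the C functions, not real Python APIs.

```python
import operator


def rich_compare(v, w, op):
    """Model of PyObject_RichCompare: always dispatch, return any object."""
    return op(v, w)


def rich_compare_bool(v, w, op):
    """Model of PyObject_RichCompareBool with the identity test added."""
    if v is w:
        # Identity implies equality, so the user's method is not called.
        if op is operator.eq:
            return True
        if op is operator.ne:
            return False
    # Every other case falls through to the ordinary comparison,
    # reduced to a plain True or False.
    return bool(rich_compare(v, w, op))


nan = float("nan")
plain = rich_compare(nan, nan, operator.eq)
shortcut = rich_compare_bool(nan, nan, operator.eq)
ordering = rich_compare_bool(nan, nan, operator.lt)

print(plain, shortcut, ordering)
```

Under this model the explanation to users would be: an equality or inequality test made on your behalf by a container or built-in may be skipped when both operands are the same object; anything you write with an explicit operator always calls your method.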

Well, even without an answer, I'd better send this email off and get back to my own work. I'm not sure if I've gotten anywhere with this or not.

-- Michael Chermside


