[Python-Dev] cmp(x,x)

Michael Chermside mcherm at mcherm.com
Tue May 25 13:07:31 EDT 2004

Raymond writes:

The code in question is in PyObject_RichCompareBool() which always just returns True or False. That routine is called by list.__contains__() and many other functions that expect a yes or no answer.

The regular rich comparison function, PyObject_RichCompare(), is the same as it always was and can still return arrays of bools, complex numbers, or anything at all.
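
To illustrate the distinction Raymond draws, here is a minimal Python-level sketch. The class name Verbose is hypothetical and not from the thread, and the behavior shown is that of current CPython 3 rather than the 2.3.3 sources discussed here. The == operator hands back whatever __eq__ returned, while a containment test reduces the same result to a plain boolean.

```python
class Verbose:
    """Hypothetical class whose __eq__ returns a non-boolean value."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Return a list instead of True/False, as rich comparison allows.
        return ["compared", self.value, getattr(other, "value", other)]


a = Verbose(1)
b = Verbose(2)

# The operator returns the object produced by __eq__ unchanged.
rich_result = (a == b)

# Containment needs a yes/no answer, so the (truthy) list becomes True.
bool_result = (a in [b])
```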

All right, I took a look at exactly where PyObject_RichCompareBool is called. Here is a COMPLETE list of all uses in 2.3.3 (that's what I had lying around):

In listobject.c, to test containment in a list (ie: "x in [x]").
In tupleobject.c to perform containment tests on tuples (ie: "x in (a,b,c)").
In arraymodule.c, in array_contains to test containment in an array.
In listobject.c, in listindex(), listremove(), and listcount() to find the first occurrence, delete the first occurrence, or count the number of occurrences of an object in the list.
In arraymodule.c, in array_index(), array_remove(), and array_count() to find the first occurrence, delete the first occurrence, or count the number of occurrences of an object in the array.
In abstract.c, in _PySequence_IterSearch() to test containment of, find the first occurrence of, or count the occurrences of some object.
In listobject.c, in list_richcompare(), to skip past identical list elements when comparing lists.
In tupleobject.c, in tuplerichcompare() to skip past identical tuple elements when comparing tuples.
In arraymodule.c, in array_richcompare, to skip past identical array elements when comparing arrays.
In listobject.c, used in sorting if the user does not provide a user-defined comparison function.
In bltinmodule.c, in min_max(), to compare two objects within the min and max functions.
In iterobject.c, to test whether an object is the sentinel object.
In dictobject.c, in lookdict(), to test whether a key in the dict matches the key being looked up.
In dictobject.c, in characterize() ... I'm not quite sure what this is doing.
In dictobject.c when comparing two dicts for equality.
In typeobject.c, in method_is_overloaded, to test whether the methods defined in a class and some subclass are actually the same object.
In ceval.c, for matching keyword arguments to code object varnames in PyEval_EvalCodeEx.
In ceval.c in _PyEval_SliceIndex() to compare a bound to the number 0L.
And finally, it is part of the published API so it could appear anywhere in extension modules... but I would guess that people use PyObject_RichCompare normally and only call PyObject_RichCompareBool if they really want a boolean.

What I learned by writing this list is that Raymond is right... being able to assume that "x is x" implies "x == x" is very useful for implementors. And if that assumption is made in PyObject_RichCompareBool() and NOT in PyObject_RichCompare(), then it hardly ever needs to conflict with the desire for user-defined comparison functions to return peculiar things (like Numeric arrays, or values that are artificially not equal, such as NaNs or a poorly designed user-implemented Max object).
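
A NaN makes the split concrete. The sketch below reflects what current CPython 3 does, which is an assumption relative to this thread, since the email surveys 2.3.3 while the change was still under discussion. The == operator reports that a NaN is not equal to itself, yet the sequence operations that need a yes/no answer treat the identical object as a match.

```python
nan = float("nan")

# The operator follows IEEE 754: a NaN is never equal to itself.
equal_by_operator = (nan == nan)

# Containment, count() and index() treat the very same object as a match.
contained = (nan in [nan])
occurrences = [nan].count(nan)
position = [nan].index(nan)

# A different NaN object gets no identity shortcut, so it is not found.
other_nan_contained = (float("nan") in [nan])
```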

Okay, I know this is too much analysis, but I've started now, so I'm going to go through with it and send off this email. Here's my thought: Raymond has convinced me that we want to make this assumption in the interpreter. I also want clear rules for when a user-defined comparison function (__cmp__, __eq__, or __ne__) is invoked and when it isn't. Maybe we can achieve both.

Here's that list above, grouped by function:

1. Testing containment in a sequence, also index(), count(), and remove() on sequences.
2. Comparing sequences or dicts to each other.
3. Sorting lists when no user-defined comparison function is given.
4. In min() and max().
5. Checking for the sentinel in iteration.
6. Looking up dict keys.
7. characterize() in dictobject (what's this for?)
8. Checking if a method is the same to tell if it's overloaded.
9. Matching keyword arguments to code object varnames in EvalCode.
10. Comparing a slice bound to 0L.
11. Extension modules that call PyObject_RichCompareBool instead of PyObject_RichCompare.

This is the list of all places where RichCompareBool is called instead of RichCompare, and thus of all places where a user-defined comparison function might (surprisingly) not be called. Some are not relevant (eg: #9 compares only strings, a built-in type). For some, I can concoct artificial examples where someone would care (eg: #5: creating a sentinel for iter() that tries to be so VERY clever that it allows the sentinel object itself to occur in the sequence sometimes without stopping the iteration; or #3: trying to understand the behavior of timsort by creating objects that log comparisons rather than by reading the code). But these feel artificial to me. The big ones seem to be #1 and #2.
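
The "clever sentinel" case can be sketched as follows. AloofSentinel is a hypothetical name, and the result reflects current CPython 3, not necessarily the 2.3.3 code surveyed above. Even though the sentinel's __eq__ always answers False, iteration stops as soon as the callable returns the sentinel object itself.

```python
class AloofSentinel:
    """Hypothetical sentinel that claims to be equal to nothing."""

    def __eq__(self, other):
        return False


sentinel = AloofSentinel()
source = iter([1, 2, sentinel, 3])

# iter(callable, sentinel) stops when the callable returns the sentinel.
# The identity check wins over the always-False __eq__.
collected = list(iter(lambda: next(source), sentinel))
```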

I'd even say that it makes "intuitive sense" to me somehow that containment, index(), count() and remove() all act as if identity implied equality (although I can think of only slightly absurd examples where a user might try to alter this behavior). But when comparing sequences or dicts to each other (something people do a LOT), I would intuitively expect that any contained objects with customized comparison methods would have those methods invoked.
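
Here is a small sketch of the sequence and dict comparison case, using a hypothetical NeverEqual class that counts calls to its own __eq__. It shows the behavior of current CPython 3, which is an assumption relative to this thread: identical elements are skipped without consulting the user-defined method, and the method runs only for elements that are different objects.

```python
class NeverEqual:
    """Hypothetical class that logs __eq__ calls and always says False."""

    calls = 0

    def __eq__(self, other):
        NeverEqual.calls += 1
        return False


x = NeverEqual()

# Identical elements are treated as equal; __eq__ is never consulted.
lists_equal = ([x] == [x])
dicts_equal = ({"k": x} == {"k": x})
calls_after_identical = NeverEqual.calls

# A distinct object forces a real comparison, so __eq__ runs once.
lists_differ = ([x] == [NeverEqual()])
calls_after_distinct = NeverEqual.calls
```

This is exactly the surprise described above: a contained object with a customized comparison method does not get that method invoked when both containers hold the same object.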

Okay... I've talked myself into a corner now. Raymond has convinced me that my original idea was misguided, and I've looked closely at the problem, but I don't see an "obvious" solution. I'm tending to think it's best to put the test for identity in PyObject_RichCompareBool, but then how do we explain (in simple terms) when user-defined comparison methods are invoked and when they're not necessarily?

Well, even without an answer, I'd better send this email off and get back to my own work. I'm not sure if I've gotten anywhere with this or not.

-- Michael Chermside
