[Python-Dev] Re: Re: lists v. tuples (original) (raw)

Andrew Koenig ark@research.att.com
16 Mar 2003 13:59:30 -0500


Tim> [Guido]

Yes. And I'm still hoping to remove cmp; there should be only one way to overload comparisons.

Tim> As long as we're going to break everyone's code, list.sort(f) Tim> should also be redefined then so that f returns a Boolean, f(a, Tim> b) meaning a is less than b.

I don't think it's necessary to break code in order to accommodate that change, as long as you're willing to tolerate one extra comparison per call to sort, plus a small amount of additional overhead.

As I understand it, the problem is to distinguish between a function that returns negative, zero, or positive, depending on the result of the comparison, and a function that returns true or false. So if we had a way to determine efficiently which kind of function the user supplied, we could maintain compatibility.

Imagine, then, that we have a function f, and we want to figure out which kind of function it is. Assume, furthermore, that the only kind of commparisons we want to perform is to determine whether a < b for various values of a and b.

Note first that whenever f(a, b) returns 0, we don't care which kind of function f is, because a < b will be false in either case. So we allow our sort algorithm to run until the first time a call to f(a, b) returns a nonzero value.

Now we can determine what kind of function f is by calling f(b, a). If f(b, a) is zero, then f is a boolean predicate. If f(b, a) is nonzero, then f returns negative/zero/positive -- and, incidentally, f(b, a) had better have the opposite sign from f(a, b).

I understand that there is some overhead involved in storing the information about which kind of comparison it is, and testing it on each comparison. I suspect, however, that that overhead can be made small compared to the overhead involved in calling the comparison function itself.

-- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark