[Python-Dev] Intricacies of calling __eq__

Steven D'Aprano steve at pearwood.info
Wed Mar 19 02:09:49 CET 2014

On Tue, Mar 18, 2014 at 04:42:29PM -0700, Kevin Modzelewski wrote:

> My 2 cents: it feels like a slippery slope to start guaranteeing the number and ordering of calls to comparison functions -- for instance, doing that for the sort() function would lock in the sort implementation.

Although I agree with your conclusion, I'm not so sure I agree with the way you reach that conclusion. (Actually, I'm not even sure I agree with my own reasoning!)

The problem here isn't that Maciej wants to change the implementation of some method or function, like sort. The problem is that (as I understand it) Maciej wants a blessing to change the semantics of multiple Python statements.

Currently, the code:

if key in dict:
    return dict[key]

performs two dictionary lookups. If you read the code, you can see the two lookups: "key in dict" performs a lookup, and "dict[key]" performs a lookup. Sorry to belabour the obvious, but this gets to the core of the matter.

Often this won't matter, but each lookup involves a call to __hash__ and some variable number of calls to __eq__. (Again, as I understand it) Maciej wants to optimize away the second lookup, so that even if you write code like the above, what actually gets executed (modulo guards for modifications to dicts, multiple threads running, etc.) is very different, closer to this in semantics:

try:
    _temp = dict[key]
except KeyError:
    pass
else:
    return _temp

I'm not suggesting that PyPy actually will translate the code exactly like this, only that this will be the semantics. The critical point here is that in the Python code you write, there are two separate lookups, but in the code that is actually executed, there is only one.
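To make the difference observable, here is a minimal sketch that counts the calls for both forms. CountingKey, two_lookups, one_lookup and count_calls are hypothetical names invented for this illustration, and the counts you see describe only the interpreter you run it on, not anything the language promises.

```python
class CountingKey:
    """Hypothetical key type that records how often it is hashed and compared."""

    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.value == other.value


def two_lookups(d, key):
    # The code as written: a membership test followed by a subscript.
    if key in d:
        return d[key]
    return None


def one_lookup(d, key):
    # The semantics the optimization would amount to: a single subscript.
    try:
        _temp = d[key]
    except KeyError:
        return None
    else:
        return _temp


def count_calls(func, d, key):
    # Reset the counters, run func, and report (result, hash calls, eq calls).
    CountingKey.hash_calls = 0
    CountingKey.eq_calls = 0
    result = func(d, key)
    return result, CountingKey.hash_calls, CountingKey.eq_calls


if __name__ == "__main__":
    d = {CountingKey(1): "a"}
    probe = CountingKey(1)  # equal to the stored key, but a distinct object
    print("two lookups:", count_calls(two_lookups, d, probe))
    print("one lookup: ", count_calls(one_lookup, d, probe))
```

Both functions return the same value for the same dict and key; only the number of __hash__ and __eq__ calls differs, which is exactly the part that user code could notice.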

Maciej, is my analysis of what you are doing correct?

Although I have tentatively said I think this is okay, it is a change in actual semantics of Python code: what you write is no longer what gets run. That makes this very different from changing the implementation of sort -- by analogy, it's more like changing the semantics of

a = f(x) + f(x)

to only call f(x) once. I don't think you would call that an implementation detail, would you? Even if justified -- f(x) is a pure, deterministic function with no side-effects -- it would still be a change to the high-level behaviour of the code.
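A small sketch of the analogy, assuming a hypothetical f that records each call in a list so that the difference is visible. The names calls, as_written and folded are made up for this example.

```python
calls = []  # side-effect log: one entry per call to f


def f(x):
    # Deliberately impure: the caller can observe how often f ran.
    calls.append(x)
    return x * 2


def as_written(x):
    # What the source says: f is called twice.
    return f(x) + f(x)


def folded(x):
    # What a "call f(x) only once" rewrite would execute.
    _temp = f(x)
    return _temp + _temp


if __name__ == "__main__":
    calls.clear()
    print(as_written(3), len(calls))
    calls.clear()
    print(folded(3), len(calls))
```

The two versions return the same number, but the log shows a different number of calls. With a pure f nothing could tell them apart, which is why the dict case turns on whether __hash__ and __eq__ calls count as observable behaviour.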

Since this proposal is limited only to built-in dicts in scenarios where they cannot be modified between the two lookups, I think that it will be okay. There's no language guarantee as to the number of times that __eq__ will be called (although I would be surprised if __hash__ isn't called twice). But I worry that I have missed something.

-- Steven