[Python-Dev] == on object tests identity in 3.x - list delegation to members?
Andreas Maier andreas.r.maier at gmx.de
Sun Jul 13 17:13:20 CEST 2014
On 11.07.2014 22:54, Ethan Furman wrote:
> On 07/11/2014 07:04 AM, Andreas Maier wrote:
>> On 09.07.2014 03:48, Raymond Hettinger wrote:
>>> Personally, I see no need to make the same mistake by removing the identity-implies-equality rule from the built-in containers. There's no need to upset the apple cart for nearly zero benefit.
>>
>> Containers delegate the equal comparison on the container to their elements; they do not apply identity-based comparison to their elements. At least that is the externally visible behavior.
>
> If that were true, then [NaN] == [NaN] would be False, and it is not.
>
> Here is the externally visible behavior:
>
> Python 3.5.0a0 (default:34881ee3eec5, Jun 16 2014, 11:31:20)
> [GCC 4.7.3] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> --> NaN = float('nan')
> --> NaN == NaN
> False
> --> [NaN] == [NaN]
> True
Ouch, that hurts ;-)
First, the delegation of sequence equality to element equality is not something I came up with in my doc patch. It has always been stated in section 5.9, "Comparisons", of the Language Reference (copied from Python 3.4):
"Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length."
Second, if not by delegation to the equality of its elements, how else would the equality of sequences be defined?
But your test is definitely worth having a closer look at. I have broadened the test somewhat and that brings up further questions. Here is the test output, and a discussion of the results (test program try_eq.py and its output test_eq.out are attached to issue #12067):
Test #1: Different equal int objects:
obj1: type=<class 'int'>, str=257, id=39305936
obj2: type=<class 'int'>, str=257, id=39306160
a) obj1 is obj2: False
b) obj1 == obj2: True
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
f) obj1 == obj2: True
Discussion:
Case 1.c) can be interpreted as the list delegating its == to the == of its elements. It cannot be interpreted as delegating to identity comparison. That is consistent with how everyone (I hope ;-) would expect int objects to behave, or lists or dicts of them.
The motivation for case f) is explained further down; it has to do with caching.
Test #2: Same int object:
obj1: type=<class 'int'>, str=257, id=39305936
obj2: type=<class 'int'>, str=257, id=39305936
a) obj1 is obj2: True
b) obj1 == obj2: True
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
f) obj1 == obj2: True
-> No surprises (I hope).
Test #3: Different equal float objects:
obj1: type=<class 'float'>, str=257.0, id=5734664
obj2: type=<class 'float'>, str=257.0, id=5734640
a) obj1 is obj2: False
b) obj1 == obj2: True
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
f) obj1 == obj2: True
Discussion:
I added this test only to show that float NaN is a special case: float objects that are not NaN behave like the int objects in test #1.
Test #4: Same float object:
obj1: type=<class 'float'>, str=257.0, id=5734664
obj2: type=<class 'float'>, str=257.0, id=5734664
a) obj1 is obj2: True
b) obj1 == obj2: True
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
f) obj1 == obj2: True
-> Same as test #2, hopefully no surprises.
Test #5: Different float NaN objects:
obj1: type=<class 'float'>, str=nan, id=5734784
obj2: type=<class 'float'>, str=nan, id=5734976
a) obj1 is obj2: False
b) obj1 == obj2: False
c) [obj1] == [obj2]: False
d) {obj1:'v'} == {obj2:'v'}: False
e) {'k':obj1} == {'k':obj2}: False
f) obj1 == obj2: False
Discussion:
Here, the list behaves as I would expect under the rule that it delegates equality to its elements. Case c) allows that interpretation. However, an interpretation based on identity would also be possible.
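The results of test #5 can be reproduced in a few lines. This is only a minimal sketch, not the attached try_eq.py; the names nan1 and nan2 are made up here, and it assumes CPython, where each float('nan') call returns a new float object.

```python
# Two separately created NaN objects (assumption: CPython creates a new
# float object for every float('nan') call).
nan1 = float('nan')
nan2 = float('nan')

different_objects = nan1 is not nan2
elements_equal = nan1 == nan2                  # NaN never compares equal
lists_equal = [nan1] == [nan2]
dicts_equal_by_key = {nan1: 'v'} == {nan2: 'v'}
dicts_equal_by_value = {'k': nan1} == {'k': nan2}

print(different_objects, elements_equal, lists_equal,
      dicts_equal_by_key, dicts_equal_by_value)
```

With distinct objects, neither identity nor == can make the elements match, so this case cannot tell the two interpretations apart.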
Test #6: Same float NaN object:
obj1: type=<class 'float'>, str=nan, id=5734784
obj2: type=<class 'float'>, str=nan, id=5734784
a) obj1 is obj2: True
b) obj1 == obj2: False
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
f) obj1 == obj2: False
Discussion (this is Ethan's example):
Case 6.b) shows the special behavior of float NaN that is documented: a float NaN object is identical to itself (the same object) but unequal to itself.
Case 6.c) is the surprising case. It could be interpreted in two ways (at least that's what I found):
1) The comparison is based on identity of the float objects. But that is inconsistent with test #4. And why would the list special-case NaN comparison in such a way that it ends up being inconsistent with the special definition of NaN (outside of the list)?
2) The list does not always delegate to element equality, but attempts to optimize if the objects are the same (same identity). We will see later that this is what happens. Further, when comparing float NaNs of the same identity, the list implementation forgot to special-case NaNs. Which would be a bug, IMHO. I did not analyze the C implementation, so this is all speculation based upon externally visible behavior.
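Test #6 (Ethan's example) can be sketched the same way, this time with a single NaN object placed in both containers. The variable names are again only illustrative.

```python
# One NaN object, used on both sides of every comparison.
nan = float('nan')

element_equal = nan == nan                    # element-level ==
list_equal = [nan] == [nan]                   # container-level ==
dict_equal_by_key = {nan: 'v'} == {nan: 'v'}
dict_equal_by_value = {'k': nan} == {'k': nan}

print(element_equal, list_equal, dict_equal_by_key, dict_equal_by_value)
```

The element compares unequal to itself, yet every container holding it compares equal, which matches cases 6.b) through 6.e).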
Test #7: Different objects (with equal x) of class C (C.__eq__() implemented with equality of x, C.__ne__() returning NotImplemented):
obj1: type=<class '__main__.C'>, str=C(256), id=39406504
obj2: type=<class '__main__.C'>, str=C(256), id=39406616
a) obj1 is obj2: False
C.__eq__(): self=39406504, other=39406616, returning True
b) obj1 == obj2: True
C.__eq__(): self=39406504, other=39406616, returning True
c) [obj1] == [obj2]: True
C.__eq__(): self=39406616, other=39406504, returning True
d) {obj1:'v'} == {obj2:'v'}: True
C.__eq__(): self=39406504, other=39406616, returning True
e) {'k':obj1} == {'k':obj2}: True
C.__eq__(): self=39406504, other=39406616, returning True
f) obj1 == obj2: True
The __eq__() and __ne__() implementations each print a debug message, which appears before the result line of the case that triggered it. __ne__() is defined only to verify that it is not invoked, and that the inherited default __ne__() does not chime in.
Discussion:
Here we see that the list equality comparison does invoke the element equality. However, the picture becomes more complex further down.
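To check the delegation without reading debug output, the calls can be counted. The following is a sketch under assumptions: this class C is a stand-in for the one in try_eq.py (which is not reproduced here), and the call counter, the __hash__() method, and the eq_calls() helper are invented for illustration.

```python
class C:
    """Hypothetical stand-in for class C of the test program."""
    calls = 0  # number of __eq__() invocations so far

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        C.calls += 1
        return self.x == other.x

    def __ne__(self, other):
        return NotImplemented

    def __hash__(self):
        # Assumption: hash on x so that instances can be used as dict keys.
        return hash(self.x)


def eq_calls(compare):
    """Run compare() and return (result, number of C.__eq__() calls it caused)."""
    before = C.calls
    result = compare()
    return result, C.calls - before


obj1, obj2 = C(256), C(256)   # equal x, different objects

direct = eq_calls(lambda: obj1 == obj2)
in_list = eq_calls(lambda: [obj1] == [obj2])
as_key = eq_calls(lambda: {obj1: 'v'} == {obj2: 'v'})
as_value = eq_calls(lambda: {'k': obj1} == {'k': obj2})

print(direct, in_list, as_key, as_value)
```

For different objects, each container comparison causes exactly one __eq__() call on the elements, in line with the debug messages of test #7.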
Test #8: Same object of class C (C.__eq__() implemented with equality of x, C.__ne__() returning NotImplemented):
obj1: type=<class '__main__.C'>, str=C(256), id=39406504
obj2: type=<class '__main__.C'>, str=C(256), id=39406504
a) obj1 is obj2: True
C.__eq__(): self=39406504, other=39406504, returning True
b) obj1 == obj2: True
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
C.__eq__(): self=39406504, other=39406504, returning True
f) obj1 == obj2: True
Discussion:
The == on the class C objects in case 8.b) invokes __eq__(), even though both operands are the same object. This can be explained by Python's intent that classes should be able to be non-reflexive if needed, as float NaN is, for example.
Now, the list equality in case 8.c) is interesting. The list equality does not invoke element equality. Even though object equality in case 8.b) did not assume reflexivity and invoked the __eq__() method, the list seems to assume reflexivity and to go by object identity.
The only other potential explanation (that I found) would be that some aspects of the comparison behavior are cached. That is why I added the cases f), which show that comparison results are not cached (the __eq__() method is invoked again).
So we are back to discussing why element equality does not assume reflexivity, but list equality does. IMHO, that is another bug, or maybe the same one.
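The difference between cases 8.b) and 8.c) can be made visible by counting __eq__() calls for a single object. This is a sketch only; the Counted class and the eq_calls() helper are hypothetical and not taken from try_eq.py.

```python
class Counted:
    """Hypothetical class whose __eq__() counts how often it is invoked."""
    calls = 0

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        Counted.calls += 1
        return self.x == other.x

    def __hash__(self):
        # Assumption: hash on x so that the object can be a dict key.
        return hash(self.x)


def eq_calls(compare):
    """Run compare() and return (result, number of __eq__() calls it caused)."""
    before = Counted.calls
    result = compare()
    return result, Counted.calls - before


obj = Counted(256)

direct = eq_calls(lambda: obj == obj)              # plain ==
in_list = eq_calls(lambda: [obj] == [obj])         # list comparison
as_key = eq_calls(lambda: {obj: 'v'} == {obj: 'v'})
as_value = eq_calls(lambda: {'k': obj} == {'k': obj})
again = eq_calls(lambda: obj == obj)               # case f): no caching

print(direct, in_list, as_key, as_value, again)
```

The plain == invokes __eq__() each time, while the list and dict comparisons return True without invoking it at all.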
Test #9: Different objects (with equal x) of class D (D.__eq__() implemented with inequality of x, D.__ne__() returning NotImplemented):
obj1: type=<class '__main__.D'>, str=C(256), id=39407064
obj2: type=<class '__main__.D'>, str=C(256), id=39406952
a) obj1 is obj2: False
D.__eq__(): self=39407064, other=39406952, returning False
b) obj1 == obj2: False
D.__eq__(): self=39407064, other=39406952, returning False
c) [obj1] == [obj2]: False
D.__eq__(): self=39406952, other=39407064, returning False
d) {obj1:'v'} == {obj2:'v'}: False
D.__eq__(): self=39407064, other=39406952, returning False
e) {'k':obj1} == {'k':obj2}: False
D.__eq__(): self=39407064, other=39406952, returning False
f) obj1 == obj2: False
Discussion:
Class D implements __eq__() by applying != to the data attribute. This test does not really show any surprises, and is consistent with the theory that list comparison delegates to element comparison. It is really just a preparation for the next test, which uses the same object of this class.
Test #10: Same object of class D (D.__eq__() implemented with inequality of x, D.__ne__() returning NotImplemented):
obj1: type=<class '__main__.D'>, str=C(256), id=39407064
obj2: type=<class '__main__.D'>, str=C(256), id=39407064
a) obj1 is obj2: True
D.__eq__(): self=39407064, other=39407064, returning False
b) obj1 == obj2: False
c) [obj1] == [obj2]: True
d) {obj1:'v'} == {obj2:'v'}: True
e) {'k':obj1} == {'k':obj2}: True
D.__eq__(): self=39407064, other=39407064, returning False
f) obj1 == obj2: False
Discussion:
The inequality-based implementation of __eq__() explains case 10.b). It is surprising (to me) that the list comparison in case 10.c) returns True. Comparing that to case 9.c), one could believe that object identity is used in both cases. But why would the list not respect the result of __eq__() if it is implemented?
This behavior at least seems to be consistent with the surprise of case 6.c).
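For comparison, here is a sketch of what a list comparison would return if it always respected the element's __eq__(). The class D below is a hypothetical stand-in for the one in the test program, and strict_list_eq() is an invented helper, not an existing Python function.

```python
class D:
    """Hypothetical stand-in for class D: __eq__() is deliberately inverted."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return self.x != other.x


def strict_list_eq(a, b):
    """Compare two lists purely by == on their elements, without any identity check."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


d = D(256)

direct = d == d                      # False: __eq__() says "not equal"
builtin = [d] == [d]                 # True: the list goes by identity here
strict = strict_list_eq([d], [d])    # False: element == is always respected

print(direct, builtin, strict)
```

The built-in list comparison and the strict element-wise comparison disagree exactly when an element is not equal to itself.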
In order not to rely only on the external behavior, I started digging into the C implementation. For list equality comparison, I started at list_richcompare(), which uses PyObject_RichCompareBool(), which short-circuits its result based on identity comparison and thus enforces reflexivity.
The comment on line 714 in object.c in PyObject_RichCompareBool() also confirms that:
/* Quick result when objects are the same. Guarantees that identity implies equality. */
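The effect of that shortcut on list equality can be modeled in pure Python. This is a simplified sketch of the behavior described above, not a translation of the C code; the function names are invented, and only the == case is modeled.

```python
def rich_compare_bool_eq(v, w):
    """Model of the == case: identical objects are treated as equal
    without ever calling __eq__()."""
    if v is w:
        return True
    return bool(v == w)


def model_list_eq(a, b):
    """Model of list equality built on the identity shortcut above."""
    if len(a) != len(b):
        return False
    return all(rich_compare_bool_eq(x, y) for x, y in zip(a, b))


nan = float('nan')
other_nan = float('nan')

same_object = model_list_eq([nan], [nan])             # shortcut applies
different_objects = model_list_eq([nan], [other_nan])  # falls back to ==

print(same_object, different_objects)
```

Under this model the results of cases 5.c) and 6.c) both follow: the same NaN object compares equal inside a list, two different NaN objects do not.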
IMHO, we need to discuss whether we are serious about the direction that was claimed earlier in this thread, that reflexivity (i.e. identity implies equality) should be decided upon by the classes and not by the Python language. As I see it, we have some pieces of code that enforce reflexivity, and some that don't.
Andy