Issue 2235: eq / hash check doesn't take inheritance into account (original) (raw)

Created on 2008-03-04 19:49 by jek, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (37)

msg63258 - (view)

Author: jason kirtland (jek)

Date: 2008-03-04 19:49

In 2.6a, seems like the hash implementation and eq must be defined together, in the same class. See also #1549. Ensuring that a hash implementation isn't being pulled from a builtin type is probably a sufficient check...?

class Base(object): ... def init(self, name): ... self.name = name ... def eq(self, other): ... return self.name == other.name ... def hash(self): ... return hash(self.name) ... class Extended(Base): ... def eq(self, other): ... print "trace: eq called" ... return Base.eq(self, other) ... hash(Base('b1')) 603887253 hash(Extended('e1')) Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'Extended'

msg63263 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-03-04 21:56

This issue is similar to . Note that only new-style classes are concerned.

I think the change was intentional. Guido, do you confirm?

However there should be some documentation about it, a NEWS entry or an item in the "porting to 2.6" section.

The initial modification appeared in the py3k branch (r51454): """ Change the way hash is inherited; when eq or cmp is overridden but hash is not, set hash explicitly to None (and tp_hash to NULL). """

msg63264 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2008-03-04 22:15

Wouldn't you expect this sort of thing to break code? Does it meet the criteria for backporting to 2.6?

msg63265 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-03-04 22:17

The py3k feature was intentional.

I'm not sure who did the backport (could've been me, long ago). I think the backport should be replaced with a warning that is only issued when -3 is given.

msg63269 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-03-05 00:00

The attached patch reverts r59576 and the part of r59106 about the tp_hash slot.

It also adds the py3k warning:: type defines eq but not hash, and will not be hashable in 3.x printed when calling hash() on such an object.

msg63270 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-03-05 00:05

I noticed that my patch uses Py_TYPE(self)->tp_hash, whereas normal processing of slot_tp_hash() uses lookup_method(self,"hash",hash_str).

I am almost sure that both expressions return the same value. Is this correct? Py_TYPE(self)->tp_hash seems much faster.

msg63271 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-03-05 00:12

I noticed that my patch uses Py_TYPE(self)->tp_hash, whereas normal processing of slot_tp_hash() uses lookup_method(self,"hash",hash_str).

I am almost sure that both expressions return the same value. Is this correct? Py_TYPE(self)->tp_hash seems much faster.

HERE BE DRAGONS

There are cases where Py_TYPE(self)->tp_hash is the function currently executing (some wrapper(s) in typeobject.c). OTOH, lookup_method(...) finds the Python code in the class __dict__s along the MRO.

msg63272 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-03-05 00:48

Ah, I remember that it took me some time to understand the boundaries between a slot and the corresponding special method.

Here is another version of the patch, which does not test tp_hash while we are currently running the tp_hash function... I also added the warning for old-style classes.

msg65297 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-10 16:46

Hi Amaury, thanks for your work on this. I applied your patch and looked at the code but there seem to be some issues left still.

(1) I don't think you need to add a warning to classic classes that define eq without hash -- these aren't hashable and never were. The problem is purely with new classes AFAICT.

(2) I can't seen to trigger the warning for new classes (and yes, I specified -3 on the command line :-):

class C(object): ... def eq(self, other): return False ... hash(C()) -134976284

(3) test_Hashable in test_collections.py fails. This is because the Hashable class believes that list, set, and dict are hashable. Something odd is going on:

list.hash([]) -135100116

but

hash([]) Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'list'

Note that in 2.5, both raise TypeError.

msg65298 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-10 16:47

Let's try to have this resolved before the next release.

Also let's try not to accidentally merge the 2.6 changes related to this issue into 3.0. :-)

msg65299 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-10 16:51

Note: some more tests also fail:

test_descr, test_quopri, test_set

msg65329 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-04-10 23:44

Here is a new patch, all tests pass.

I had to add two TPSLOT entries to the slodefs list; can you please check that this is correct (I wanted to reset tp_hash to NULL when eq or cmp are defined).

msg66399 - (view)

Author: Barry A. Warsaw (barry) * (Python committer)

Date: 2008-05-08 03:43

Guido, can you comment on Amaury's latest patch? I'm going to bump this back to critical so as not to hold up 2.6 alpha, but it should be marked as a release blocker for the first beta.

msg68792 - (view)

Author: Glyph Lefkowitz (glyph) (Python triager)

Date: 2008-06-26 16:50

As barry said, this should have been a release blocker for the first beta... ;-)

msg68795 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-06-26 17:06

I'm going to have to ask someone else to look at this code. I am too busy with too many things to be able to take on a detailed code review.

msg68828 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-06-27 13:23

Amaury's patch doesn't currently remove the new-in-2.6 code that turns "tp_hash = NULL" at the C level into "hash = None" at the Python level. I suspect that will prove to be a problem (and may be the cause of Django's woes, if I remember Glyph's python-dev posts correctly). I also don't believe the new TPSLOT entries should be necessary - we should be able to just revert all of this tp_hash related code to match the 2.5 implementation, and then see if we can find a way to emit the Py3k warning when cmp or eq are overridden without overriding hash without breaking any existing code (if we can't, we may have to figure out a way to get 2to3 to check for it).

P.S. you'd think I would have learnt by now not to try to make sense of typeobject.c when I'm tired ;)

msg68839 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-06-27 18:05

Thanks for giving this some time. I think that backwards compatibility should be a higher priority than being able to issue -3 warnings -- if the warnings can't be generated, too bad, we'll just have to document it. (I don't think that checking things at class scope is an easy task in 2to3, though I may be underestimating it.)

So, concluding, insofar as the proposal is to revert 2.6 to the 2.5 semantics where it comes to eq and hash, I'm okay with that if there's no other way to maintain backwards compatibility, even though my preference would be to still flag this when -3 is specified. (There may be similar backwards compatibility issues caused by stricted arg checking in new and init -- the same remarks apply there.)

msg68866 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-06-28 00:56

The new_/init stuff showed up on my diff as well (obviously) - that seemed to be doing a pretty good job of maintaining backwards compatibility.

I'll dig into this further today and tomorrow - I'll probably start by reverting the relevant code back to exactly what is in 2.5, and then see how far I can get with adding the warning code back in while still keeping jek's test case working correctly.

msg68937 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-06-29 08:34

Well, I think I figured out why the hash changes were backported from Py3k: without them, the existence of object.hash makes the Hashable ABC completely useless (every newstyle class meets it).

I see two options here. Option 1 is to revert the changes so hash behave entirely as it did in 2.5 and also delete collections.Hashable from 2.6 (which may break other backported Py3k items). Doesn't seem like a particularly good idea.

Option 2 is what I have implemented locally (and will be uploading soon): changing typeobject.c to allow Py_None to be stored in type method slots, and have it convert correctly to None at the Python level. Classes that don't want to inherit the default object.tp_hash implementation can then write "hash = None" or "(hashfunc)Py_None" to block the inheritance.

msg68948 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-06-29 12:25

Attached patch allows classes to explicitly block inheritance of object.hash by setting the tp_hash slot to Py_None or the hash attribute to None.

This is similar to the approach used in Py3k, but allows hash to be inherited by default, with classes explicitly disabling it as appropriate. I'd also suggest forward porting this approach to Py3k as well, and then possibly look at replacing a copied tp_hash slot with Py_None when the inherited tp_hash is object.tp_hash, but tp_cmp or tp_richcompare are different.

msg68949 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-06-29 13:30

Unassigning for the moment - I'll take a look at adding the Py3k warning back in at some point, but the main thing I would like now is a sanity check from Guido or Barry (or anyone else happy to dig into the guts of typeobject.c) that the basic idea behind my fix is sound.

It would be nice to avoid the explicit check for Py_None in PyObject_Compare by getting update_one_slot() to replace the Py_None it encounters in hash with the default slot_tp_hash implementation, but my attempts at doing that have all led to segfaults.

(Also, glyph, I'd love to know if my patch helps to clear up Django's issues with 2.6b1)

msg69096 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-07-02 14:42

Suggestion from GvR (one I like): instead of re-using Py_None, add a new C function that is stored in the tp_hash slot as a sentinel instead of the Py_None value used in the posted version of the patch. This will avoid breaking code that just checks for NULL before calling the tp_hash slot directly, while still allowing typeobject.c to detect that the object isn't actually hashable despite the presence of a non-NULL value in the tp_hash slot.

(I'll keep the None at the Python level though, since that matches the Py3k behaviour and plays nicely with collections.Hashable)

msg69101 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-07-02 15:15

I don't like the changes in listobject.c and dictobject.c: they seem to imply that all unhashable types must be modified when upgrading to 2.6.

msg69129 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-07-02 21:43

Unhashable types that implement an "I'm not hashable" hash method will indeed require modification in 2.6 in order to avoid incorrectly passing an "isinstance(obj, collections.Hashable)" check (note that several of the mutable standard library types such as collections.deque are incorrectly detected as hashable in the current SVN trunk).

msg69137 - (view)

Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer)

Date: 2008-07-02 22:32

Isn't that a serious compatibility issue? 2.5 code should work on 2.6 with minimal effort.

For deque objects, I don't get your point:

Python 2.6b1+ (trunk, Jul 1 2008, 22:35:48) [MSC v.1500 32 bit Intel)] on win32

from collections import deque hash(deque()) TypeError: deque objects are unhashable

msg69190 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-07-03 10:50

As far as deque goes, the following behaviour on the trunk is the problem I am trying to fix:

Python 2.6b1+ (trunk:64655, Jul 2 2008, 22:48:24) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 Type "help", "copyright", "credits" or "license" for more information.

from collections import deque, Hashable isinstance(deque(), Hashable) True

All of the container types that my patch touches were already unhashable in 2.5 - my patch just ensures that they all correctly return false for isinstance(obj, collections.Hashable) even after we make object.hash inherited by default again. This is also the reason why simply reverting the trunk hash implementation to the 2.5 behaviour (which is what I first tried) is inadequate.

Since collections.Hashable is new in 2.6, I can live with it returning the wrong answer for types which define hash to always raise an error (that's a known limitation of the ABC system, even in Py3k). However, we should at least make sure it returns the correct answer for all of the types in the standard library.

msg69324 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-07-06 07:52

Attached updated patch which implements Guido's suggestion from above (inheritance of hash is blocked at the C level by setting the slot to PyObject_HashNotImplemented and at the Python level by setting hash = None).

All tests which don't depend on a -u resource flag still pass (currently running a -uall set of tests as well).

msg69325 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-07-06 08:09

Two failures in the -uall run (test_codecmaps_hk, test_linuxaudiodev). However, I see those test failures even after reverting my changes, so they aren't related to this.

msg69691 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-07-15 15:49

Fixed for 2.6 in r64962 and the underlying PyObject_HashNotImplemented mechanic forward ported to 3.0 in r64967.

Still need to update the C API documentation and the library reference, as well as implement the Py3k warning in 2.6, but I won't get to those for this beta - dropping priority to deferred blocker.

msg70748 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-08-05 16:06

How's this going?

msg70777 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-08-06 09:44

The documentation and Py3k warning will be ready for the third beta next week (the remaining part is a lot easier than the initial fix).

msg71021 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-08-11 15:48

Py3k warnings for this have been checked in on the trunk as r65642.

There's an unfortunate problem with having to define a fixed stack level for the warnings - for classes with a metaclass implemented in Python, the warning message will point to the new method of the metaclass instead of to the first line of the offending class definition. I don't see a way around this problem, but it definitely caused me grief in tracking down some of the misbehaving classes in the standard library that use ABCMeta as their metaclass in 2.6 (I actually cheated and recompiled with the stack level on the warnings set to 2 instead of 1).

The update also eliminates the spurious and not-so-spurious Py3k warnings that this update triggered in the test suite under the -3 flag. Most were just test suite classes that happen to become unhashable in Py3k, but a few were classes in the standard lib itself that defined eq or cmp methods that were incompatible with the default hash implementation (collections.Mapping, collections.Set, unittest.TestSuite, xml.dom.minidom.NamedNodeMap, numbers.Number, UserList.UserList). Instances of all of the affected classes are now explicitly unhashable in 2.x as well as 3.x (note that any code which relied on any of them being hashable was already broken due to the fact that these classes didn't provide the correct hash and equality invariants).

The numbers.Number change deserves special mention - the actual warning occurs further down in the number stack (when numbers.Complex defines a new eq method), but it seemed appropriate to block the inheritance of object.hash in Number since an id() based hash isn't appropriate for any numeric type. This particular aspect of the change should probably be forward ported to Py3k.

In implementing the warnings, I'm also pretty happy with the current Py3k approach that overriding cmp or eq means that hash isn't inherited at all, rather than restricting the block solely to object.hash. The current behaviour is simple to explain, and more importantly, I think it is more correct - any time you change the definition of equality for the class, you really do need to think carefully about what it means for mutability and hashability.

P.S. I should never have said this part of the change was going to be easy, because that was just begging old Murphy to come slap me upside the head...

msg71022 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-08-11 15:53

I still intend to do the necessary documentation updates for this, but they're going to be delayed a bit since getting the warnings right took much longer than I expected.

msg71223 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2008-08-16 16:23

Let's lower the priority a little then.

msg71325 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-08-18 13:21

numbers.Number change forward ported to Py3k in r65808

Docs updated for 2.6 and 3.0 in r65810 and r65811 respectively.

Which means I can finally close this one :)

msg72114 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-08-28 21:48

Reopening - still need to fix the Python level docs for hash() and hash().

msg72207 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2008-08-31 13:22

hash() documentation fixed for 2.6 in r66085 and for 3.0 in r66086.

History

Date

User

Action

Args

2022-04-11 14:56:31

admin

set

github: 46488

2008-08-31 13:22:05

ncoghlan

set

status: open -> closed
resolution: fixed
messages: +

2008-08-28 21:48:38

ncoghlan

set

status: closed -> open
resolution: fixed -> (no value)
messages: +

2008-08-18 13:21:15

ncoghlan

set

status: open -> closed
resolution: fixed
messages: +

2008-08-16 16:23:58

benjamin.peterson

set

priority: release blocker -> critical
nosy: + benjamin.peterson
messages: +

2008-08-11 15:53:44

ncoghlan

set

messages: +

2008-08-11 15:48:31

ncoghlan

set

messages: +

2008-08-06 09:44:26

ncoghlan

set

messages: +

2008-08-05 16:06:41

gvanrossum

set

messages: +

2008-07-18 03:45:40

barry

set

priority: deferred blocker -> release blocker

2008-07-15 15:49:22

ncoghlan

set

priority: release blocker -> deferred blocker
messages: +

2008-07-06 08:09:53

ncoghlan

set

messages: +

2008-07-06 07:52:36

ncoghlan

set

files: + issue2235_fix_hash_inheritance_with_function_ptr.diff
messages: +

2008-07-03 10:50:33

ncoghlan

set

assignee: ncoghlan
messages: +

2008-07-02 22:32:05

amaury.forgeotdarc

set

messages: +

2008-07-02 21:43:23

ncoghlan

set

messages: +

2008-07-02 15:15:24

amaury.forgeotdarc

set

messages: +

2008-07-02 14:42:18

ncoghlan

set

messages: +

2008-06-29 13:30:52

ncoghlan

set

assignee: ncoghlan -> (no value)
messages: +

2008-06-29 12:25:39

ncoghlan

set

files: + issue2235_fix_hash_inheritance.diff
messages: +

2008-06-29 08:34:33

ncoghlan

set

messages: +

2008-06-28 01:02:45

ncoghlan

set

assignee: ncoghlan

2008-06-28 00:56:16

ncoghlan

set

messages: +

2008-06-27 18:05:09

gvanrossum

set

messages: +

2008-06-27 13:23:19

ncoghlan

set

nosy: + ncoghlan
messages: +

2008-06-26 17:06:12

gvanrossum

set

assignee: gvanrossum -> (no value)
messages: +

2008-06-26 16:58:44

schmir

set

nosy: + schmir

2008-06-26 16:50:23

glyph

set

priority: critical -> release blocker
nosy: + glyph
messages: +

2008-05-08 03:43:12

barry

set

priority: release blocker -> critical
nosy: + barry
messages: +

2008-04-15 05:56:19

nnorwitz

set

assignee: amaury.forgeotdarc -> gvanrossum

2008-04-10 23:44:06

amaury.forgeotdarc

set

files: + inherit_hash3.patch
messages: +

2008-04-10 16:51:50

gvanrossum

set

messages: +

2008-04-10 16:47:39

gvanrossum

set

priority: high -> release blocker
messages: +

2008-04-10 16:46:57

gvanrossum

set

assignee: amaury.forgeotdarc
messages: +

2008-03-16 21:00:11

eikeon

set

nosy: + eikeon

2008-03-05 00:48:31

amaury.forgeotdarc

set

files: + inherit_hash2.patch
messages: +

2008-03-05 00:12:29

gvanrossum

set

messages: +

2008-03-05 00:05:33

amaury.forgeotdarc

set

messages: +

2008-03-05 00:00:51

amaury.forgeotdarc

set

files: + inherit_hash.patch
keywords: + patch
messages: +

2008-03-04 22:20:13

rhettinger

set

priority: high

2008-03-04 22:17:35

gvanrossum

set

messages: +

2008-03-04 22:15:27

rhettinger

set

nosy: + rhettinger
messages: +

2008-03-04 21:56:09

amaury.forgeotdarc

set

nosy: + amaury.forgeotdarc, gvanrossum
type: crash -> behavior
messages: +

2008-03-04 19:49:23

jek

create