Issue 1536021: hash(method) sometimes raises OverflowError

I've run into a problem with a big application that I wasn't able to reproduce with a small example.

The code (exception handler added to demonstrate and work around the problem):

        try:
            h = hash(p)
        except OverflowError, e:
            print type(p), p, id(p), e
            h = id(p) & 0x0FFFFFFF

prints the following output:

<type 'instancemethod'> <bound method Script_Category.is_applicable of <Script_Menu_Mgr.Script_Category object at 0xb6cb4f8c>> 3066797028 long int too large to convert to int

This happens with Python 2.5b3, but didn't happen with Python 2.4.3.

I assume that the hash function for functions/methods returns the id of the function. The following code demonstrates the same problem with a Python class whose __hash__ returns the id of the object:

        $ python2.4
        Python 2.4.3 (#1, Jun 30 2006, 10:02:59)
        [GCC 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
        Type "help", "copyright", "credits" or "license" for more information.
        >>> class X(object):
        ...     def __hash__(self): return id(self)
        ...
        >>> hash (X())
        -1211078036

        $ python2.5
        Python 2.5b3 (r25b3:51041, Aug 7 2006, 15:35:35)
        [GCC 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
        Type "help", "copyright", "credits" or "license" for more information.
        >>> class X(object):
        ...     def __hash__(self): return id(self)
        ...
        >>> hash (X())
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        OverflowError: long int too large to convert to int

Logged In: YES user_id=4771

The hash of instance methods changed, and id() changed to return non-negative numbers (so that id() is not the default hash any more). But I cannot see how your problem shows up. The only thing I could imagine is that the Script_Category class has a custom __hash__() method which returns a value that is sometimes a long, as it would be if it were based on id(). (It has always been documented that just returning id() in custom __hash__() methods doesn't work because of this, but on 32-bit machines the problem only became apparent with the change in id() in Python 2.5.)
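
To make the explanation above concrete, here is a small sketch in present-day Python that models a signed 32-bit C long by hand. The helper names fits_signed32 and to_signed32 are made up for illustration and are not part of Python; the value 3066797028 is the id(p) printed in the original report. It only illustrates the arithmetic and makes no claim about how the interpreter implements the check.

```python
# Sketch only: models a signed 32-bit integer in pure Python.
# fits_signed32 and to_signed32 are hypothetical helper names.

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_signed32(value):
    """Return True if value can be stored in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def to_signed32(address):
    """Reinterpret the low 32 bits of address as a signed 32-bit integer."""
    address &= 0xFFFFFFFF
    return address - 2**32 if address > INT32_MAX else address


method_id = 3066797028  # id(p) from the report, a non-negative value

# The non-negative id is above 2**31 - 1, so it does not fit.
print(fits_signed32(method_id))

# The signed view of the same bits is negative and does fit, which
# matches the negative hash seen in the Python 2.4.3 session.
print(to_signed32(method_id))
print(fits_signed32(to_signed32(method_id)))

# The reporter's workaround keeps only the low 28 bits, which always fit.
print(fits_signed32(method_id & 0x0FFFFFFF))
```

Any address in the upper half of a 32-bit address space behaves the same way, which would explain why the error appears only "sometimes".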

Logged In: YES user_id=2402

> The only thing I could imagine is that the Script_Category class has a custom __hash__() method which returns a value that is sometimes a long, as it would be if it were based on id().

That was indeed the problem in my code (returning id(self)).

> It has always been documented that just returning id() in custom __hash__() methods doesn't work because of this

AFAIR, it was once documented that the default hash value is the id of an object. And I just found a message by the BDFL himself proclaiming so: http://python.project.cwi.nl/search/hypermail/python-recent/0168.html.

OTOH, I don't remember seeing anything about this in AMK's What's new in Python 2.x documents (but found an entry in NEWS.txt for some 2.5 alpha).

I've now changed all my broken __hash__ methods (not that many fortunately) but it might be a good idea to document this change in a more visible way.
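
For readers with the same kind of __hash__ method, one possible shape of the fix is sketched below. This is an assumption about what a repaired method could look like, not the code actually used in the application; the class body is a stand-in that reuses the Script_Category name from the report. The idea is to pass id(self) through hash(), which reduces any integer to the platform's hash range, instead of returning the raw id.

```python
import sys


class Script_Category(object):
    """Stand-in for the class in the report; only __hash__ matters here."""

    def __hash__(self):
        # hash() of an integer always lands in the native hash range,
        # unlike the raw id(), which may be a large non-negative number.
        return hash(id(self))


obj = Script_Category()
bits = sys.hash_info.width  # width of hash values in bits (Python 3.2+)

# The resulting hash stays within the signed range of that width.
print(-2**(bits - 1) <= hash(obj) < 2**(bits - 1))

# The object remains usable as a dictionary key.
print({obj: "ok"}[obj])
```

If identity-based hashing is all that is needed, leaving __hash__ undefined and relying on the default should work as well, provided the class does not override equality.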