Message 85884 - Python tracker (original) (raw)

I would support fixing the double call to PyObject_Hash(). For user defined classes with their own hash methods, this is by far the slowest part of the operation.

from my perspective creating an internal SetItem adds another function handling the data structure just as setdefault would

Incorrect comparison. Your in-lining manipulated the ep structure directly (not a good thing). In contrast, adding an alternative _PyDict_SetItemWithHash uses the insertdict() function, fully isolating itself of from the table implementation. The whole purpose of having insertdict() and lookdict() is to isolate the data structure internals from all externally visible functions.

setup = ''' class A(object): def hash(self): return 10 class B(object): pass a = A() b = B() ''' min(Timer('{}.setdefault(a, 0)', setup).repeat(7, 100000)) 0.12840011789208106 min(Timer('{}.setdefault(b, 0)', setup).repeat(7, 100000)) 0.053155359130840907

The double call to a very simple user defined hash adds .07 to call that takes on .05 with the double call to builtin object hash. So, we could save half of the the .07 just by elimating the double call to hash. With more complex hash functions the overall speedup is huge (essentially cutting the total work almost in half).