[Python-Dev] Saving the hash value of tuples (original) (raw)

Noam Raphael noamraph at gmail.com
Sat Apr 1 22:32:19 CEST 2006


Hello,

I've found out that the hash value of tuples isn't saved after it's calculated. With strings it's different: the hash value of a string is calculated only on the first call to hash(string), and saved in the structure for future use. Saving the value makes dict lookup of tuples an operation with an amortized cost of O(1).

Saving the hash value means that if an item of the tuple changes its hash value, the hash value of the tuple won't be changed. I think it's ok, since:

  1. Hash value of things shouldn't change.
  2. Dicts assume that too.

I tried the change, and it turned out that I had to change cPickle a tiny bit: it uses a 2-tuple which is allocated when the module initializes to lookup tuples in a dict. I changed it to properly use PyTuple_New and Py_DECREF, and now the complete test suite passes. I run test_cpickle before the change and after it, and it took the same time (0.89 seconds on my computer).

What do you think? I see three possibilities:

  1. Nothing should be done, everything is as it should be.
  2. The cPickle module should be changed to not abuse the tuple, but there's no reason to add an extra word to the tuple structure and break binary backwards compatibility.
  3. Both should be changed.

I will be happy to send a patch, if someone shows interest.

Have a good day, Noam



More information about the Python-Dev mailing list