[Python-Dev] Locked-in defect? 32-bit hash values on 64-bit builds

Reid Kleckner reid.kleckner at gmail.com
Fri Oct 15 23:35:25 CEST 2010


On Fri, Oct 15, 2010 at 4:10 PM, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:

On Oct 15, 2010, at 10:40 AM, Benjamin Peterson wrote:

I think the panic is a bit of an overreaction. PEP 384 has still not been accepted, and I haven't seen a final decision about freezing the ABI in 3.2.

Not sure where the "panic" seems to be. I just want to make sure the ABI doesn't get frozen before hash functions are converted to Py_ssize_t.

Even if the ABI is not frozen at 3.2 as Martin has proposed, it would still be great to get this in for 3.2.

Fortunately, this doesn't affect everyday users; it only arises for very large datasets. When it does kick in, though (around 2**32 entries), the degradation is not small. It is close to catastrophic, making dicts and sets unusable as O(1) lookups become O(n) with a very large n.

Just to be clear, hashing right now just uses the C long type. The only major platform where sizeof(long) < sizeof(Py_ssize_t) is 64-bit Windows, right? And the change being proposed is to make tp_hash return a Py_ssize_t instead of a long, and then make all the clients of tp_hash compute with Py_ssize_t instead of long?
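
To make the size question concrete, here is a rough sketch that can be run on any build. It uses ctypes to report the widths of C long and ssize_t (which I'm assuming matches Py_ssize_t on the platform), and a hypothetical helper, truncate_to_signed, to imitate what happens when a wider hash value is squeezed into a narrower signed integer. It is only an illustration of the concern, not the interpreter's actual hashing code.

```python
import ctypes

# Widths in bytes of the C types under discussion, as seen by this interpreter.
long_size = ctypes.sizeof(ctypes.c_long)
# Assumption: ssize_t has the same width as Py_ssize_t on this platform.
ssize_t_size = ctypes.sizeof(ctypes.c_ssize_t)


def truncate_to_signed(value, nbytes):
    """Hypothetical helper: keep the low nbytes*8 bits of value and
    reinterpret them as a signed two's-complement integer."""
    bits = 8 * nbytes
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


print("sizeof(long)    =", long_size)
print("sizeof(ssize_t) =", ssize_t_size)

# Two values that differ only above bit 31.
a = 1
b = 1 + (1 << 32)

# In a 4-byte hash they become indistinguishable; in an 8-byte hash they do not.
print("collide in 4 bytes:", truncate_to_signed(a, 4) == truncate_to_signed(b, 4))
print("collide in 8 bytes:", truncate_to_signed(a, 8) == truncate_to_signed(b, 8))
```

If the two reported sizes differ (as I would expect on 64-bit Windows), a hash returned as a long cannot take on more than 2**32 distinct values, no matter how large the container is allowed to grow.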

Reid


