[Python-Dev] Why not using the hash when comparing strings?

Duncan Booth duncan.booth at suttoncourtenay.org.uk
Fri Oct 19 12:02:19 CEST 2012


Hrvoje Niksic <hrvoje.niksic at avl.com> wrote:

On 10/19/2012 03:22 AM, Benjamin Peterson wrote:

It would be interesting to see how common it is for strings which have their hash computed to be compared.

Since all identifier-like strings mentioned in Python are interned, and therefore have had their hash computed, I would imagine comparing them to be fairly common. After all, strings are often used as makeshift enums in Python. On the flip side, those strings are typically small, so a measurable overall speed improvement brought by such a change seems unlikely.

I'm pretty sure it would result in a small slowdown.

Many (most?) of the comparisons against interned identifiers will be done by dictionary lookups, and the dictionary lookup code only tries the string comparison after it has determined that the hashes match. The only time the contents of dictionary key strings are actually compared is when the hashes match but the pointers are different; if the hashes don't match, the strings are never compared.
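The lookup order described above can be observed from Python with a minimal sketch. The `Key` class, its `forced_hash` parameter, and the `eq_calls_during_lookup` helper are hypothetical names invented for this illustration; they are not part of CPython. The forced hashes deliberately break the usual rule that equal objects hash equally, purely to make the ordering visible, and the counts assume CPython's dict implementation.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is called."""

    eq_calls = 0

    def __init__(self, text, forced_hash):
        self.text = text
        self._hash = forced_hash  # assumption: hash is fixed by hand for the demo

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.text == other.text


def eq_calls_during_lookup(table, key):
    """Return (found, number of __eq__ calls made by the lookup)."""
    Key.eq_calls = 0
    found = key in table
    return found, Key.eq_calls


stored = Key("spam", forced_hash=1)
table = {stored: "value"}

# Same object: the pointer check succeeds, no comparison is needed.
same_object = eq_calls_during_lookup(table, stored)

# Different hash (chosen to land in the same slot of a small table):
# the hash mismatch rules the entry out without comparing contents.
different_hash = eq_calls_during_lookup(table, Key("spam", forced_hash=9))

# Equal hash, different object: only now are the contents compared.
equal_hash_equal_text = eq_calls_during_lookup(table, Key("spam", forced_hash=1))
equal_hash_other_text = eq_calls_during_lookup(table, Key("eggs", forced_hash=1))

print(same_object, different_hash, equal_hash_equal_text, equal_hash_other_text)
```

Only the last two lookups call `__eq__` at all, which is why adding a hash check to the string comparison itself would mostly repeat work the dictionary has already done.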


