Issue 23107: Tighten-up search loops in sets

The lookkey functions currently check for an exact key match on every entry visited in the inner search loop. Move that test so it runs only after a matching hash is found, rather than on every entry. This gives a modest speed improvement.
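To make the reordering concrete, here is a minimal Python sketch of the idea. It is not the C code from the patch: the table layout, the linear probing, and the names `lookkey_baseline`, `lookkey_tight`, and `insert` are all assumptions made for illustration, and "exact key match" is interpreted here as an identity test (`is`).

```python
# Hypothetical model of a set's hash table: each slot is either None (unused)
# or a (hash, key) pair. Linear probing is a simplification chosen for brevity.
# The table length is assumed to be a power of two with at least one empty slot.

def lookkey_baseline(table, key, h):
    """Return the slot holding key, or the empty slot where it would go."""
    mask = len(table) - 1
    i = h & mask
    while True:
        entry = table[i]
        if entry is None:
            return i                      # empty slot: key is absent
        entry_hash, entry_key = entry
        if entry_key is key:              # exact-match test on every entry
            return i
        if entry_hash == h and entry_key == key:
            return i
        i = (i + 1) & mask


def lookkey_tight(table, key, h):
    """Same result, but the exact-match test runs only after a hash match."""
    mask = len(table) - 1
    i = h & mask
    while True:
        entry = table[i]
        if entry is None:
            return i
        entry_hash, entry_key = entry
        if entry_hash == h:               # cheap integer comparison first
            if entry_key is key or entry_key == key:
                return i
        i = (i + 1) & mask


def insert(table, key):
    """Store key in the table if it is not already present."""
    h = hash(key)
    i = lookkey_tight(table, key, h)
    if table[i] is None:
        table[i] = (h, key)


table = [None] * 8
for k in ("a", "b", 17, 25):              # 17 and 25 collide in an 8-slot table
    insert(table, k)
```

Both functions find the same slot for any key. The tight version simply skips the extra comparison on entries whose stored hash already rules them out, which is where any saving would come from.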

--- n = 10,000 ---
$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
1000 loops, best of 3: 396 usec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
1000 loops, best of 3: 367 usec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
1000 loops, best of 3: 375 usec per loop
$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
1000 loops, best of 3: 389 usec per loop

$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
1000 loops, best of 3: 656 usec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
1000 loops, best of 3: 657 usec per loop
$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
1000 loops, best of 3: 662 usec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
1000 loops, best of 3: 642 usec per loop

-- n = 1,000,000 --
$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
10 loops, best of 3: 67 msec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
10 loops, best of 3: 48.2 msec per loop
$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
10 loops, best of 3: 59.9 msec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&t'
10 loops, best of 3: 49.1 msec per loop

$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
10 loops, best of 3: 173 msec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
10 loops, best of 3: 152 msec per loop
$ ~/baseline/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
10 loops, best of 3: 170 msec per loop
$ ~/tight/python.exe -m timeit -s 'from time_tight import s,t,u' 's&u'
10 loops, best of 3: 167 msec per loop