[Python-Dev] Rethinking intern() and its data structure

Peter Otten __peter__ at web.de
Fri Apr 10 10:58:56 CEST 2009


John Arbash Meinel wrote:

> Not as big of a difference as I thought it would be... But I bet if there was a way to put the random shuffle in the inner loop, so you weren't accessing the same identical 25k keys internally, you might get more interesting results.

You can prepare a few random samples during startup:

$ python -m timeit -s"from random import sample; d = dict.fromkeys(xrange(10**7)); nextrange = iter([sample(xrange(10**7),25000) for i in range(200)]).next" "for x in nextrange(): d.get(x)"
10 loops, best of 3: 20.2 msec per loop

To put it into perspective:

$ python -m timeit -s"d = dict.fromkeys(xrange(10**7)); nextrange = iter([range(25000)]*200).next" "for x in nextrange(): d.get(x)"
100 loops, best of 3: 10.9 msec per loop
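
On Python 3, where xrange and the iterator's .next method no longer exist, the same comparison can be sketched with the timeit module directly. This is only a rough sketch, not a careful benchmark: the names N, K, PASSES and time_lookups are made up for illustration, and the sizes are scaled down from the 10**7 keys and 25000 lookups used above so that it finishes quickly.

```python
import random
import timeit

# Assumed, scaled-down sizes; the commands above use 10**7 keys,
# 25000 lookups per pass and 200 prepared batches.
N = 10**5       # number of keys in the dictionary
K = 2500        # lookups per timed pass
PASSES = 20     # number of prepared batches (one per timed pass)

d = dict.fromkeys(range(N))

# A different random sample for every pass, prepared before timing starts.
random_batches = [random.sample(range(N), K) for _ in range(PASSES)]
# The same K sequential keys for every pass, as in the second command.
sequential_batches = [list(range(K))] * PASSES


def time_lookups(batches):
    """Return the mean seconds per pass, using each batch exactly once."""
    it = iter(batches)

    def one_pass():
        for x in next(it):
            d.get(x)

    # number equals the batch count, so the iterator is never exhausted early.
    return timeit.timeit(one_pass, number=len(batches)) / len(batches)


random_time = time_lookups(random_batches)
sequential_time = time_lookups(sequential_batches)
print("random keys:     %.3f msec per pass" % (random_time * 1e3))
print("sequential keys: %.3f msec per pass" % (sequential_time * 1e3))
```

As in the one-liners, the samples are built before the timer starts, so only the d.get() calls are measured. The absolute numbers will depend on the machine and on the sizes chosen.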

Peter


