[Python-Dev] iterzip() (original) (raw)

Guido van Rossum guido@python.org
Mon, 29 Apr 2002 14:22:25 -0400


The cause on Win98SE is clear enough with a little instrumentation: in the justzip case, listobject.c's ins1 ends up copying the memory much more often, presumably due to that there's a new tuple allocation after every append (preventing realloc from growing into that space). justpush copies the memory at sizes (skipping some tiny ones, and where this is a count of the number of elements in the list, not the number of bytes (multiply by 4 to get the latter)):

247 639 959 2047 45055 163839 and that's all. justzip copies the memory at sizes: 247 639 959 2047 45055 53247 57343 65535 81919 98303 122879 163839 196607 229375 262143 294911 360447 393215 458751 589823 622591 688127 720895 753663 786431 819199 851967 884735 917503 950271 Copying 3-4MB multiple times at the end ain't cheap. justpush does relatively worse if the order in which they're called is swapped, as then justzip and the custom tuple free list conspire to leave memory in a more-fragmented state.

This suggests that zip() can be sped up considerably by using a hint for the input sequence sizes. If they all respond properly to PySequence_Length(), zip can preallocate the result list to the correct size. It shouldn't believe the numbers fully, but using them as a hint for initial allocation makes a lot of sense, and should greatly reduce the inefficiency.

--Guido van Rossum (home page: http://www.python.org/~guido/)