[Python-Dev] collections module (original) (raw)

Tim Peters tim.one at comcast.net
Sat Jan 10 00:26:23 EST 2004


[Guido]

You misunderstood -- the fields I was talking about are counters or pointers that keep track of the overallocation, and 1 or at most 2 is enough.

One would be enough. To avoid breaking huge mounds of code all over the place, I think the existing ob_item should "float", to point always to the current element "at index 0". So ob_item would change when inserting or deleting "at the left end". The only new thing you need then is a field recording the number of unused slots "left-side" slots, between the start of the memory actually allocated and the slot ob_item points at; that field would have to be added or subtracted to/from ob_item in the obvious ways when allocating or deallocating the raw memory, but most of listobject.c could ignore it, and none of the PyList macros would need to change (in particular, the speed of indexed access wouldn't change).

The overallocation strategy gets more complicated, though, and less effective for prepending. If you run out of room when prepending, it's not enough just to ask realloc() for more memory, you also have to memmove() the entire vector, to "make room at the left end". In previous lives I've found that "appending at the right" is as efficient as it is in practice largely because most non-trivial C realloc() calls end up extending the vector in-place, not needing to copy anything. The O() behavior is the same if each boost in size has to move everything that already existed, but the constant factor goes up more than just measurably .



More information about the Python-Dev mailing list