[Python-Dev] collections module (original) (raw)
Tim Peters tim.one at comcast.net
Sat Jan 10 15:05:26 EST 2004
- Previous message: [Python-Dev] collections module
- Next message: [Python-Dev] collections module
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
[Martin v. Loewis]
Shouldn't you malloc() in the case of prepending, and do the copying yourself? r ealloc can only avoid copying if it can append a free block. For the list, we know
a) realloc is unlikely to find a new block, anyway, as it is pymalloc, which never extends a small block.
[Guido]
Not for large lists. pymalloc passes large objects off to real malloc.
pymalloc is irrelevant here: it's not used for list guts, just for list headers. This is an efficiency hack, to speed the append-at-the-end use case: as Martin says, pymalloc cannot extend in-place, so not using pymalloc saves the cycles pymalloc would have consumed copying small list guts repeatedly while the list was growing, and, after the list got large, re-deducing over and over that it has to redirect to the system realloc. This expense is exacerbated because Python doesn't keep track of how much it's overallocated, it always calls realloc() on an append, (but "almost always" calls realloc() with the same size it called realloc() with the last time).
b) even if realloc would be able to extend the block, we still would have to copy the data.
For really large lists, you'd risk running out of memory due to the malloc(1GB), while realloc(1GB) very likely just extends (since such a large block effectively becomes in a memory segment of its own).
Right, that's the idea.
I don't know whether the overallocation would make similar-sized room at both ends, or whether it would take the current operation into account. For the specific case of queues, we may find that doing just the copying might be sufficient, with no need to go to the allocator.
For example, in this steady-state queue:
q = [x]
while True:
q.append(x)
q.pop(0)
q is never empty then, but never contains more than 1 element.
Hm. An idea: don't bother overallocating when inserting in the front, but do keep the free space in the front if possible.
I'm not clear on what that means, and the devil's in the details.
Then recommend that queues are implemented using append(x) and pop(0), rather than using insert(0, x) and pop().
Given the way realloc() works, I expect that add-at-the-right will always be favored. If someone does "insert at the left", though, and we're out of room, I'd give "the extra space" to the left end (we have to do a memmove in that case regardless, and shifting stuff right N slots doesn't cost more than shifting stuff right 1 slot). There are many possible strategies, and it's worth some pain to make dequeue use of lists reasonably zippy too.
- Previous message: [Python-Dev] collections module
- Next message: [Python-Dev] collections module
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]