[Python-Dev] collections module
Delaney, Timothy C (Timothy) tdelaney at avaya.com
Sun Jan 11 21:08:33 EST 2004
From: Tim Peters

>     q = [incoming() for i in range(1000)]
>     while True:
>         q.append(incoming())
>         consume(q.pop(0))
>
> We currently realloc() q on every append(), rounding up its size, where "its size" is taken from ob_size. That would no longer be correct: ob_size stays steady at 1000, but append needs ever-larger indices relative to the start of the allocated memory. IOW, while ob_size bears a useful relationship to the amount of memory we actually allocate today, that relationship would no longer hold (ob_size would become an arbitrarily poor lower bound). So what are we going to do instead, exactly?
>
> We actually allocate enough room for 1024 elements now (although Python doesn't currently remember that). If we start to remember when we're out of room, and memmove() the whole blob left 24 slots when append hits the end of available room, then every 24 (+-1 -- doesn't matter at this level) iterations of the loop we end up memmove'ing 1000 pointers. If the steady-state backlog were, say, 1023 elements instead, we'd end up memmove'ing 1023 pointers on every loop iteration, and each append becomes a linear-time operation.
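To put numbers on the steady-state cost, here's a tiny pure-Python simulation of that scheme (hypothetical, not CPython internals -- simulate_naive, BLOCK and BACKLOG are names invented here, and the boundary arithmetic is only accurate to +-1):

    # Simulate: a 1024-slot block holding a 1000-element steady-state
    # queue.  pop(0) just advances a start offset; append memmoves the
    # whole backlog back to offset 0 when it would run off the end.
    BLOCK = 1024
    BACKLOG = 1000

    def simulate_naive(iterations):
        start = 0   # offset of the first live element in the block
        moved = 0   # total pointers memmove'd
        for _ in range(iterations):
            start += 1                          # pop(0): no copying
            if start + (BACKLOG - 1) >= BLOCK:  # append would overflow
                moved += BACKLOG - 1            # memmove backlog to offset 0
                start = 0
        return moved

    print(simulate_naive(10000))
    # ~1000 pointers moved every ~24 iterations, as described above.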
Hmm - could we do both? We could choose what to do based on the percentage of the memory block actually in use at the time we hit the end of the block. For example, if we hit the end of the block, and over 75% of it is used, we realloc, then memmove to the beginning of the block. If less than 75% of the memory block is used, we just do the memmove.
So in the above case, when we've got 1024 elements allocated with 1000 entries in the list, we hit the end. More than 75% of the block is in use, so we realloc to (e.g.) 2048 elements and memmove the 1000 entries to the start of the block. After another 1048 times around the loop we hit the end of the block again. This time less than 75% of the block is used, so we just memmove all the entries to the start of the block. This then becomes the new steady state: instead of memmove'ing 1000 pointers every 24 times round the loop, we memmove 1000 pointers every 1048 times round.
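Extending the same toy harness (again, invented names; I'm assuming a simple doubling realloc) shows that new steady state:

    # Hybrid rule: on hitting the end of the block, realloc (doubling)
    # and memmove when occupancy is over 75%, else just memmove.
    BACKLOG = 1000

    def simulate_hybrid(iterations, block=1024):
        start = 0
        moved = 0
        reallocs = 0
        for _ in range(iterations):
            start += 1                          # pop(0): no copying
            if start + (BACKLOG - 1) >= block:  # append would overflow
                if (BACKLOG - 1) > 0.75 * block:
                    block *= 2                  # realloc, e.g. doubling
                    reallocs += 1
                moved += BACKLOG - 1            # memmove backlog to offset 0
                start = 0
        return moved, reallocs, block

    print(simulate_hybrid(100000))
    # One realloc to 2048, then ~1000 pointers moved every ~1048 iterations.

The exact constants obviously depend on the 75% threshold and the growth factor; the point is that the memmove frequency drops from once every ~24 appends to once every ~1048.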
I think that takes care of the steady-state problem (or at least reduces it by an order of magnitude or two). The question is whether we would end up with more or less memory in use in the normal case. I suspect we wouldn't use more.
Tim Delaney