[Python-Dev] pymalloc killer (original) (raw)
Marangozov, Vladimir (Vladimir) vmarangozov@optimay.com
Thu, 4 Apr 2002 10:57:30 +0200
- Previous message: [Python-Dev] ANN: Pyrex - a language for writing Python extension modules
- Next message: [Python-Dev] pymalloc killer
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
[me]
>... > In other words, the lastoffset acts as an upper bound watermark.
[Tim]
I added a bunch of low-level overviews (arean mgmt, pool mgmt, block mgmt) as comment blocks over the last week, and noted the possibility for this in one of them. Another possibility is to get rid of the runtime divisions, and the poolheader capacity member, by exploiting that
pool->capacity = (POOLSIZE - POOLOVERHEAD) / size; is invariant across all pools with the same szidx; i.e., all references to pool->capacity could be replaced by capacity[pool->szidx], where "capacity" is a new small (at most 32 uints) file-static const vector computed at compile-time.
The comparison in the "try to extend the free list" code is executed very frequently, so probing a static vector this way will slow down the thing noticeably. I've tried that. We are very much running after speed here.
[me]
> I didn't want to do that optimization before, because it would have > resulted in a bigger pool header and waste of space. Now it's ok.
[Tim]
So now the tradeoff is a little muddier: If I skipped the multiplication optimization, but did the division optimization, each pool would suddenly gain 8 more bytes (on 32-bit boxes) to use for blocks. If I did both, the poolheader size would remain as it is now, "halfway between" multiple-of-8-byte sizes. How do you judge the tradeoffs now (do only one, or do both)?
Do both, despite that the multiplication optimization is more important. (for one division we have #blocks multiplications for a full pool)
I think that adding back one uint slot (lastoffset) is not a problem (and we end up with the perfect balance of having a pool header layout with 4 pointers and 4 integers ).
Now, with the advent of lastoffset, we can easily get rid off the division as well. For this, we switch to offset calculus for figuring out whether the free list can be extended. This means that 'capacity' becomes 'highoffset'. The 'highoffset' is the highwater not to be exceeded by more than 'size' bytes by 'lastoffest'.
a) the test becomes then: if (pool->lastoffset <= pool->highoffset) ... pool->lastoffset += size; pool->freeblock = (block *)pool +
pool->lastoffset; b) in the pool_init code: pool->lastoffset = POOL_OVERHEAD + size; pool->highoffset = POOL_SIZE - size - size; and you'll need to adjust the code accordingly at the places where you're using 'capacity' per se (and adapt the comments, of course).
This should pretty much settle these murky optimizations for now.
Cheers, Vladimir
- Previous message: [Python-Dev] ANN: Pyrex - a language for writing Python extension modules
- Next message: [Python-Dev] pymalloc killer
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]