[Python-Dev] pymalloc killer (original) (raw)
Tim Peters tim.one@comcast.net
Fri, 29 Mar 2002 13:50:54 -0500
- Previous message: [Python-Dev] pymalloc killer
- Next message: [Python-Dev] pymalloc killer
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I would not like to see such a binary search performed.
Duh .
Instead, if anything needs to be done, I'd be in favour of using a constant-time algorithm, even if it means that a littler more memory overhead is necessary.
I have the following idea: each chunk allocated by ought ^ Allocated by what? Some crucial words are missing here. Note that I am not proposing to change PyMem_{MALLOC, Malloc): that continues to map directly to system malloc(), so we have no control over, nor even knowledge of, the memory allocated by PyMem_{MALLOC, Malloc}. We can't redirect those to pymalloc's malloc without introducing locks to prevent multithreaded insanity, or "declaring by fiat" that existing working code is now horribly broken in ways we can't help users detect.
to have a pointer to its pool, immediately preceding the memory block.
In what fundamental way does this differ from pymalloc's current scheme of storing the pool address at an address calculated from the pointer address? I suspect the real difference resides in the next point:
This will make an overhead of 4 bytes per allocation. Chunks allocated by the system allocator will have a null pointer preceding them.
That's the killer. We have no control over "the system allocator", by which I mean libc's malloc() and realloc(). As above, trying to hijack them is unattractive due to thread issues, even for just the PyMem_{Malloc, MALLOC, Realloc, REALLOC, New, NEW, Resize, RESIZE} spellings.
To deal with alignment, the size classes would increase each by 4 bytes, so they would be spaced at 12, 20, 28, etc. bytes. With the 4 byte overallocation, each chunk would be 8-aligned, if the previous chunk in the same pool is.
If we take over the system malloc, we need also to mimic the platform malloc's alignment. Pymalloc's object allocator can get away with 8-byte alignment because the core never requires worse-than-double alignment, and I'm willing to say that this is a language implementation restriction. I don't believe we can also restrict PyMem_umpteen_ways_to_spell_get_memory that way: they have always given platform-malloc alignment. If we could, I agree your scheme is fine, although I'm not sure why it stores a whole pointer when it's making a 1-bit distinction ("does or does not reside in a pymalloc pool").
This approach would allow to remove the requirement that each pool must be 4k-aligned.
OK, that's why it really stores a whole pointer. Agreed then, but the memory lost to achieve page alignment once per-arena is likely small compared to adding 4 bytes to every allocation unit. The former loses at most 4KB per 256KB.
To support the NULL pointer in a system-malloc'ed chunk, pymalloc would overallocate 8 bytes if it defers to system malloc, to preserve alignment.
What do you think?
Primarily that hijacking the system malloc is fraught with new difficulties. If we can hijack it, so that pymalloc's free always knows that addresses passed to it came from pymalloc's malloc/realloc originally, then I agree it's easy to do something cheap to distinguish pool from non-pool memory. The scheme I sketched is as hairy as it is exactly because it does the right thing even when freeing an address obtained directly from the platform malloc/realloc.
- Previous message: [Python-Dev] pymalloc killer
- Next message: [Python-Dev] pymalloc killer
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]