[Python-Dev] Int FutureWarnings and other 2.4 TODOs (original) (raw)

Tim Peters tim.one at comcast.net
Fri Dec 5 13:05:39 EST 2003


[Tim]

BTW, using pymalloc instead wouldn't eliminate that allocated memory is never returned to the OS. pymalloc grabs large "arenas" from malloc(), and never free()s them. The fundamental difference is that pymalloc is type-agnostic: space gotten from pymalloc for, say, a dict, can later be reused for, say, a string. pymalloc's freelists are segregated by raw memory size rather than by type, and a given chunk of pymalloc memory may even move from one size list to another (any number of times).

[Guido]

But there aren't a lot of other objects that fit in the size of an int, right?

That's true, but doesn't matter .

I know of no other types that are <= 12 bytes on a 32-bit machine, and assuming pymalloc rounds up, only one that is <= 16 (float). There may be some obscure ones (super may be only 16 bytes too) but they'll never need that many instances to fill the void left behind by range (10000000).

(I'm assuming that pymalloc doesn't join adjacent small blocks to form a larger block

That's the part that's missing -- it does, but cleverly and efficiently.

-- that way lies a full malloc reimplementation, and what's the point of that...)

Large "arenas" are cut into many equal-sized "pools", and pools are in turn cut into equal-size objects. All objects within a given pool have the same size. When all the objects in a given pool are freed, the entire pool is available for reuse by objects of any size. There's a little pressure favoring reusing an entirely-free pool for objects of the same size as it was last used for, because that saves some overhead (for example, some size-dependent info is computed and stored in the pool header, and that part can be skipped if the pool is used for objects of the same size as last time). pymalloc will change the size associated with an entirely-free pool before it will ask for another arena, though.

So, e.g., if ints used pymalloc, range(10000000) would carve out enough pools to hold 10 million ints, but when range(10000000) went away almost all those pools would become entirely free, and so available for reuse by objects of any size. The coalescing here is not done on an object-by-object basis. Instead the pool header keeps a count of the number of slots in that pool currently allocated. When an object in the pool is freed, that count is decremented. If the count becomes 0 then, the entire pool can be recycled without further ado, in constant time. If the count does not become 0 then, the newly-freed object is put at the head of a singly-linked list of free objects within the pool, which is also a constant-time operation (IOW, the objects within a pool act a lot like the dedicated int freelist; if the pool is recycled to hold objects of a different size, the freelist links within it are simply ignored and overwritten).

A more-general malloc has to muck around asking whether the objects on either side of a newly-freed object are also free, so that it can coalesce them. pymalloc never does that. OTOH, pymalloc is emphatically not a general-purpose allocator: it doesn't even try to deal with requests for more than about 256 bytes (those get passed on to the system malloc). It relies (for performance) on that the vast bulk of memory requests Python makes are for small chunks of memory. Vladimir did a great job of designing this! Ignoring cache effects, the dedicated int freelist is faster. I'm not sure it's always faster if cache effects are considered too (it's complex, and pymalloc is very cache-conscious).



More information about the Python-Dev mailing list