[Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

Victor Stinner victor.stinner at gmail.com
Wed Jun 19 17:24:21 CEST 2013


2013/6/19 Antoine Pitrou <solipsis at pitrou.net>:

> On Tue, 18 Jun 2013 22:40:49 +0200, Victor Stinner <victor.stinner at gmail.com> wrote:

>> Other changes
>> -------------
>>
>> [...]
>>
>> * Configure external libraries like zlib or OpenSSL to allocate memory using PyMem_RawMalloc()
>
> Why so, and is it done by default?

(Oh, I realized that PyMem_Malloc() may be used instead of PyMem_RawMalloc() if we are sure that the library will only be used when the GIL is held.)

"is it done by default?"

First, it would be safer to reuse the PyMem_RawMalloc() allocator only if PyMem_SetRawAllocator() was called, just to avoid regressions in Python 3.4.

Then, it depends on the library: if the allocator can be replaced per library object (e.g. expat supports this), it can always be replaced. Otherwise, we should replace the library allocator only if Python is a standalone program (not if Python is embedded). That's why I asked whether it is possible to check if Python is embedded or not.
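The policy described above could be sketched roughly as follows. This is only an illustration: `LibraryInfo` and `should_replace_allocator()` are hypothetical names, not part of Python or of any library, and how to detect the "embedded" case is left open.

```c
#include <stdbool.h>

/* Hypothetical description of an external library (illustration only). */
typedef struct {
    /* True if the allocator can be set per library object, as with expat. */
    bool per_object_allocator;
} LibraryInfo;

/* Decide whether Python should install its own allocator in the library.
 *   custom_raw_allocator: the application replaced the raw allocator
 *   embedded:             Python runs inside a host application */
bool
should_replace_allocator(const LibraryInfo *lib,
                         bool custom_raw_allocator, bool embedded)
{
    if (!custom_raw_allocator) {
        /* Keep the library default to avoid regressions. */
        return false;
    }
    if (lib->per_object_allocator) {
        /* Only objects created by Python are affected. */
        return true;
    }
    /* A process-wide change is only acceptable in a standalone Python. */
    return !embedded;
}
```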

"Why so,"

For the "track memory usage" use case, it is important to track memory allocated in external libraries to have accurate reports, because these allocations may be huge.
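To illustrate why untracked allocations distort reports, here is a minimal sketch of a byte-counting wrapper around malloc()/free(). The names `tracked_malloc()`, `tracked_free()` and `tracked_bytes` are hypothetical and this is not the PEP's implementation; any allocation that bypasses such a wrapper (for example inside an external library) is invisible to the counter.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Total number of bytes currently allocated through the wrapper. */
size_t tracked_bytes = 0;

/* Header stored before each block so that tracked_free() knows its size.
 * The union keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} BlockHeader;

void *
tracked_malloc(size_t size)
{
    BlockHeader *header;

    if (size > SIZE_MAX - sizeof(BlockHeader)) {
        return NULL;  /* size + header would overflow */
    }
    header = malloc(sizeof(BlockHeader) + size);
    if (header == NULL) {
        return NULL;
    }
    header->size = size;
    tracked_bytes += size;
    return header + 1;  /* user data starts right after the header */
}

void
tracked_free(void *ptr)
{
    BlockHeader *header;

    if (ptr == NULL) {
        return;
    }
    header = (BlockHeader *)ptr - 1;
    tracked_bytes -= header->size;
    free(header);
}
```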

>> Only one get/set function for block allocators
>> ----------------------------------------------
>>
>> Replace the 6 functions:
>>
>> * void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)
>> * void PyMem_GetAllocator(PyMemBlockAllocator *allocator)
>> * void PyObject_GetAllocator(PyMemBlockAllocator *allocator)
>> * void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)
>> * void PyMem_SetAllocator(PyMemBlockAllocator *allocator)
>> * void PyObject_SetAllocator(PyMemBlockAllocator *allocator)
>>
>> with 2 functions with an additional domain argument:
>>
>> * ``int PyMem_GetBlockAllocator(int domain, PyMemBlockAllocator *allocator)``
>> * ``int PyMem_SetBlockAllocator(int domain, PyMemBlockAllocator *allocator)``
>
> I would much prefer this solution.

I don't have a strong preference between these two choices.

Oh, one argument in favor of one generic function is that code using these functions would be simpler. Extract of the unit test of the implementation (_testcapi.c):

With a generic function, this block can be replaced with a single function call.

>> Drawback: the caller has to check if the result is 0, or handle the error.
>
> Or you can just call Py_FatalError() if the domain is invalid.

I don't like Py_FatalError(), especially when Python is embedded. It's safer to return -1 and expect the caller to check for the error case.
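As a rough sketch of the "one generic function, return -1 on error" variant: the types and functions below (`BlockAllocator`, `get_block_allocator()`, `set_block_allocator()`, the `DOMAIN_*` constants) are hypothetical stand-ins written for this illustration, not the actual API proposed in the PEP.

```c
#include <stddef.h>

/* Hypothetical allocator domains (illustration only). */
enum { DOMAIN_RAW = 0, DOMAIN_MEM = 1, DOMAIN_OBJ = 2, DOMAIN_COUNT = 3 };

/* Hypothetical stand-in for the block allocator structure. */
typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*release)(void *ctx, void *ptr);
} BlockAllocator;

/* One slot per domain; zero-initialized at program start. */
static BlockAllocator allocators[DOMAIN_COUNT];

/* Copy the allocator of the given domain into *allocator.
 * Return 0 on success, -1 if the domain is invalid. */
int
get_block_allocator(int domain, BlockAllocator *allocator)
{
    if (domain < 0 || domain >= DOMAIN_COUNT) {
        return -1;
    }
    *allocator = allocators[domain];
    return 0;
}

/* Install *allocator for the given domain.
 * Return 0 on success, -1 if the domain is invalid. */
int
set_block_allocator(int domain, const BlockAllocator *allocator)
{
    if (domain < 0 || domain >= DOMAIN_COUNT) {
        return -1;
    }
    allocators[domain] = *allocator;
    return 0;
}
```

With this shape, code that saves, replaces, and restores the allocators can loop over the domains instead of calling three separate getters and three separate setters, at the cost of checking the return value.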

>> If a hook is used to track memory usage, the malloc() memory will not be seen. Remaining malloc() calls may allocate a lot of memory and so would be missed in reports.
>
> A lot of memory? In main()?

Not in main(). The Python expat and zlib modules call malloc() directly and may allocate large blocks. External libraries like OpenSSL or bz2 may also allocate large blocks.

See issues #18203 and #18227.

Victor


