bpo-34784: Posixmodule: Heap StructSequence by eduardo-elizondo · Pull Request #9665 · python/cpython (original) (raw)
I finally updated the change to address all of your comments! I managed to fully clean up all the memory allocations without the need of an extra flag. Here's the rundown of the issues and the fixes:
Leak of PyMemberDef:
StructSequences have to dynamically create the PyMemberDefs
that will be passed on to the type's tp_members. PyStructSequence_InitType2
just calls PyMem_New
and never cleans up that memory. In my original implementation of PyStructSequence_NewType
, I was just replicating the same behavior.
This can be improved by looking at how new types get allocated. If we look into the implementation of type_new
, we'll find that tp_members
is set by: PyHeapType_GET_MEMBERS
. This macro basically looks past the size of the PyHeapTypeObject
to pull the PyMemberDefs
out. Looking back, we can see that type_new
allocates PyHeapTypeObjects
by calling (PyTypeObject *)metatype->tp_alloc(metatype, nslots);
. This means that it counts the number of PyMemberDefs
that the type will need and uses that to allocate extra space past the size of the object. That's how PyHeapType_GET_MEMBERS
is able to pull these members out. Finally, that extra space of memory is cleaned up with the whole object, meaning that it doesn't need any extra deallocation step.
Going back to StructSequences, we can easily reuse all of that machinery to clean up the dynamically allocated PyMemberDefs
. First, we start by modifying PyType_FromSpecWithBases
so that it allocates all the extra space it needs by counting the number of slots that the object will have and passing that to PyType_GenericAlloc
. Then, when all the slots are being iterated over, we add a special case for Py_tp_members
, just like there is one for Py_tp_doc
. Using this, we should be able to copy the dynamically allocated PyMemberDefs
into the newly added space of memory at the end of the PyHeapTypeObject
. This means that now the type owns it's PyMemberDefs
and has no need for the dynamically created ones. Finally, we add a PyMem_FREE
after calling PyType_FromSpecWithBases
within PyStructSequence_NewType
to deallocate the dynamically created PyMemberDefs
.
Leak of PyType_Slots:
It turns out that all the slots (modulo Py_tp_doc
and Py_tp_members
) are static pointers which only get copied over to the appropriate offset in the PyHeapTypeObject
. For Py_tp_doc
, a new copy is made (as it points to a static literal), and Py_tp_members
was addressed above. This means that PyType_Slots
is not really needed after running PyType_FromSpecWithBases
. Therefore, I moved this into a local variable which just gets cleaned up once it gets out of scope.
Leak of PyType_Spec:
Following the same approach, if we follow each of the uses of PyType_Spec
within PyType_FromSpecWithBases
we'll find that all the integer values (basicsize
, itemsize
, and flags
) get copied over. slots
, as we've already identified, can be easily deleted. Finally, name
is mostly being used as a read-only variable during the execution of PyType_FromSpecWithBases
(i.e. for error messages). There are only two places where it requires to persist the string. The first, when setting the PyHeapTypeObject
's ht_name
but that already creates a PyUnicode
so it's fine. The second, when setting the type's tp_name
. In this case, this only copies the pointer and it works under the condition that the original PyStructSequence_Descr
remains alive during the lifetime of the PyHeapTypeObject
. Given that all of our PyStructSequence_Descr
are static, this should just work. Otherwise, if a use case arrises for a non-static PyStructSequence_Descr
, we can just create a copy of the string for the type's tp_name
- just like Py_tp_doc
. Therefore, I moved this PyType_Spec into a local variable which just gets cleaned up once it gets out of scope.
Testing:
To prove that this doesn't actually leak, I followed Neil's advice and added a new test_capi test. This tracks the refcounts and the allocation blocks to prove that this implementation does not leak.