bpo-34784: Posixmodule: Heap StructSequence by eduardo-elizondo · Pull Request #9665 · python/cpython (original) (raw)

@encukou @nascheme

I finally updated the change to address all of your comments! I managed to fully clean up all the memory allocations without the need of an extra flag. Here's the rundown of the issues and the fixes:

Leak of PyMemberDef:

StructSequences have to dynamically create the PyMemberDefs that will be passed on to the type's tp_members. PyStructSequence_InitType2 just calls PyMem_New and never cleans up that memory. In my original implementation of PyStructSequence_NewType, I was just replicating the same behavior.

This can be improved by looking at how new types get allocated. If we look into the implementation of type_new, we'll find that tp_members is set by: PyHeapType_GET_MEMBERS. This macro basically looks past the size of the PyHeapTypeObject to pull the PyMemberDefs out. Looking back, we can see that type_new allocates PyHeapTypeObjects by calling (PyTypeObject *)metatype->tp_alloc(metatype, nslots);. This means that it counts the number of PyMemberDefs that the type will need and uses that to allocate extra space past the size of the object. That's how PyHeapType_GET_MEMBERS is able to pull these members out. Finally, that extra space of memory is cleaned up with the whole object, meaning that it doesn't need any extra deallocation step.

Going back to StructSequences, we can easily reuse all of that machinery to clean up the dynamically allocated PyMemberDefs. First, we start by modifying PyType_FromSpecWithBases so that it allocates all the extra space it needs by counting the number of slots that the object will have and passing that to PyType_GenericAlloc. Then, when all the slots are being iterated over, we add a special case for Py_tp_members, just like there is one for Py_tp_doc. Using this, we should be able to copy the dynamically allocated PyMemberDefs into the newly added space of memory at the end of the PyHeapTypeObject. This means that now the type owns it's PyMemberDefs and has no need for the dynamically created ones. Finally, we add a PyMem_FREE after calling PyType_FromSpecWithBases within PyStructSequence_NewType to deallocate the dynamically created PyMemberDefs.

Leak of PyType_Slots:
It turns out that all the slots (modulo Py_tp_doc and Py_tp_members) are static pointers which only get copied over to the appropriate offset in the PyHeapTypeObject. For Py_tp_doc, a new copy is made (as it points to a static literal), and Py_tp_members was addressed above. This means that PyType_Slots is not really needed after running PyType_FromSpecWithBases. Therefore, I moved this into a local variable which just gets cleaned up once it gets out of scope.

Leak of PyType_Spec:
Following the same approach, if we follow each of the uses of PyType_Spec within PyType_FromSpecWithBases we'll find that all the integer values (basicsize, itemsize, and flags) get copied over. slots, as we've already identified, can be easily deleted. Finally, name is mostly being used as a read-only variable during the execution of PyType_FromSpecWithBases (i.e. for error messages). There are only two places where it requires to persist the string. The first, when setting the PyHeapTypeObject's ht_name but that already creates a PyUnicode so it's fine. The second, when setting the type's tp_name. In this case, this only copies the pointer and it works under the condition that the original PyStructSequence_Descr remains alive during the lifetime of the PyHeapTypeObject. Given that all of our PyStructSequence_Descr are static, this should just work. Otherwise, if a use case arrises for a non-static PyStructSequence_Descr, we can just create a copy of the string for the type's tp_name - just like Py_tp_doc. Therefore, I moved this PyType_Spec into a local variable which just gets cleaned up once it gets out of scope.

Testing:
To prove that this doesn't actually leak, I followed Neil's advice and added a new test_capi test. This tracks the refcounts and the allocation blocks to prove that this implementation does not leak.