Python garbage collector (Unofficial Python Development, Victor's notes)

Reference documentation by Pablo Galindo Salgado: https://devguide.python.org/garbage_collector/

Py_TPFLAGS_HAVE_GC

The garbage collector does not track an object if its type does not have the Py_TPFLAGS_HAVE_GC flag.

If a type has the Py_TPFLAGS_HAVE_GC flag, a PyGC_Head structure is allocated at the beginning of the memory block when an object is allocated, but the PyObject* pointer points just after this structure. The _Py_AS_GC(obj) macro gets a PyGC_Head* pointer from a PyObject* pointer using pointer arithmetic: ((PyGC_Head *)(obj) - 1).
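The extra header can be observed indirectly from Python. The sketch below relies on the documented behavior of sys.getsizeof(), which adds the garbage collector overhead on top of the value returned by the object's __sizeof__() method. The helper name gc_header_overhead is hypothetical, and the exact byte count depends on the platform and on the build, so no value is asserted here.

```python
import sys


def gc_header_overhead(obj):
    """Return the bytes sys.getsizeof() adds on top of obj.__sizeof__().

    Hypothetical helper: for GC-managed objects on a default CPython build,
    this difference is expected to cover the PyGC_Head header.
    """
    return sys.getsizeof(obj) - obj.__sizeof__()


# int does not have Py_TPFLAGS_HAVE_GC: no header is accounted for.
int_overhead = gc_header_overhead(42)

# list has Py_TPFLAGS_HAVE_GC: the header is accounted for (build-dependent size).
list_overhead = gc_header_overhead([])

print("int overhead: ", int_overhead)
print("list overhead:", list_overhead)
```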

See also the PyObject_IS_GC() function, which uses the PyTypeObject.tp_is_gc slot. An object has the PyGC_Head header if PyObject_IS_GC() returns true. For a type object, the tp_is_gc slot function checks whether the type is a heap type (has the Py_TPFLAGS_HEAPTYPE flag): static types don’t have the PyGC_Head header.
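The difference between static types and heap types can be checked from Python with gc.is_tracked() and type.__flags__. This is only a sketch: the constant below mirrors the value of Py_TPFLAGS_HEAPTYPE in CPython's Include/object.h (1 << 9) and is redefined here because Python does not export it, and the class name HeapType is hypothetical.

```python
import gc

# Mirrors Py_TPFLAGS_HEAPTYPE from Include/object.h (redefined here as an assumption).
PY_TPFLAGS_HEAPTYPE = 1 << 9


class HeapType:
    """A class created at runtime is a heap type."""


def is_heap_type(tp):
    """Check the Py_TPFLAGS_HEAPTYPE bit of a type object."""
    return bool(tp.__flags__ & PY_TPFLAGS_HEAPTYPE)


# int is a static type: tp_is_gc reports that it has no PyGC_Head header,
# so the GC cannot track it.
static_is_heap = is_heap_type(int)
static_tracked = gc.is_tracked(int)

# A class defined in Python is a heap type: it has the header and is tracked.
heap_is_heap = is_heap_type(HeapType)
heap_tracked = gc.is_tracked(HeapType)

print("int:     ", static_is_heap, static_tracked)
print("HeapType:", heap_is_heap, heap_tracked)
```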

Implement the GC protocol in a type

Example of a dealloc function:

    static void
    abc_data_dealloc(_abc_data *self)
    {
        PyTypeObject *tp = Py_TYPE(self);
        // ... release resources ...
        tp->tp_free(self);
    #if PY_VERSION_HEX >= 0x03080000
        Py_DECREF(tp);
    #endif
    }

On Python 3.7 and older, Py_DECREF(tp); is not needed. This changed in Python 3.8, where instances of heap types hold a strong reference to their type: see bpo-35810.

PyType_GenericAlloc() allocates memory and immediately tracks the newly created object, even if its memory is uninitialized: its traverse function must support uninitialized objects. Python 3.11 adds a private function, _PyType_AllocNoTrack(), which allocates memory without tracking the object. The caller can then track the object (PyObject_GC_Track(self)) only once it’s fully initialized, which simplifies the traverse function.

&PyBaseObject_Type (without Py_TPFLAGS_HAVE_GC):

&PyType_Type (with Py_TPFLAGS_HAVE_GC):

&PyDict_Type (with Py_TPFLAGS_HAVE_GC):
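As a quick check of the flag on these three types, the sketch below reads type.__flags__ from Python. The constant mirrors the value of Py_TPFLAGS_HAVE_GC in CPython's Include/object.h (1 << 14) and is redefined here as an assumption, since Python does not export it. The helper name has_gc_flag is hypothetical.

```python
# Mirrors Py_TPFLAGS_HAVE_GC from Include/object.h (redefined here as an assumption).
PY_TPFLAGS_HAVE_GC = 1 << 14


def has_gc_flag(tp):
    """Check the Py_TPFLAGS_HAVE_GC bit of a type object."""
    return bool(tp.__flags__ & PY_TPFLAGS_HAVE_GC)


# object, type and dict are the Python names of
# PyBaseObject_Type, PyType_Type and PyDict_Type.
flags = {tp.__name__: has_gc_flag(tp) for tp in (object, type, dict)}

for name, value in flags.items():
    print(f"{name}: Py_TPFLAGS_HAVE_GC={value}")
```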

gc.collect()

CPython uses 3 garbage collector generations. Default thresholds (gc.get_threshold()):
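The thresholds and the current per-generation counters can be read at runtime with the gc module. The values are not hard-coded in this sketch because they depend on the Python version and on any earlier call to gc.set_threshold().

```python
import gc

# One threshold per generation: (threshold0, threshold1, threshold2).
thresholds = gc.get_threshold()

# Current collection counters, in the same order as the thresholds.
counts = gc.get_count()

print("thresholds:", thresholds)
print("counts:    ", counts)

# gc.collect() accepts an optional generation number and returns
# the number of unreachable objects found.
found = gc.collect(0)
print("unreachable objects found in generation 0:", found)
```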

The main function of the GC is gc_collect_main() in Modules/gcmodule.c: it collects objects of a generation. The function relies on the PyGC_Head structure. Simplified algorithm:

The exact implementation is more complicated.
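To give an idea of the core of cycle detection, here is a toy model in pure Python. It is not the CPython implementation: the Node class and the find_unreachable() function are hypothetical, and the sketch only mimics the general idea of copying reference counts, subtracting references that come from inside the collected set, and treating what remains as roots. It ignores finalizers, weak references, generations and everything else that makes the real code more complicated.

```python
class Node:
    """Toy stand-in for a GC-tracked object (hypothetical, not a CPython type)."""

    def __init__(self, name, external_refs=0):
        self.name = name
        self.refs = []                      # references to other nodes in the set
        self.external_refs = external_refs  # references from outside the set


def find_unreachable(nodes):
    """Return the nodes that are only kept alive by references inside `nodes`."""
    # Reference count of each node: external references + incoming internal ones.
    refcount = {node: node.external_refs for node in nodes}
    for node in nodes:
        for target in node.refs:
            refcount[target] += 1

    # Copy the reference counts, then subtract every internal reference.
    gc_refs = dict(refcount)
    for node in nodes:
        for target in node.refs:
            gc_refs[target] -= 1

    # Nodes with gc_refs > 0 are referenced from outside: they are roots.
    # Everything reachable from a root is alive.
    reachable = set()
    stack = [node for node in nodes if gc_refs[node] > 0]
    while stack:
        node = stack.pop()
        if node in reachable:
            continue
        reachable.add(node)
        stack.extend(node.refs)

    # What is left is unreachable: a real collector would now clear it.
    return [node for node in nodes if node not in reachable]


# a <-> b form an isolated reference cycle; c is referenced from outside
# the set and keeps d alive.
a, b, c, d = Node("a"), Node("b"), Node("c", external_refs=1), Node("d")
a.refs.append(b)
b.refs.append(a)
c.refs.append(d)

unreachable = [node.name for node in find_unreachable([a, b, c, d])]
print(unreachable)  # ['a', 'b']
```

Only the isolated cycle a <-> b is reported: d has no external reference of its own, but it is reachable from the root c.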

GC bugs

See also the Python finalization.

Reference cycles