gh-98253: Break potential reference cycles in external code worsened by typing.py lru_cache by wjakob · Pull Request #98591 · python/cpython

@gvanrossum, did you see the discussion in #98253? When I first reported the issue, the actual source of these refleaks was mysterious. Further along in the discussion thread, the problem was finally tracked down (see post #98253 (comment)).

(Summary: refleaks in extension modules tend to keep typed function signatures alive for eternity. These function signatures reference typing.py internals (LRU caches), which in turn cast a web of references to other parts of the interpreter.)
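To make that chain concrete, here is a minimal sketch of the general mechanism. It is not the actual typing.py code: `cached_factory`, `Payload`, and `leaked` are hypothetical names for illustration, and the only assumption is that a wrapper function closes over a `functools.lru_cache`, which in turn holds strong references to its arguments.

```python
import functools
import gc
import weakref


class Payload:
    """Hypothetical stand-in for an interpreter object that should be freed."""


def cached_factory():
    # A wrapper whose closure holds an lru_cache; the cache stores
    # strong references to every argument and result it has seen.
    @functools.lru_cache(maxsize=None)
    def cached(arg):
        return ("alias", arg)

    def inner(arg):
        return cached(arg)

    return inner


inner = cached_factory()

payload = Payload()
payload_ref = weakref.ref(payload)
inner(payload)  # the cache now holds `payload`

# Simulate a refleak: one object that is never released points at `inner`.
leaked = [inner]

# Drop every other reference. Only the leak still reaches the cache.
del inner, payload
gc.collect()
alive_via_leak = payload_ref() is not None

# Releasing the leaked reference frees the whole chain.
leaked.clear()
gc.collect()
alive_after_release = payload_ref() is not None

print(alive_via_leak, alive_after_release)
```

In this sketch the single leaked reference is enough to keep the wrapper, its cache, and everything stored in the cache alive, which is the amplification effect described above.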

In this way, the caching mechanism in typing.py can turn virtually any refleak in any typed extension module into refleaks elsewhere. Ultimately, it becomes difficult to develop leak-free extensions, because any systematic check for leaks runs into spurious reports. I'm creating a framework for extension modules that tries to be well-behaved in this ecosystem, but its internal tests for leaks are hampered by this behavior. Quick tests show that many of the bigger packages (pandas, pytorch, tensorflow) include some refleaks that are problematic in this context.

The patch in this PR is simple and it breaks this problematic chain of references. Of course, one could say: let's leave typing.py as-is and fix all of the other extension modules. I just don't think it is very practical to do so.