Algorithm for finalizing cycles (Re: [Python-Dev] Garbage collecting closures) (original) (raw)

Tim Peters tim_one@email.msn.com
Tue, 15 Apr 2003 00:02:02 -0400


[Greg Ewing]

... What bothers me, though, is that even with finalizers, the library writer still can't guarantee eventual reclamation. The application can unwittingly stuff it all up by creating cycles, and there's nothing the library writer can do about it.

They're not trying very hard, then -- and, admittedly, most don't. For example, every time the library grabs a resource that needs finalization, it can plant a weakref to it in a singleton private module object with a del method. When the module is torn down at shutdown, that object's del gets called via refcount-falls-from-1-to-0 (it's a private object -- the library author can surely guarantee it isn't in a cycle), and free whichever resources still exist then. The library could instead register a cleanup function via atexit(). Or it could avoid weakrefs by setting up a thread that wakes up every now and again, to scan gc.garbage for instances of the objects it passed out. Finding one, it could finalize the resources held by the object, mark the object as no longer needing resource finalization, and letting the object leak. And so on -- Python supplies lots of ways to get what you want even here.

It seems to me that giving up on finalization altogether in the presence of cycles is too harsh. In most cases, the cycle isn't actually going to make any difference. With a cycle of your abovementioned file-descriptor-holding objects, for example, could be finalized in an arbitrary order, because the finalizers don't depend on any other objects in the cycle.

I expect that's usually so, but that detecting that it's so is intractable. Even if we relied on programmers declaring their beliefs explicitly, Python still has to be paranoid enough to avoid crashing if the stated beliefs aren't really true. For example, if you fight your way through the details of Java's Byzantine finalization scheme, you'll find that the hairiest parts of it exist just to ensure that Java's gc internals never end up dereferencing dangling pointers. This has the added benefit that most experienced Java programmers appear to testify that Java's finalizers are useless .

So maybe there should be some way of classifying the references held by an object into those that are relied upon by its finalizer, and those that aren't.

How? I believe this is beyond realistic automated analysis for Python source.

The algorithm would then be to first go through and clear all the references that aren't needed by finalizers, and then... [assuming there's no problem leads to the conclusion there's no problem ]

You probably need also to detect that the finalizer can't resurrect the object either, else clearing references that aren't needed specifically for finalization would leave the resurrected object in a damaged state.