[Python-Dev] Cyclic GC issues (original) (raw)
Jason Evans jasone at canonware.com
Mon Oct 11 08:59:52 CEST 2004
- Previous message: [Python-Dev] Cyclic GC issues
- Next message: [Python-Dev] Cyclic GC issues
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Sun, Oct 10, 2004 at 11:36:11PM +0200, "Martin v. L?wis" wrote:
>1) I originally had lone rings refer to themselves (obrefcnt started out > at 3; 2 self-references and one reference held by the associated edge).
I would have thought the refcount should have been 4: two self references, plus one from E, plus one from N (maybe I don't understand what make a ring "lone", though).
You understand correctly, except that the R-E-R construct starts out not being attached to any nodes. The "lone" ring term was my attempt to differentiate this:
R
from these:
R-R R---R R-R \ / | | R R-R
> Working around this in Crux requires a lot of extra reference counting > complexity, because there are three different cases for reference > counts, depending on how many members there are in a ring (1, 2, or 3+ > members).
Why is that? Just DECREF every PyObject* in "self", then set the PyObject* to NULL. No need for any additional reference counts.
The low level tree code is implemented completely separately from the Python object wrapper code. This means that, for example, the Python "Node" object does not actually store PyObject* pointers to Ring objects; instead it uses the low level tree code to find out what Ring objects are attached to it. Crux was designed this way in order to be able to implement various low level algorithms such that they never have to call interpreter-related code. As such, reference breaking code must actually tear down the low level tree in such a way that it is always "consistent" (simply setting pointers to NULL doesn't fit well with this way of doing things).
Hmmm. Even if a ring member holds only a single reference to another ring member, would not calling visit() on this member in turn invoke visit() on the other one, which would, in turn, invoke visit() on the next one, and so on?
Yes, this is true. The problem I ran into had to do with the node logically holding references to all the ring objects, but only reporting one of them. That is, the node's tp_traverse function was not reporting all references, though it was reporting enough of them to guarantee that all reachable objects were visited.
At this point, it is clear to me that Python's cyclic GC does not do mark/sweep GC, and that this won't change, so there's probably not much point in discussing this issue further.
>3) A strange thing can happen when tpclear() is called. Suppose that an > edge object is being cleared, and it drops its references to the > associated rings. If obrefcnt of one of the rings drops to 0 as a > result, Python will tpdealloc() the ring right now, without ever > calling tpclear() on the ring. That means that I have to keep track of > whether tpclear() has been called on objects, if it is at all important > that tpclear() be called, so that I can manually do so in tpdealloc(), > if necessary. It is in my opinion reasonable to have cleanup code in > tpclear(), with the assumption that it will be called precisely once, > but Python makes me do extra work to make sure that this happens.
Your expectation is wrong. tpclear is not the place to perform cleanup. It is the place to break cycles. tpdealloc is the place to perform cleanup.
What if breaking cycles and cleanup are the same thing? In the case of Crux, they are one and the same, since the low level tree must be torn down one piece at a time.
Apparently I was mistaken in viewing tp_clear and tp_dealloc as two stages of destruction.
>4) There is no way to control the order in which objects are > tpdealloc()ed.
Correct. Unfortunately, there is no way to change that. For cyclic structures, there is no natural "starting point". Python picks an arbitrary starting point, assuming that the order does not matter. If class objects have an del, it assumes that the order does matter, and gives up - putting the objects into gc.garbage. > This could be addressed in the Python interpreter by paying heed to the > return value of tpdealloc(). If the return value is non-zero, move the > object to the end of the list of objects to be destroyed, so that > destruction is tried later. This allows the module to impose its own > destruction ordering. I don't understand. tpdealloc does not have a return value, so what does "pay heed" mean?
I should have said tp_clear here. In Modules/gcmodule.c:delete_garbage(), the clear() call does not pay attention to the return value. It would be possible to check the return value, and defer deallocation if the return value were non-zero, and ob_refcnt were 1. (However, it would probably have to be a separate list of objects, contrary to what I claimed before.)
Thanks for your comments, Martin. I'm starting to understand the Python GC world view a bit better.
Jason
- Previous message: [Python-Dev] Cyclic GC issues
- Next message: [Python-Dev] Cyclic GC issues
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]