[Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c (original) (raw)
Tim Peters tim.peters at gmail.com
Wed Mar 29 01:13:47 CEST 2006
- Previous message: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c
- Next message: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
[Thomas Wouters]
... The cycle this nested generator creates, which is also involved in the testtee leak, is not cleanable by the cycle-gc, and it looks like it hasn't been since the yield-expr/coroutine patch was included in the trunk.
That could very well be. Adding finalizers to generators could legitimately make some cycles "suddenly" uncollectable. There was a burst of list traffic about this on the 18th and 19th of June, 2005, under subject:
gcmodule issue w/adding __del__ to generator objects
I starred it in gmail at the time (which is why I was able to find it again ;-)), but never had time to pay attention.
I believe the culprit to be this part of that patch:
Index: Modules/gcmodule.c =================================================================== --- Modules/gcmodule.c (revision 39238) +++ Modules/gcmodule.c (revision 39239) @@ -413,10 +413,8 @@ assert(delstr != NULL); return PyInstanceLookup(op, delstr) != NULL; } - else if (PyTypeHasFeature(op->obtype, PyTPFLAGSHEAPTYPE)) + else return op->obtype->tpdel != NULL; - else - return 0; }
Right, that's the patch that taught gc that generators now have finalizers.
The original code can be read in two ways:
As a whole, it was an implementation of:
Only instances of old- or new-style classes can have finalizers. An instance of an old-style class has a finalizer iff it has a method named "__del__". An instance of a new-style class (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE) has a finalizer iff its tp_del is non-NULL.
Because of the obscure gimmicks that try to cater to using old binary extension modules across new Python releases without recompiling, there's no guarantee that the tp_del slot even exists, and therefore we don't try to access tp_del at all unless PyType_HasFeature says the type object was compiled recently enough so that we know tp_del does exist.
When generators grew finalizers, the "Only instances of ... classes can have finalizers" part of #1 became wrong, and I expect that's why Phillip removed the conditional. It was the right thing to do from that point of view.
I don't understand the #2 gimmicks well enough to guess whether it was actually safe to remove all guards before trying to access tp_del (I run on Windows, and compiled extensions can never be reused across Python minor releases on Windows -- if a problem was introduced here, I'll never see it).
Now, reverting that part of the patch in revision 39239
There may be a reason to change that patch (due to #2 above), but generators do have finalizers now and the #1 part must not be reverted (although it may be fruitful to complicate it ;-)).
triggers an assertion failure, but removing the assertion makes it work right;
No, it's not right if has_finalizer(g) returns 0 for all generators g.
the above nested-generator cycle gets cleaned up again. Later in the trunk, the assertion was changed anyway, and it no longer fails if I just revert the gcmodule change. Of course, the reason the cycle is uncleanable is because generators grew a tpdel method, and that throws cycle-gc in a hissy fit;
If by "hissy fit" you mean "gcmodule properly moves generators to the list of objects with finalizers, thereby preventing catastrophes", right, that's an intended hissy fit ;-)
reverting the above patch just makes cycle-gc ignore the tpdel of heaptypes. I don't know enough about the cycle-gc to say whether that's good or bad,
Ignoring all generators' tp_del would be disastrous (opens pure-Python code to segfaults, etc).
but not having generators be cleanable is not very good.
Not disastrous, though.
For what it's worth, reverting the gcmodule patch doesn't seem to break any tests.
I believe that. gc disasters are typically very difficult to provoke --, the first time :-)
However, fixing both those things (itertools.tee lack of GC, and GC's inability to clean generators) *still does not fix the 254 refleaks in testgenerators*. When I backport the itertools.tee-GC changes to r39238 (one checkin before coroutines), testgenerators doesn't leak, neither the r39238 version of testgenerators, nor the trunk version. One revision later, r39239, either testgenerators leaks 15 references. 'Fixing' gcmodule, which fixes the nested-generator leak, does not change the number of leaks. And, as you all may be aware of, in the trunk, testgenerators leaks 254 references; this is also the case with the gcmodule fix. So there's more huntin' afoot.
In the mean time, if people knowledgeable about the cycle-GC care to comment about the gcmodule change wrt. heaptypes, please do.
Did the best I could above, short of explaining exactly how failing to identify an object with a finalizer can lead to disaster. Short course: gc obviously needs to know which objects are and are not trash. If a finalizer is associated with an object in cyclic trash, then trash objects are potentially visible from the finalizer, and therefore can be resurrected by running the finalizer. gc must therefore identify all trash objects reachable from trash objects with finalizers, because running finalizers may make all such objects "not trash" again. Getting any part of that wrong can lead to calling tp_clear on an object that a finalizer actually resurrects, leading to symptoms from "hey, all the attributes on my object suddenly vanished by magic, for no reason at all" to segfaults.
- Previous message: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c
- Next message: [Python-Dev] [Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]