[Python-Dev] PEP 556 threaded garbage collection & linear recursion in gc
Tim Peters tim.peters at gmail.com
Thu Mar 28 02:33:29 EDT 2019
[Gregory P. Smith <greg at krypto.org>]
> Good point, I hadn't considered that it was regular common ref count 0 dealloc chaining.
It pretty much has to be whenever you see a chain of XXX_dealloc routines in a stack trace. gcmodule.c never even looks at a tp_dealloc slot directly, let alone invokes a deallocation method directly. That all happens indirectly, as a result of what Py_DECREF does. Once you're inside the first tp_dealloc method, gc is completely irrelevant: the tp_dealloc for the top-level container does its own Py_DECREF on a contained container, which in turn can do its own Py_DECREF on one of its contained containers, and so on. You can get an arbitrarily deep stack of XXX_dealloc calls that way, and there's really no other way to get one.
BTW, "container" here is used in a very broad C-level sense, not a high-level Python sense: any PyObject that contains a pointer to a PyObject is "a container" in the intended sense.
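To make the chaining concrete, here is a toy model in Python rather than real CPython code. The names `Obj`, `decref`, `dealloc` and `max_dealloc_depth` are invented for illustration; `decref` stands in for Py_DECREF and `dealloc` for a container's tp_dealloc. It only shows that the depth of nested dealloc calls grows with the length of the ownership chain, with gc nowhere in the picture.

```python
class Obj:
    """Toy stand-in for a PyObject that holds a pointer to one other object."""

    def __init__(self, child=None):
        self.refcnt = 1      # exactly one owner: the parent (or the caller)
        self.child = child


class DepthTracker:
    """Records how many toy dealloc calls are active at once."""

    def __init__(self):
        self.depth = 0
        self.max_depth = 0


def decref(obj, tracker):
    # Stand-in for Py_DECREF: hitting zero triggers deallocation immediately.
    obj.refcnt -= 1
    if obj.refcnt == 0:
        dealloc(obj, tracker)


def dealloc(obj, tracker):
    # Stand-in for a container's tp_dealloc: it drops its own reference to
    # the contained object, which may start another dealloc inside this one.
    tracker.depth += 1
    tracker.max_depth = max(tracker.max_depth, tracker.depth)
    child, obj.child = obj.child, None
    if child is not None:
        decref(child, tracker)
    tracker.depth -= 1


def build_chain(length):
    # Built iteratively so that construction itself never recurses.
    head = None
    for _ in range(length):
        head = Obj(head)
    return head


def max_dealloc_depth(length):
    tracker = DepthTracker()
    decref(build_chain(length), tracker)
    return tracker.max_depth


for n in (10, 100, 200):
    print(n, max_dealloc_depth(n))
```

In this model the maximum depth equals the chain length, which is the unbounded growth described above. The chain lengths are kept small only so the toy stays under Python's own recursion limit.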
> The processes unfortunately didn't have faulthandler enabled so there wasn't much info from where in the python code it happened (now fixed).
It's quite possible that the top-level container was Py_DECREF'ed by code in gcmodule.c. But gc gets blamed at first for a lot of stuff that's not actually its fault ;-)
> I'll see if anything looks particularly unusual next time I hear of such a report.
The trashcan mechanism is the one and only hack in the code intended to stop unbounded XXX_dealloc stacks, so that's what needs looking at. Alas, it's hard to work with because it's so very low-level, and there's nothing portable that can be relied on about stack sizes or requirements across platforms or compilers.
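A quick way to watch the trashcan earn its keep, assuming a CPython interpreter (where tuple deallocation goes through the trashcan): build a tuple nested far deeper than any C stack could survive if every level cost one nested dealloc call, then drop the only reference. The helper names below are made up for the sketch, and the depth of 100,000 is an arbitrary choice.

```python
def nested_tuple(depth):
    """Build ((((),),),) nested `depth` levels deep, without recursion."""
    t = ()
    for _ in range(depth):
        t = (t,)
    return t


def measure_depth(t):
    """Count nesting levels iteratively; the caller keeps `t` alive."""
    levels = 0
    node = t
    while node:          # the innermost () is falsy and ends the walk
        node = node[0]
        levels += 1
    return levels


def build_and_drop(depth):
    t = nested_tuple(depth)
    levels = measure_depth(t)
    # The outermost tuple's refcount reaches 0 here, so the whole chain is
    # torn down by refcounting alone; no gc.collect() call is involved.
    del t
    return levels


print(build_and_drop(100_000))
```

Avoid calling repr(), hash() or == on such a structure: those recurse on their own and are not protected by the trashcan.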
Some possibilities:
1. The trashcan code is buggy.
2. The maximum container dealloc stack depth trashcan intends to allow (PyTrash_UNWIND_LEVEL = 50) is too large for the C stack a thread gets under this app on this platform using this compiler.
3. One or more of the specific container types involved in this app's dealloc chain doesn't use the trashcan gimmick at all, so is invisible to trashcan's count of how deep the call stack has gotten.
For example, cell_dealloc was in your stack trace, but I see no use of trashcan code in that function (Py_TRASHCAN_SAFE_BEGIN / Py_TRASHCAN_SAFE_END). So the trashcan hack has no idea that cell_dealloc calls are on the stack.
And likewise for func_dealloc: it looks like calls to that are also invisible to the trashcan.
tupledealloc is cool, though.
IIRC, Christian Tismer introduced the trashcan because code only he wrote ;-) was blowing the stack when very deeply nested lists and/or tuples became trash.
From a quick scan of the current code, looks like it was later added to only a few types that aren't container types in the Python sense.
Which may or may not matter here. Your stack trace showed a tupledealloc in one of every three slots, so even if two of every three slots were invisible to the trashcan, the call stack "should have been" limited to a maximum of about PyTrash_UNWIND_LEVEL * 3 = 150 XXX_dealloc functions. But you saw a stack 1000+ levels deep. So something else that isn't immediately apparent is also going on.
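The arithmetic above can be checked against a small simulation. This is a toy model, not the real macros: `Simulator`, `Obj`, `chain` and `run` are invented names, and the deferral logic is a simplified reading of what Py_TRASHCAN_SAFE_BEGIN / Py_TRASHCAN_SAFE_END do (count nesting, park the object once the limit is reached, drain the parked objects when the outermost guarded dealloc finishes). Only kinds listed in `trashcan_kinds` are visible to the nesting count.

```python
UNWIND_LEVEL = 50  # mirrors PyTrash_UNWIND_LEVEL as quoted in this message


class Obj:
    def __init__(self, kind, child=None):
        self.kind = kind
        self.child = child


class Simulator:
    def __init__(self, trashcan_kinds):
        self.trashcan_kinds = set(trashcan_kinds)
        self.nesting = 0      # what the trashcan believes the depth is
        self.frames = 0       # dealloc calls actually on the stack
        self.max_frames = 0
        self.later = []       # objects parked for deferred destruction
        self.freed = 0

    def dealloc(self, obj):
        self.frames += 1
        self.max_frames = max(self.max_frames, self.frames)
        if obj.kind in self.trashcan_kinds:
            self._guarded(obj)
        else:
            self._release(obj)   # invisible to the nesting count
        self.frames -= 1

    def _release(self, obj):
        child, obj.child = obj.child, None
        self.freed += 1
        if child is not None:
            self.dealloc(child)

    def _guarded(self, obj):
        if self.nesting >= UNWIND_LEVEL:
            self.later.append(obj)   # too deep: park it and unwind
            return
        self.nesting += 1
        self._release(obj)
        self.nesting -= 1
        # Only the outermost guarded dealloc drains the parked objects.
        while self.nesting == 0 and self.later:
            deferred = self.later.pop()
            self.nesting += 1        # keeps nested calls from draining too
            self.dealloc(deferred)
            self.nesting -= 1


def chain(kinds, length):
    """Ownership chain whose i-th object has kind kinds[i % len(kinds)]."""
    head = None
    for i in reversed(range(length)):
        head = Obj(kinds[i % len(kinds)], head)
    return head


def run(kinds, length, trashcan_kinds):
    sim = Simulator(trashcan_kinds)
    sim.dealloc(chain(kinds, length))
    return sim.max_frames, sim.freed


# Every type uses the trashcan: depth stays near UNWIND_LEVEL.
print(run(("tuple",), 3000, {"tuple"}))
# Only one type in three uses it: depth stays near 3 * UNWIND_LEVEL.
print(run(("tuple", "cell", "function"), 3000, {"tuple"}))
```

In this model the two runs peak at 51 and 151 dealloc frames respectively (the extra frame is the call that gets parked), and all 3000 objects are freed either way. So invisible types alone would stretch the bound to roughly 150, not to 1000+.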