Issue 35983: tp_dealloc trashcan shouldn't be called for subclasses (original) (raw)

Created on 2019-02-13 11:02 by jdemeyer, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 11841 merged jdemeyer,2019-02-13 15:52
PR 12607 closed jdemeyer,2019-03-28 16:27
PR 12699 closed jdemeyer,2019-04-05 13:40
PR 12730 closed jdemeyer,2019-04-08 14:56
Messages (19)
msg335405 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-02-13 11:02
When designing an extension type subclassing an existing type, it makes sense to call the tp_dealloc of the base class from the tp_dealloc of the subclass. Now suppose that I'm subclassing "list" which uses the trashcan mechanism. Then it can happen that the tp_dealloc of list puts this object in the trashcan, even though the tp_dealloc of the subclass has already been called. Then the tp_dealloc of the subclass may be called multiple times, which is unsafe (tp_dealloc is expected to be called exactly once). To solve this, the trashcan mechanism should be disabled when tp_dealloc is called from a subclass. Interestingly, subtype_dealloc also solves this in a much more involved way (see the "Q. Why the bizarre (net-zero) manipulation of _PyRuntime.trash_delete_nesting around the trashcan macros?") in Objects/typeobject.c
msg335406 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-13 11:08
Jeroen, could you share your example? I am learning the C-API of Python and this example could be interesting.
msg335440 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-02-13 14:00
The problem is easily reproduced with Cython: cdef class List(list): cdef int deallocated def __dealloc__(self): if self.deallocated: print("Deallocated twice!") self.deallocated = 1 L = None for i in range(10**4): L = List((L,)) del L
msg335453 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-13 15:34
FWIW, subclassing builtin types is a relatively new thing. There are still a number of lingering (older) implementation details throughout CPython that were written assuming no subclassing. I'd guess that this is one of them.
msg335455 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-13 15:42
By "relatively new thing", you mean less than 20 years old? :-)
msg335463 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-13 16:28
On Wed, Feb 13, 2019 at 8:42 AM Antoine Pitrou <report@bugs.python.org> wrote: > Antoine Pitrou <pitrou@free.fr> added the comment: > > By "relatively new thing", you mean less than 20 years old? :-) Yeah, looks like it was in the 2.2 release (Dec 2001) for PEP 253. Anyway, I know of several core devs who are still opposed to subclassing builtin types. :)
msg335476 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-02-13 20:51
NOTE: also OrderedDict currently uses trashcan hacking to work around this problem: /* Call the base tp_dealloc(). Since it too uses the trashcan mechanism, * temporarily decrement trash_delete_nesting to prevent triggering it * and putting the partially deallocated object on the trashcan's * to-be-deleted-later list. */ --tstate->trash_delete_nesting; assert(_tstate->trash_delete_nesting < PyTrash_UNWIND_LEVEL); PyDict_Type.tp_dealloc((PyObject *)self); ++tstate->trash_delete_nesting; So this seems to be a known problem which deserves to be fixed properly.
msg335513 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-02-14 09:06
> Jeroen, could you share your example? I am learning the C-API of Python and this example could be interesting. Either use the Cython code I posted above or run the testcase I added in PR 11841.
msg338976 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-27 17:25
Disabling the trashcan mechanism returns the problem for solving which the trashcan mechanism was introduced. >>> from _testcapi import MyList >>> L = None >>> for i in range(1000000): ... L = MyList((L,)) ... >>> del L Segmentation fault (core dumped) I think we need better way to resolve this issue. For example, setting the object type to the parent type should solve this issue.
msg338981 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-03-27 17:48
Yes of course. When not using the trashcan, stuff crashes. I don't get your point...
msg338982 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-03-27 17:56
To clarify: the purpose of MyList is specifically to check that no double deallocations occur. For this test to make sense, MyList should not use the trashcan itself.
msg339017 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-28 06:55
In your example you can add PyTypeObject *tp = Py_TYPE(self); Py_TYPE(self) = &PyList_Type; if (PyType_GetFlags(tp) & Py_TPFLAGS_HEAPTYPE) { Py_DECREF(tp); } before calling PyList_Type.tp_dealloc(). It is possible to add such code directly in PyList_Type.tp_dealloc and other deallocators that use the trashcan mechanism.
msg339021 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-03-28 08:42
Changing types like that looks like an ugly hack and a recipe for breakage. For example, in list_dealloc(), the following needs the type to be correct: if (numfree < PyList_MAXFREELIST && PyList_CheckExact(op)) free_list[numfree++] = op; else Py_TYPE(op)->tp_free((PyObject *)op); Could you please clarify your opinion: do you think that there's something wrong with PR 11841? And if yes: what's wrong with it? Or are you just giving optional suggestions?
msg339022 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-28 08:55
Yes, I think that there's something wrong with PR 11841. It disables the trashcan mechanism and does not provide other mechanism for solving that problem.
msg339023 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-03-28 09:10
> It disables the trashcan mechanism Yes, it disables the trashcan in some cases. But only when using the trashcan mechanism would probably crash CPython anyway due to a double deallocation. So at the very least, PR 11841 improves things from "crashing whenever the trashcan is used" to "crashing on stack overflows". Do you have a real example where PR 11841 actually makes things *worse*? > and does not provide other mechanism for solving that problem. The recommended thing to do is that the subclass also implements the trashcan. See OrderedDict for an example: both the base class "dict" and the subclass "OrderedDict" use the trashcan.
msg339119 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-03-29 15:02
I created an additional PR 12607 with some more changes to the, in particular to make the old backwards-compatibility trashcan macros safer. This should be seen as part of the same bugfix but I decided to make a new PR because PR 11841 had several positive reviews already.
msg339519 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-04-05 20:35
I realized that there is a nasty interaction between the trashcan and __del__: if you're very close to the trashcan limit and you're calling __del__, then objects that should have been deallocated in __del__ (in particular, an object involving self) might instead end up in the trashcan. So self might be resurrected when it shouldn't be. This is causing test failures in the 2.7 backport due to a different implementation of __del__.
msg339609 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-04-08 10:36
In Python 3, the resurrection issue probably appears too. But it's not so much a problem since __del__ (mapped to tp_finalize) is only called once anyway. So there are no bad consequences if the object is resurrected incorrectly.
msg342080 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-05-10 17:21
New changeset 351c67416ba4451eb3928fa0b2e933c2f25df1a3 by Antoine Pitrou (Jeroen Demeyer) in branch 'master': bpo-35983: skip trashcan for subclasses (GH-11841) https://github.com/python/cpython/commit/351c67416ba4451eb3928fa0b2e933c2f25df1a3
History
Date User Action Args
2022-04-11 14:59:11 admin set github: 80164
2019-05-15 14:34:02 ned.deily set versions: - Python 2.7, Python 3.7
2019-05-10 17:21:33 pitrou set messages: +
2019-04-08 14:56:18 jdemeyer set pull_requests: + <pull%5Frequest12653>
2019-04-08 10:36:43 jdemeyer set messages: +
2019-04-05 20:35:14 jdemeyer set messages: +
2019-04-05 13:40:42 jdemeyer set pull_requests: + <pull%5Frequest12624>
2019-03-29 15:02:37 jdemeyer set messages: +
2019-03-28 16:27:12 jdemeyer set pull_requests: + <pull%5Frequest12546>
2019-03-28 09:10:13 jdemeyer set messages: +
2019-03-28 08:55:47 serhiy.storchaka set messages: +
2019-03-28 08:42:13 jdemeyer set messages: +
2019-03-28 06:55:54 serhiy.storchaka set messages: +
2019-03-27 17:56:18 jdemeyer set messages: +
2019-03-27 17:48:10 jdemeyer set messages: +
2019-03-27 17:25:35 serhiy.storchaka set messages: +
2019-03-27 14:40:46 serhiy.storchaka set nosy: + serhiy.storchaka
2019-02-14 09:06:42 jdemeyer set messages: +
2019-02-13 20:51:51 jdemeyer set messages: +
2019-02-13 16:28:05 eric.snow set messages: +
2019-02-13 15:52:37 jdemeyer set keywords: + patchstage: test needed -> patch reviewpull_requests: + <pull%5Frequest11871>
2019-02-13 15:42:39 pitrou set messages: +
2019-02-13 15:34:50 eric.snow set nosy: + eric.snowmessages: + type: behaviorstage: test needed
2019-02-13 14:00:11 jdemeyer set messages: +
2019-02-13 11:08:00 matrixise set nosy: + matrixisemessages: +
2019-02-13 11:02:31 jdemeyer create