Issue 10241: gc fixes for module m_copy attribute (original) (raw)

Created on 2010-10-30 03:48 by nascheme, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
module_m_copy.txt nascheme,2010-10-30 03:48
module_m_copy2.txt nascheme,2010-11-05 23:02
valgrind.out skrah,2013-10-26 19:05
Messages (22)
msg119957 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2010-10-30 03:48
I'm trying implement some saner module shutdown procedure (similar to issue 812369). One of the many problems I've run into is that the GC doesn't know about the m_copy attribute of modules. I think the attached patch is correct. tp_tranverse exposes m_copy if it is non-NULL. tp_clear calls Py_CLEAR on it.
msg120063 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2010-10-31 15:55
Oops, my patch doesn't work since m_base can be shared by more than one module instance. I guess a different solution would be to cleanup the m_copy references on interpreter shutdown. Somehow they would have to be found though.
msg120553 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2010-11-05 23:02
The attached patch seems better. The copies of module dicts stored in the interpreter state are dereferenced when the interpreter shuts down.
msg194108 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-01 19:54
Ok, the patch doesn't apply cleanly but looks good regardless. (although I wonder whether you really reclaim significant stuff here: most C extension dicts should be mostly references to C methods and types)
msg194109 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-01 20:06
I will still empty modules_by_index rather than Py_CLEAR it, though.
msg194110 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-01 20:07
New changeset 62658d9d8926 by Antoine Pitrou in branch 'default': Issue #10241: Clear extension module dict copies at interpreter shutdown. http://hg.python.org/cpython/rev/62658d9d8926
msg194120 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-01 21:23
Reopening. This seems to produce segfaults on several of the buildbots, e.g.: http://buildbot.python.org/all/builders/x86%20XP-4%203.x/builds/8947 http://buildbot.python.org/all/builders/x86%20Windows%20Server%202003%20%5BSB%5D%203.x/builds/1257 http://buildbot.python.org/all/builders/x86%20Gentoo%203.x/builds/4671
msg194127 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-01 22:32
I managed to get a core dump locally. Interestingly, it may be that the _tk module is the culprit. The backtrace is the following: #0 0x00007ff6c5e14aab in raise () from /lib/x86_64-linux-gnu/libpthread.so.0 #1 0x000000000044e57e in faulthandler_fatal_error (signum=11) at ./Modules/faulthandler.c:305 #2 #3 0x00000000004c085e in _Py_Dealloc (op=<unknown at remote 0x7ff6bca2ac20>) at Objects/object.c:1873 #4 0x00000000004a8085 in free_keys_object (keys=0x3100ab0) at Objects/dictobject.c:378 #5 0x00000000004ab0e8 in dict_dealloc (mp=0x7ff6be6194b8) at Objects/dictobject.c:1400 #6 0x00000000004c0860 in _Py_Dealloc ( op={'children': <unknown at remote 0x7ff6bc9ab3d8>, 'master': None, 'tk': <unknown at remote 0x7ff6bca2ac20>, '_tkloaded': 1, '_tclCommands': None}) at Objects/object.c:1873 #7 0x00000000004d7ac7 in subtype_dealloc ( self=<Tk(children=<unknown at remote 0x7ff6bc9ab3d8>, master=None, tk=<unknown at remote 0x7ff6bca2ac20>, _tkloaded=1, _tclCommands=None) at remote 0x7ff6bc9a7dc0>) at Objects/typeobject.c:1049 #8 0x00000000004c0860 in _Py_Dealloc ( op=<Tk(children=<unknown at remote 0x7ff6bc9ab3d8>, master=None, tk=<unknown at remote 0x7ff6bca2ac20>, _tkloaded=1, _tclCommands=None) at remote 0x7ff6bc9a7dc0>) at Objects/object.c:1873 #9 0x00000000004a8085 in free_keys_object (keys=0x3100e40) at Objects/dictobject.c:378 #10 0x00000000004aac8f in PyDict_Clear (op={}) at Objects/dictobject.c:1291 #11 0x00000000004dd9c8 in type_clear (type=0x14f45b8) at Objects/typeobject.c:2981 #12 0x000000000043a703 in delete_garbage (collectable=0x7fffab364300, old=0x8de7c0 <generations+64>) at Modules/gcmodule.c:854 #13 0x000000000043ac9d in collect (generation=2, n_collected=0x0, n_uncollectable=0x0, nofail=1) at Modules/gcmodule.c:1020 #14 0x000000000043bfa3 in _PyGC_CollectNoFail () at Modules/gcmodule.c:1619 #15 0x00000000005c6cb6 in PyImport_Cleanup () at Python/import.c:420 #16 0x000000000041d992 in Py_Finalize () at Python/pythonrun.c:552 #17 0x000000000042392b in Py_Exit (sts=0) at Python/pythonrun.c:2474 #18 0x000000000042114b in handle_system_exit () at Python/pythonrun.c:1673 #19 0x0000000000421176 in PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:1683 #20 0x0000000000420ddd in PyErr_Print () at Python/pythonrun.c:1590 #21 0x0000000000437fc4 in RunModule (modname=0xf4d570 L"test", set_argv0=1) at Modules/main.c:227 #22 0x0000000000439279 in Py_Main (argc=18, argv=0xf4c020) at Modules/main.c:715 #23 0x00000000004199b5 in main (argc=18, argv=0x7fffab364908) at ./Modules/python.c:69
msg194131 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-01 22:50
_tkinter uses PyType_FromSpec() to create its types and then adds them to the module dict without increfing them, so there may be a refcount issue explaining that those types get garbage collected before all their instances die.
msg194194 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-02 18:40
New changeset 314a872f54e1 by Antoine Pitrou in branch 'default': Backout 62658d9d8926 (issue #10241): it causes a crash at shutdown when deallocating a Tkapp object. http://hg.python.org/cpython/rev/314a872f54e1
msg194195 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-02 18:42
I backed out the changeset in time for the Alpha 1 release. Further investigation should perhaps be made on the tkinter issue, but it's a bit cumbersome: you need a full test run in non-parallel mode to exhibit the segfault.
msg194844 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-10 22:31
New changeset 1edff836c954 by Antoine Pitrou in branch 'default': Issue #10241: Clear extension module dict copies at interpreter shutdown. http://hg.python.org/cpython/rev/1edff836c954
msg194845 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-10 22:31
Re-applied patch after having fixed the tkinter issue.
msg198609 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-09-29 17:24
I'm closing this issue as fixed though I'm not entirely sure the fix is right. The alpha cycle will allow us to get feedback on potential incompatibilities with third-party software.
msg201390 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-26 19:05
I'm getting a couple of Valgrind leaks, starting with 1edff836c954: valgrind --suppressions=Misc/valgrind-python.supp --leak-check=full ./python -c "pass" All of them seem to be in _PySys_Init().
msg201391 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-26 19:44
Can you identify which objects those leaks point to? The sys module is cleared at shutdown.
msg201452 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-27 12:25
One example of a leaked object is "hexversion", from sysmodule.c:1615. I tried adding it to sys_deletes[] in import.c, but it did not help. See valgrind.out for other leaks.
msg201464 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-27 15:32
IMO we have two references to many newly created values in _PySys_Init(): SET_SYS_FROM_STRING("hexversion", PyLong_FromLong(PY_VERSION_HEX)); One from PyLong_FromLong() and the other from PyDict_SetItemString() in the macro. So it may well be that the fix for this issue just exposes the fact that the last reference is no longer reachable if m_copy is cleared.
msg201468 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-27 15:55
See also 31796b188bec.
msg201469 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-27 15:57
Does pulling the Py_DECREF before the "if" fix the leaks?
msg201470 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-27 16:00
Antoine Pitrou <report@bugs.python.org> wrote: > Does pulling the Py_DECREF before the "if" fix the leaks? Yes, but I get an additional failure in test_capi. Apparently not all values require the DECREF. Got to look on a case by case basis.
msg201476 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-27 16:36
Valgrind is happy after the recent 5eb00460e6e8.
History
Date User Action Args
2022-04-11 14:57:08 admin set github: 54450
2013-10-27 16:36:30 skrah set messages: +
2013-10-27 16:00:38 skrah set messages: +
2013-10-27 15:57:28 pitrou set messages: +
2013-10-27 15:55:36 skrah set messages: +
2013-10-27 15:32:43 skrah set messages: +
2013-10-27 12:25:12 skrah set messages: +
2013-10-26 19:44:27 pitrou set messages: +
2013-10-26 19:05:53 skrah set files: + valgrind.outnosy: + skrahmessages: +
2013-09-29 17:24:18 pitrou set status: open -> closedresolution: fixedmessages: + stage: resolved
2013-08-10 22:31:51 pitrou set messages: +
2013-08-10 22:31:14 python-dev set messages: +
2013-08-02 18:42:08 pitrou set messages: + stage: resolved -> (no value)
2013-08-02 18:40:41 python-dev set messages: +
2013-08-01 22:50:59 pitrou set messages: +
2013-08-01 22:32:15 pitrou set messages: +
2013-08-01 21:58:10 Arfrever set nosy: + Arfrever
2013-08-01 21:23:29 pitrou set status: closed -> openresolution: fixed -> (no value)messages: +
2013-08-01 20:07:49 pitrou set status: open -> closedresolution: fixedstage: resolved
2013-08-01 20:07:15 python-dev set nosy: + python-devmessages: +
2013-08-01 20:06:02 pitrou set messages: +
2013-08-01 19:54:02 pitrou set versions: + Python 3.4nosy: + pitroumessages: + assignee: loewis -> type: behavior -> resource usage
2010-11-05 23:02:44 nascheme set files: + module_m_copy2.txtmessages: +
2010-10-31 15:55:51 nascheme set messages: +
2010-10-30 03:48:49 nascheme create