Issue 38962: Reference leaks in subinterpreters (original) (raw)

Created on 2019-12-03 19:55 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 17453 merged pablogsal,2019-12-03 20:58
PR 17454 merged pablogsal,2019-12-03 21:22
PR 17456 closed pablogsal,2019-12-03 21:59
PR 17457 merged pablogsal,2019-12-03 22:12
Messages (16)
msg357759 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 19:55
All the refleak build bots for master are reporting reference leaks in subinterpreter related tests: https://buildbot.python.org/all/#/builders/126/builds/6/steps/5/logs/stdio https://buildbot.python.org/all/#/builders/144/builds/6/steps/5/logs/stdio https://buildbot.python.org/all/#/builders/206/builds/6 https://buildbot.python.org/all/#/builders/157/builds/6/steps/4/logs/stdio ...... test_atexit leaked [882, 882, 882] references, sum=2646 test_atexit leaked [12, 12, 12] memory blocks, sum=36 5 tests failed again: test__xxsubinterpreters test_atexit test_capi test_httpservers test_threading == Tests result: FAILURE then FAILURE == 401 tests OK. 10 slowest tests: - test_asyncio: 33 min 34 sec - test_concurrent_futures: 17 min 22 sec - test_multiprocessing_spawn: 17 min 6 sec - test_zipfile: 9 min 25 sec - test_multiprocessing_forkserver: 9 min 2 sec - test_multiprocessing_fork: 8 min 43 sec - test_largefile: 7 min 32 sec - test_lib2to3: 7 min 3 sec - test_mailbox: 6 min 27 sec - test_argparse: 5 min 5 sec 5 tests failed: test__xxsubinterpreters test_atexit test_capi test_httpservers test_threading 14 tests skipped: test_devpoll test_gdb test_ioctl test_kqueue test_msilib test_ossaudiodev test_startfile test_tix test_tk test_ttk_guionly test_winconsoleio test_winreg test_winsound test_zipfile64 7 re-run tests: test__xxsubinterpreters test_atexit test_capi test_httpservers test_nntplib test_pty test_threading Bisecting shows the following commit as the culprit: ef5aa9af7c7e493402ac62009e4400aed7c3d54e is the first bad commit commit ef5aa9af7c7e493402ac62009e4400aed7c3d54e Author: Victor Stinner <vstinner@python.org> Date: Wed Nov 20 00:38:03 2019 +0100 bpo-38858: Reorganize pycore_init_types() (GH-17265) * Call _PyLong_Init() and _PyExc_Init() earlier * new_interpreter() reuses pycore_init_types() Python/pylifecycle.c | 31 +++++++++++-------------------- 1 file changed, 11 insertions(+), 20 deletions(-) bisect run success Running * test.test_atexit.SubinterpreterTest.test_callbacks_leak is enough for reproducing the problem.
msg357760 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 20:47
Actually, that commit has an independent refleak that was fixed (!) regarding the small integer cache. Seems that the actual commit that made the new leak is: commit 2582d46fbcf7bdf86b9cf4016850b8d155267ac6 Author: Victor Stinner <vstinner@python.org> Date: Fri Nov 22 19:24:49 2019 +0100 bpo-38858: new_interpreter() reuses pycore_init_builtins() (GH-17351) new_interpreter() now calls _PyBuiltin_Init() to create the builtins module and calls _PyImport_FixupBuiltin(), rather than using _PyImport_FindBuiltin(tstate, "builtins"). pycore_init_builtins() is now responsible to initialize intepr->builtins_copy: inline _PyImport_Init() and remove this function. Reverting this commit fixes the refleaks in atexit at least
msg357761 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 20:59
Ok, the problem seems to be that _PyBuiltin_Init() returns a new reference that is not cleared once PyModule_GetDict() and _PyBuiltins_AddExceptions() is called.
msg357762 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 21:02
Indeed, that fixes the problem in at least 2 modules: ./python -m test test_atexit -R : 0:00:00 load avg: 0.84 Run tests sequentially 0:00:00 load avg: 0.84 [1/1] test_atexit beginning 9 repetitions 123456789 ......... == Tests result: SUCCESS == 1 test OK. Total duration: 1.5 sec Tests result: SUCCESS ./python -m test test_capi -R : 0:00:00 load avg: 1.19 Run tests sequentially 0:00:00 load avg: 1.19 [1/1] test_capi beginning 9 repetitions 123456789 .......... test_capi passed in 2 min 8 sec == Tests result: SUCCESS == 1 test OK. Total duration: 2 min 8 sec Tests result: SUCCESS Although the rest of the tests still leak, so there is something else :(
msg357763 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 21:22
Opened https://github.com/python/cpython/pull/17454 for the fixing the leak in http_servers
msg357764 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 21:32
For the subinterpreter other leaks it seems that the commit that introduced the leak is: 7247407c35330f3f6292f1d40606b7ba6afd5700 is the first bad commit commit 7247407c35330f3f6292f1d40606b7ba6afd5700 Author: Victor Stinner <vstinner@python.org> Date: Wed Nov 20 12:25:50 2019 +0100 bpo-36854: Move _PyRuntimeState.gc to PyInterpreterState (GH-17287) * Rename _PyGC_InitializeRuntime() to _PyGC_InitState() * finalize_interp_clear() now also calls _PyGC_Fini() in subinterpreters (clear the GC state). Include/internal/pycore_object.h | 2 +- Include/internal/pycore_pymem.h 2 +- Include/internal/pycore_pystate.h 7 +-- .../2019-11-20-12-01-37.bpo-36854.Zga_md.rst 3 + Modules/gcmodule.c 72 ++++++++++++---------- Objects/object.c 20 +++--- Python/pylifecycle.c 3 +- Python/pystate.c 2 +- 8 files changed, 62 insertions(+), 49 deletions(-) create mode 100644 Misc/NEWS.d/next/Core and Builtins/2019-11-20-12-01-37.bpo-36854.Zga_md.rst bisect run success
msg357765 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 21:34
Indeed, reverting commit 7247407c35330f3f6292f1d40606b7ba6afd5700 fixes the issues in test_threading
msg357766 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 21:46
I created https://github.com/python/cpython/pull/17455 to revert commit 7247407c35330f3f6292f1d40606b7ba6afd5700 as it seems that the leak is more complex than the others. If a solution is not found in a couple of days we need to merge it to fix the refleak buildbots.
msg357767 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 22:02
This patch (which is wrong) fixes the reference issues: diff --git a/Python/pylifecycle.c b/Python/pylifecycle.c index cce4783bc1..c354af18db 100644 --- a/Python/pylifecycle.c +++ b/Python/pylifecycle.c @@ -1276,6 +1276,7 @@ finalize_interp_clear(PyThreadState *tstate) _PyExc_Fini(); } + PyGC_Collect(); _PyGC_Fini(tstate); } This suggests that the per-interpreter gc does not play well with the refleak checker.
msg357768 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-03 22:15
Opened PR 17457 for the refleaks of test__xxsubinterpreters although I am not very convinced that we can call the GC at that point.
msg357784 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-04 09:29
New changeset 24f5cac7254177a4c9956d680c0a9b6dadd85c6f by Victor Stinner (Pablo Galindo) in branch 'master': bpo-38962: Fix reference leak in test_httpservers (GH-17454) https://github.com/python/cpython/commit/24f5cac7254177a4c9956d680c0a9b6dadd85c6f
msg357788 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-04 09:56
""" 5 tests failed again: test__xxsubinterpreters test_atexit test_capi test_httpservers test_threading """ I merged Pablo's fix for test_httpservers. With PR 17457 + PR 17453 (and the test_httpservers fix), the 5 tests don't leak anymore. I am fine with PR 17457 and PR 17453, I added some comments. I would prefer to avoid a revert (PR 17455) if possible ;-) -- Except of PR 17453 which was my fault in my refactoring work, the problem is the work on subinterpreters doesn't introduce regressions but only reveal existing issues. Example with Python compiled in the debug mode (master branch): $ ./python -X showrefcount -c pass [18564 refs, 6496 blocks] Python "leaks" 18564 references at exit. My work is to better isolate subinterpreters from the main interpreter, and so the missing clean up code is revealed by reference leak buildbot checks.
msg357793 - (view) Author: miss-islington (miss-islington) Date: 2019-12-04 11:20
New changeset b96c6b0723b889d3a0c1740bce7f579f33d246f2 by Miss Islington (bot) (Pablo Galindo) in branch 'master': bpo-38962: Fix reference leak in new_interpreter() (GH-17453) https://github.com/python/cpython/commit/b96c6b0723b889d3a0c1740bce7f579f33d246f2
msg357800 - (view) Author: miss-islington (miss-islington) Date: 2019-12-04 11:51
New changeset ac0e1c2694bc199dbd073312145e3c09bee52cc4 by Miss Islington (bot) (Pablo Galindo) in branch 'master': bpo-38962: Fix reference leak in the per-subinterpreter gc (GH-17457) https://github.com/python/cpython/commit/ac0e1c2694bc199dbd073312145e3c09bee52cc4
msg357802 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-04 12:12
The 5 tests don't leak anymore, I close thee issue. Thanks Pablo! $ ./python -m test -R 3:3 -j0 test__xxsubinterpreters test_atexit test_capi test_httpservers test_threading (...) Total duration: 1 min 2 sec Tests result: SUCCESS
msg358354 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-12-13 21:17
Thanks for all the work on this!
History
Date User Action Args
2022-04-11 14:59:23 admin set github: 83143
2019-12-13 21:17:17 eric.snow set nosy: + eric.snowmessages: +
2019-12-04 12:12:28 vstinner set status: open -> closedresolution: fixedmessages: + stage: patch review -> resolved
2019-12-04 11:51:08 miss-islington set messages: +
2019-12-04 11:20:08 miss-islington set nosy: + miss-islingtonmessages: +
2019-12-04 09:56:02 vstinner set messages: +
2019-12-04 09:29:17 vstinner set messages: +
2019-12-03 22:15:35 pablogsal set messages: +
2019-12-03 22:12:57 pablogsal set pull_requests: + <pull%5Frequest16938>
2019-12-03 22:02:04 pablogsal set messages: +
2019-12-03 21:59:58 pablogsal set pull_requests: + <pull%5Frequest16937>
2019-12-03 21:46:40 pablogsal set messages: +
2019-12-03 21:34:19 pablogsal set messages: +
2019-12-03 21:32:23 pablogsal set messages: +
2019-12-03 21:22:56 pablogsal set messages: +
2019-12-03 21:22:17 pablogsal set stage: needs patch -> patch reviewpull_requests: + <pull%5Frequest16935>
2019-12-03 21:02:51 pablogsal set messages: +
2019-12-03 20:59:09 pablogsal set messages: + stage: patch review -> needs patch
2019-12-03 20:58:02 pablogsal set keywords: + patchstage: needs patch -> patch reviewpull_requests: + <pull%5Frequest16934>
2019-12-03 20:47:39 pablogsal set messages: +
2019-12-03 19:55:50 pablogsal create