Issue 33997: multiprocessing Pool hangs in terminate()

Created on 2018-06-29 12:37 by 3mb3dw0rk5, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
debug.trace 3mb3dw0rk5, 2018-06-29 18:27
multiprocessing_hangs.py remi.lapeyre, 2019-08-05 14:39
test_multiprocessing.py remi.lapeyre, 2019-08-05 14:41
pool_terminate_snoop.txt alexmojaki, 2020-01-27 13:11
Pull Requests
URL Status Linked
PR 8009 open python-dev, 2018-06-29 12:40
Messages (6)
msg320713 - Author: Erik Wolf (3mb3dw0rk5) * Date: 2018-06-29 12:37
The terminate() method of multiprocessing.Pool hangs sporadically. I tracked the issue down to _handle_results() hanging during the outqueue cleanup: poll() returned True, but the subsequent get() blocks forever and never returns any data.
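As a generic illustration of the poll()/get() pattern described above, the following sketch shows why a True result from poll() only describes the state at the moment of the call. It uses a plain `multiprocessing.Pipe` in a single process, and the "second consumer" is simulated by an extra `recv()`; both are assumptions made for illustration. It does not claim to reproduce the hang reported here or to identify its cause.

```python
from multiprocessing import Pipe

# One-way pipe: `reader` receives whatever `writer` sends.
reader, writer = Pipe(duplex=False)
writer.send("result")

# poll() reports that a message is pending right now.
ready_before = reader.poll(1.0)

# Stand-in for some other consumer draining the message in between
# (hypothetical; simulated here by receiving it ourselves).
first = reader.recv()

# The earlier "ready" answer is now stale: nothing is pending, so a
# plain recv() at this point would block until new data arrives.
ready_after = reader.poll(0)

print(ready_before, first, ready_after)

reader.close()
writer.close()
```

Any check-then-receive sequence has this gap when more than one party can read from the same connection, which is one reason a blocking receive after a successful poll() can still wait indefinitely.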
msg320726 - Author: Erik Wolf (3mb3dw0rk5) * Date: 2018-06-29 18:27
To give an idea of the race condition, I created a trace with several additional debug outputs in pool.py and connection.py. The line "======= Creating new pool =======" marks each new start of a pool using the Pool.imap() function. The iteration is terminated at some point, and a new pool is created with a new call to imap(). The last log entry shows the location of the final call, which hangs indefinitely. I am not a multithreading/WinAPI expert, so I could not fix the actual issue with _winapi.WaitForMultipleObjects, but the fix in the PR ignores the preceding issues when polling for more data than is actually needed for the two sentinels. Remark: I have also seen this issue on Linux with the same application, but have not been able to debug it so far.
msg349051 - Author: Rémi Lapeyre (remi.lapeyre) * Date: 2019-08-05 14:39
Hi, I got bitten by this bug last week. I wrote an example that reproduces the basic idea of our program's main loop, and it hangs:
- around 20% of the time with a release build of Python 3.7.4
- around 6% of the time with a debug build of Python 3.7, 3.8 and 3.9
With some of our inputs it hangs nearly all the time, but I cannot post them here. I tested PR 8009 and it solves the issue. It seems to me that it is an appropriate fix.
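Measuring how often a call hangs, as in the percentages above, needs some way to give up on a call that never returns. The following is a minimal sketch of one such watchdog, not the code from the attached scripts: the helper name `finishes_within` and its parameters are hypothetical, and it only uses `threading` from the standard library.

```python
import threading
import time


def finishes_within(func, timeout):
    """Run func() in a daemon thread.

    Returns True if func() returned (or raised) within `timeout` seconds,
    False if it was still running when the timeout expired.
    """
    done = threading.Event()

    def runner():
        try:
            func()
        finally:
            # Signal completion even if func() raised.
            done.set()

    # Daemon thread: a call that never returns will not keep the
    # interpreter alive at exit.
    threading.Thread(target=runner, daemon=True).start()
    return done.wait(timeout)


# Stand-ins for a call that returns promptly and one that takes too long.
quick = finishes_within(lambda: None, timeout=1.0)
slow = finishes_within(lambda: time.sleep(0.5), timeout=0.05)
print(quick, slow)
```

In a reproduction loop, one could pass a pool's `terminate` method as `func` and count how often the result is False. The stuck thread is simply abandoned, so this detects a hang but does not recover from it.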
msg349052 - Author: Rémi Lapeyre (remi.lapeyre) * Date: 2019-08-05 14:41
Removed Python 3.6, as it now only receives security fixes.
msg360761 - Author: Alex Hall (alexmojaki) Date: 2020-01-27 13:11
I'm also experiencing hangs on terminate. I haven't made a debug build or anything, but it happens consistently for me on 3.8, although I haven't managed to create a small example to reproduce it. Replacing pool.py with https://raw.githubusercontent.com/python/cpython/5f6a05bf5b3f7e3c1d805b3bbd8c5ad18f26d933/Lib/multiprocessing/pool.py (from the PR) did not help, so maybe what I'm experiencing is unrelated. It gets stuck on `inqueue._rlock.acquire()` in `Pool._help_stuff_finish`. I've attached debugging info from snoop; maybe that will help.
msg360762 - Author: Alex Hall (alexmojaki) Date: 2020-01-27 13:24
Sorry, I should have looked around more. I think my problem is https://bugs.python.org/issue22393
History
Date User Action Args
2022-04-11 14:59:02 admin set github: 78178
2020-09-01 09:45:02 vstinner set nosy: - vstinner
2020-08-31 15:09:40 paul.madden set nosy: + paul.madden
2020-01-27 13:24:55 alexmojaki set messages: +
2020-01-27 13:11:56 alexmojaki set files: + pool_terminate_snoop.txt; nosy: + alexmojaki; messages: +
2019-08-05 14:41:34 remi.lapeyre set components: + Library (Lib), - Windows
2019-08-05 14:41:20 remi.lapeyre set files: + test_multiprocessing.py; messages: +; versions: + Python 3.8, Python 3.9, - Python 3.6
2019-08-05 14:39:45 remi.lapeyre set files: + multiprocessing_hangs.py; messages: +
2019-08-04 10:49:09 remi.lapeyre set nosy: + remi.lapeyre
2018-06-30 14:08:33 steve.dower set nosy: + pitrou, davin
2018-06-29 18:27:44 3mb3dw0rk5 set files: + debug.trace; messages: +
2018-06-29 12:50:01 vstinner set nosy: + vstinner
2018-06-29 12:40:09 python-dev set keywords: + patch; stage: patch review; pull_requests: + pull_request7614
2018-06-29 12:37:53 3mb3dw0rk5 create