Issue 33997: multiprocessing Pool hangs in terminate()

Created on 2018-06-29 12:37 by 3mb3dw0rk5, last changed 2022-04-11 14:59 by admin.

Files
File name	Uploaded	Description
debug.trace	3mb3dw0rk5,2018-06-29 18:27
multiprocessing_hangs.py	remi.lapeyre,2019-08-05 14:39
test_multiprocessing.py	remi.lapeyre,2019-08-05 14:41
pool_terminate_snoop.txt	alexmojaki,2020-01-27 13:11

Pull Requests
URL	Status	Linked
PR 8009	open	python-dev,2018-06-29 12:40

Messages (6)
msg320713	Author: Erik Wolf (3mb3dw0rk5) *	Date: 2018-06-29 12:37
The terminate() method of multiprocessing.Pool hangs sporadically. I could track this issue down to the fact that _handle_results() hangs in the outqueue-cleanup. poll() returned True but get() actually hangs endlessly never returning any data.
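For readers unfamiliar with the cleanup step mentioned above, here is a minimal sketch of a poll-then-read drain loop. It is not the code from pool.py or from the PR: the helper name `drain`, the `max_items` bound and the `timeout` value are assumptions made for illustration, and it uses a plain `multiprocessing.Pipe` in a single process, so it does not reproduce the hang. It only shows the shape of the pattern in which `poll()` is consulted before each blocking read.

```python
from multiprocessing import Pipe


def drain(reader, max_items=10, timeout=0.1):
    """Read up to max_items pending objects from a Connection.

    Hypothetical helper for illustration only. poll() is called with a
    timeout before every recv(), and the loop is bounded, so an empty
    pipe ends the loop instead of blocking.
    """
    items = []
    try:
        for _ in range(max_items):
            # poll(timeout) returns False if no data arrives in time.
            if not reader.poll(timeout):
                break
            # recv() blocks until a complete object has been read.
            items.append(reader.recv())
    except (OSError, EOFError):
        # The other end was closed or the handle is no longer valid.
        pass
    return items


if __name__ == "__main__":
    # One-way pipe: the first connection receives, the second sends.
    reader, writer = Pipe(duplex=False)
    for value in ("task-result", None, None):
        writer.send(value)
    print(drain(reader))  # ['task-result', None, None]
```

Note that a True result from `poll()` only says that some data was readable at that moment. The sketch therefore cannot rule out a blocking `recv()` in situations like the one reported here.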
msg320726	Author: Erik Wolf (3mb3dw0rk5) *	Date: 2018-06-29 18:27
To help get an idea of the race condition, I created a trace with several additional debug outputs in pool.py and connection.py. "======= Creating new pool =======" marks a new start of a pool using the Pool.imap() function. The iteration is terminated at some point and a new pool is created with a new call to imap(). The last log entry shows the location of the last call, which currently hangs endlessly. I am not a multithreading/WinAPI expert, so I could not fix the actual issue with _winapi.WaitForMultipleObjects, but the fix in the PR actually ignores preceding issues when polling more data than actually needed for the two sentinels. Remark: I have also seen this issue on Linux with the same application but have not been able to debug it so far.
msg349051	Author: Rémi Lapeyre (remi.lapeyre) *	Date: 2019-08-05 14:39
Hi, I got bitten by this bug last week. I wrote an example that reproduces the basic idea of our program's main loop, and it hangs:
- around 20% of the time with a release build of Python 3.7.4
- around 6% of the time with a debug build of Python 3.7, 3.8 and 3.9
With some of our inputs, it hangs nearly all the time, but I cannot post them here. I tested PR 8009 and it solves the issue. It seems to me that it is an appropriate fix for this.
msg349052	Author: Rémi Lapeyre (remi.lapeyre) *	Date: 2019-08-05 14:41
Removed Python 3.6 as it only receives security fixes now.
msg360761	Author: Alex Hall (alexmojaki)	Date: 2020-01-27 13:11
I'm also experiencing hanging on terminate. I haven't made a debug build or anything but it's happening to me consistently on 3.8, although I haven't managed to create a small example to reproduce. Replacing pool.py with https://raw.githubusercontent.com/python/cpython/5f6a05bf5b3f7e3c1d805b3bbd8c5ad18f26d933/Lib/multiprocessing/pool.py (from the PR) did not help. So maybe what I'm experiencing is unrelated. It gets stuck on `inqueue._rlock.acquire()` in `Pool._help_stuff_finish`. I've attached debugging info from snoop, maybe that will help.
msg360762	Author: Alex Hall (alexmojaki)	Date: 2020-01-27 13:24
Sorry, I should have looked around more. I think my problem is https://bugs.python.org/issue22393

History
Date	User	Action	Args
2022-04-11 14:59:02	admin	set	github: 78178
2020-09-01 09:45:02	vstinner	set	nosy: - vstinner
2020-08-31 15:09:40	paul.madden	set	nosy: + paul.madden
2020-01-27 13:24:55	alexmojaki	set	messages: +
2020-01-27 13:11:56	alexmojaki	set	files: + pool_terminate_snoop.txt; nosy: + alexmojaki; messages: +
2019-08-05 14:41:34	remi.lapeyre	set	components: + Library (Lib), - Windows
2019-08-05 14:41:20	remi.lapeyre	set	files: + test_multiprocessing.py; messages: +; versions: + Python 3.8, Python 3.9, - Python 3.6
2019-08-05 14:39:45	remi.lapeyre	set	files: + multiprocessing_hangs.py; messages: +
2019-08-04 10:49:09	remi.lapeyre	set	nosy: + remi.lapeyre
2018-06-30 14:08:33	steve.dower	set	nosy: + pitrou, davin
2018-06-29 18:27:44	3mb3dw0rk5	set	files: + debug.trace; messages: +
2018-06-29 12:50:01	vstinner	set	nosy: + vstinner
2018-06-29 12:40:09	python-dev	set	keywords: + patch; stage: patch review; pull_requests: + pull_request7614
2018-06-29 12:37:53	3mb3dw0rk5	create