Issue 35493: multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exit (original) (raw)

Created on 2018-12-14 11:27 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 11488 merged pablogsal,2019-01-09 23:40
PR 11488 merged pablogsal,2019-01-09 23:40
PR 11488 merged pablogsal,2019-01-09 23:40
Messages (12)
msg331797 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-14 11:27
Currently, multiprocessing.Pool._worker_handler() checks every 100 ms if a worker exited using time.sleep(0.1). It causes a latency if worker exit frequently and the pool has to execute a large number of tasks. Worst case: --- import multiprocessing import time CONCURRENCY = 1 NTASK = 100 def noop(): pass with multiprocessing.Pool(CONCURRENCY, maxtasksperchild=1) as pool: start_time = time.monotonic() results = [pool.apply_async(noop, ()) for _ in range(NTASK)] for result in results: result.get() dt = time.monotonic() - start_time pool.terminate() pool.join() print("Total: %.1f sec" % dt) --- Output: --- Total: 10.2 sec --- The worst case is a pool of 1 process, each worker only executes a single task and the task does nothing (minimize task execution time): the latency is 100 ms per task, which means 10 seconds for 100 tasks. Using SIGCHLD signal to be notified when a worker completes would allow to avoid polling: reduce the latency and reduce CPU usage (the thread doesn't have to be awaken every 100 ms anymore).
msg331799 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-14 11:33
asyncio uses SIGCHLD signal to be notified when a child process completes. SafeChildWatcher calls os.waitpid(pid, os.WNOHANG) on each child process, whereas FastChildWatcher() uses os.waitpid(-1, os.WNOHANG).
msg331801 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-14 11:33
See also bpo-35479: multiprocessing.Pool.join() always takes at least 100 ms.
msg331803 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-14 11:39
How do you use SIGCHLD on Windows? There is actually a portable (and robust) solution: use Process.sentinel https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process.sentinel There is another issue: Pool is currently subclassed by ThreadPool. You'll probably have to make the two implementations diverge a bit.
msg331804 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-14 11:48
> How do you use SIGCHLD on Windows? I'm only proposing to use a signal when it's available, on UNIX. So have multiple implementations of the function, depending on the ability to get notified on completion without polling. On Windows, maybe we could use a dedicated thread to set an event once WaitForSingleObject/WaitForMultipleObjects completes? The design of my bpo-35479 change is to replace polling with one or multiple events. Maybe we can use an event to wakeup _worker_handler() when something happens, but have different wants to signal this event. I have to investigate how Process.sentinel can be used here. I might be interesting to use asyncio internally, but I'm not sure if it's possible ;-)
msg331807 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-14 12:05
Using asyncio internally would be an interesting long-term goal, at least for the process pool version. Perhaps a first step is to find out how to await a multiprocessing Connection or Queue, or make async versions of these classes.
msg331810 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-14 12:08
> I have to investigate how Process.sentinel can be used here. Look how concurrent.futures uses it: https://github.com/python/cpython/blob/master/Lib/concurrent/futures/process.py#L348 This also means: 1) we could redirect people to ProcessPoolExecutor instead of trying to backport all its features into multiprocessing.Pool 2) we could try to refactor the ProcessPoolExecutor implementation into a common backend for both ProcessPoolExecutor and multiprocessing.Pool
msg333348 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-01-09 22:13
@Antoine Do you think we should start planning one of these long term solutions or we should start trying to use Process.sentinel as a short term solution for this particular issue?
msg333350 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-01-09 22:20
If using Process.sentinel looks easy enough, then go for it. Existing users of multiprocessing.Pool will benefit.
msg338106 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-16 22:34
New changeset 7c994549dcffd0d9d3bb37475e6374f356e7240e by Pablo Galindo in branch 'master': bpo-35493: Use Process.sentinel instead of sleeping for polling worker status in multiprocessing.Pool (#11488) https://github.com/python/cpython/commit/7c994549dcffd0d9d3bb37475e6374f356e7240e
msg361465 - (view) Author: Stefano Rivera (stefanor) * Date: 2020-02-06 01:20
This change seems to be causing a deadlock in multiprocessing shut-down: bpo-38501
msg364179 - (view) Author: Arkadiusz Miƛkiewicz (arekm) Date: 2020-03-14 15:01
And also https://bugs.python.org/issue38744 on Linux and FreeBSD
History
Date User Action Args
2022-04-11 14:59:09 admin set github: 79674
2021-09-21 22:26:17 vstinner unlink issue35479 dependencies
2020-03-14 15:01:39 arekm set nosy: + arekmmessages: +
2020-02-06 01:20:27 stefanor set nosy: + stefanormessages: +
2019-03-16 22:36:14 pablogsal set keywords:patch, patch, patchstatus: open -> closedresolution: fixedstage: patch review -> resolved
2019-03-16 22:34:26 pablogsal set messages: +
2019-01-09 23:40:36 pablogsal set keywords: + patchstage: patch reviewpull_requests: + <pull%5Frequest11003>
2019-01-09 23:40:27 pablogsal set keywords: + patchstage: (no value)pull_requests: + <pull%5Frequest11002>
2019-01-09 23:40:19 pablogsal set keywords: + patchstage: (no value)pull_requests: + <pull%5Frequest11001>
2019-01-09 22:20:29 pitrou set messages: +
2019-01-09 22:13:08 pablogsal set messages: +
2018-12-14 12:08:59 pitrou set messages: +
2018-12-14 12:05:47 pitrou set messages: +
2018-12-14 11:56:25 vstinner link issue35479 dependencies
2018-12-14 11:48:02 vstinner set messages: +
2018-12-14 11:39:03 pitrou set messages: +
2018-12-14 11:33:41 vstinner set messages: +
2018-12-14 11:33:11 vstinner set messages: +
2018-12-14 11:27:54 vstinner create