Issue 35479: multiprocessing.Pool.join() always takes at least 100 ms (original) (raw)

Created on 2018-12-13 00:36 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 11136	closed	vstinner,2018-12-13 00:53
PR 10564	closed	thatiparthy,2019-09-14 05:40

Messages (6)
msg331726 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-13 00:36
The join() method of multiprocessing.Pool calls self._worker_handler.join(): it's a thread running _handle_workers(). The core of this thread function is: while thread._state == RUN or (pool._cache and thread._state != TERMINATE): pool._maintain_pool() time.sleep(0.1) I understand that the delay of 100 ms is used to check regularly the stop condition changed. This sleep causes a mandatory delay of 100 ms on Pool.join().
msg331727 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-13 01:14
Attached PR 11136 modify _worker_handler() loop to wait on threading.Event events, so Pool.join() completes as soon as possible. Example: --- import multiprocessing import time def the_test(): start_time = time.monotonic() pool = multiprocessing.Pool(1) res = pool.apply_async(int, ("1",)) pool.close() #pool.terminate() pool.join() dt = time.monotonic() - start_time print("%.3f sec" % dt) the_test() --- Minimum timing with _handle_results() using: * current code (time.sleep(0.1)): min 0.132 sec * time.sleep(1.0): min 1.033 sec * my PR using events (wait(0.1)): min 0.033 sec Currently, join() minimum timing depends on _handle_results() sleep() duration (100 ms). With my PR, it completes as soon as possible: when state change and/or when a result is set. My PR still requires an hardcoded delay of 100 ms to workaround bpo-35478 bug: results are never set if the pool is terminated.
msg331794 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-14 11:06
My PR 11136 doesn't work: _maintain_pool() should be called frequently to check when a worker completed. Polling worker exit status seems inefficient :-( asyncio uses SIGCHLD signal to be notified when a child process completes. SafeChildWatcher calls os.waitpid(pid, os.WNOHANG) on each child process, whereas FastChildWatcher() uses os.waitpid(-1, os.WNOHANG).
msg331800 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-14 11:33
> My PR 11136 doesn't work: _maintain_pool() should be called frequently to check when a worker completed. Polling worker exit status seems inefficient :-( I created bpo-35493: "multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exit".
msg331805 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-14 11:56
_worker_handler has two issues: * It polls the worker status every status every 100 ms: I created bpo-35493 to investigate how to avoid that * After close() or terminate() has been called, it loops until self._cache is empty. I would like to use result.wait(), but a result never completes after terminate(): I created bpo-35478 to see if tasks can be unblocked in that case (to ensure that result.wait() completes after terminate().
msg402401 - (view)	Author: STINNER Victor (vstinner) *	Date: 2021-09-21 22:26
Nobody managed to find a solution in 3 years. I close the issue.

History
Date	User	Action	Args
2022-04-11 14:59:09	admin	set	github: 79660
2021-09-21 22:26:17	vstinner	set	status: open -> closedmessages: + dependencies: - multiprocessing: ApplyResult.get() hangs if the pool is terminated, multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exitresolution: out of datestage: patch review -> resolved
2019-09-14 05:40:45	thatiparthy	set	pull_requests: + <pull%5Frequest15744>
2018-12-14 11:56:25	vstinner	set	dependencies: + multiprocessing: ApplyResult.get() hangs if the pool is terminated, multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exitmessages: +
2018-12-14 11:33:27	vstinner	set	messages: +
2018-12-14 11:06:45	vstinner	set	messages: +
2018-12-13 01:14:17	vstinner	set	messages: +
2018-12-13 00:53:48	vstinner	set	keywords: + patchstage: patch reviewpull_requests: + <pull%5Frequest10368>
2018-12-13 00:36:54	vstinner	create