Issue 17273: Pool methods can only be used by parent process.

Created on 2013-02-22 06:20 by abn, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
pool_forking.py | abn, 2013-02-22 06:20 | Example script highlighting the issue
Messages (8)
msg182647 - (view) Author: Arun Babu Neelicattu (abn) * Date: 2013-02-22 06:20
The task/worker handler threads in the multiprocessing.pool.Pool class are (in accordance with POSIX standards) not copied over when the process containing the pool is forked. This leads to a situation where the Pool keeps receiving tasks but the tasks never get handled. This could potentially lead to deadlocks if AsyncResult.wait() is called.

Not sure if this should be considered a bug or an invalid use case. However, this becomes a problem when importing modules that use pools while the main code uses multiprocessing too.

[BAD] Workaround: Reassigning Pool._task_handler to a new instance of threading.Thread after the fork seems to work in the case highlighted in the example.

Environment:
Fedora 18
Linux 3.7.8-202.fc18.x86_64 #1 SMP Fri Feb 15 17:33:07 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
python3-3.3.0-1.fc18.x86_64

An example of this issue is shown below:

    from multiprocessing import Pool, Process

    def t2():
        # We expect the pool to handle this
        print('t2: Hello!')

    pool = Pool()

    def t1():
        # We assign a task to the pool
        pool.apply_async(t2)
        print('t1: Hello!')

    if __name__ == '__main__':
        # Process() forks the main process containing the pool
        Process(target=t1).start()
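One way to turn the silent hang described above into an immediate error is to record the creating process's PID and check it before every submission. The following is only a sketch under that assumption: `OwnerCheckedPool` and `_check_owner` are hypothetical names, not part of multiprocessing, and only `apply_async` is wrapped.

```python
import os


class OwnerCheckedPool:
    """Hypothetical wrapper that rejects submissions from a non-owner process."""

    def __init__(self, pool):
        # `pool` is any object with a Pool-like apply_async() method.
        self._pool = pool
        # Remember which process created the wrapper (and, by assumption, the pool).
        self._owner_pid = os.getpid()

    def _check_owner(self):
        # After a fork the child has a different PID, and the pool's handler
        # threads do not exist there, so submitting would queue work forever.
        if os.getpid() != self._owner_pid:
            raise RuntimeError(
                "pool created in PID %d used from PID %d"
                % (self._owner_pid, os.getpid())
            )

    def apply_async(self, func, args=(), kwds=None):
        self._check_owner()
        return self._pool.apply_async(func, args, kwds or {})
```

Wrapping a real pool as `OwnerCheckedPool(multiprocessing.Pool())` would make `t1` in the example raise `RuntimeError` in the child instead of queuing a task that is never handled. It does not make the pool usable from the child.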
msg182656 - (view) Author: Arun Babu Neelicattu (abn) * Date: 2013-02-22 07:35
I should have mentioned this too.

[GOOD] Workaround: Probably the 'correct' way to achieve what is required in the example could be to use a managed pool.

    pool = multiprocessing.Manager().Pool()
msg182659 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-22 09:50
A pool should only be used by the process that created it (unless you use a managed pool).

If you are creating long-lived processes then you could create a new pool on demand. For example (untested):

    import os
    from multiprocessing import Pool

    # (pool, pid of the process that created it)
    pool_pid = (None, None)

    def get_pool():
        global pool_pid
        # Create a fresh pool if this process does not own the current one
        if os.getpid() != pool_pid[1]:
            pool_pid = (Pool(), os.getpid())
        return pool_pid[0]
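The same create-on-demand idea can be sketched as a small class, which keeps the state out of module globals and lets the pool factory be swapped out for testing. This is a hedged variant, not the snippet from the message: `PerProcessPool`, `factory` and `get` are hypothetical names introduced for illustration.

```python
import os
from multiprocessing import Pool


class PerProcessPool:
    """Hypothetical holder that lazily creates one pool per process."""

    def __init__(self, factory=Pool):
        # `factory` is called with no arguments to build a new pool.
        self._factory = factory
        self._pool = None
        self._owner_pid = None

    def get(self):
        pid = os.getpid()
        # The first call, or the first call after a fork, sees a PID mismatch
        # and builds a pool owned by the calling process.
        if self._owner_pid != pid:
            self._pool = self._factory()
            self._owner_pid = pid
        return self._pool
```

A module could create `pools = PerProcessPool()` at import time and call `pools.get().apply_async(...)` wherever it needs workers. No pool exists until the first `get()`, and a forked child that calls `get()` builds its own pool, leaving the one inherited from the parent unused.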
msg182697 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-02-22 20:23
Arun, to call this a bug, you need to demonstrate a conflict between behavior and doc, and I do not see that you have. Richard, are you suggesting that we close this, or do you see an actionable issue? (a plausible patch to the repository?)
msg182701 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-22 21:45
> Richard, are you suggesting that we close this, or do you see an > actionable issue? (a plausible patch to the repository?) I skimmed the documentation and could not see that this restriction has been documented. So I think a documentation patch would be a good idea.
msg182707 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-02-23 01:43
Arun, can you suggest a sentence to add and where to add it?
msg182852 - (view) Author: Arun Babu Neelicattu (abn) * Date: 2013-02-24 04:24
Terry, I think the best place to make a note of this would be at [1,2].

As for what should be noted, something along the lines of what Richard mentioned should suffice.

"A pool should only be used by the process that created it (unless you use a managed pool)."

I am not certain what the best way to phrase this would be, but it would also be helpful to note that this will cause unexpected behavior if a script imports a module that uses a Pool and forks (i.e. uses Process() or another Pool()). This is how I bumped into this issue.

Hope this helps.

[1] http://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers
[2] http://docs.python.org/3/library/multiprocessing.html#using-a-pool-of-workers
msg192187 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-02 11:42
New changeset 389788ba6bcb by Richard Oudkerk in branch '2.7':
Issue #17273: Clarify that pool methods can only be used by parent process.
http://hg.python.org/cpython/rev/389788ba6bcb

New changeset 57fe80fda9be by Richard Oudkerk in branch '3.3':
Issue #17273: Clarify that pool methods can only be used by parent process.
http://hg.python.org/cpython/rev/57fe80fda9be

New changeset 7ccf3d36ad13 by Richard Oudkerk in branch 'default':
Issue #17273: Clarify that pool methods can only be used by parent process.
http://hg.python.org/cpython/rev/7ccf3d36ad13
History
Date User Action Args
2022-04-11 14:57:42 admin set github: 61475
2013-07-02 11:45:39 sbt set status: open -> closed; versions: + Python 2.7, Python 3.4; type: behavior -> ; title: multiprocessing.pool.Pool task/worker handlers are not fork safe -> Pool methods can only be used by parent process.; resolution: fixed; stage: resolved
2013-07-02 11:42:32 python-dev set nosy: + python-dev; messages: +
2013-02-24 04:24:56 abn set messages: +
2013-02-23 01:43:34 terry.reedy set nosy: + docs@python; messages: + ; assignee: docs@python; components: + Documentation, - Library (Lib)
2013-02-22 21:45:09 sbt set messages: +
2013-02-22 20:23:20 terry.reedy set nosy: + terry.reedy; messages: +
2013-02-22 09:50:12 sbt set messages: +
2013-02-22 07:35:46 abn set messages: +
2013-02-22 07:03:11 abn set nosy: + jnoller, sbt
2013-02-22 06:20:53 abn create