[Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals
Antoine Pitrou antoine at python.org
Fri Oct 12 09:24:18 EDT 2018
On 12/10/2018 at 15:17, Sean Harrington wrote:
> The implementation details need to be fleshed out, but agnostic of these, do you believe this is a valid solution to the initial problem? Do you also see it as a beneficial optimization to Pool, given that we don't need to store funcs/bound-methods/partials on the tasks themselves?
I'm not sure, TBH. I also think it may be better to leave this to higher levels (for example, Dask will intelligently distribute data on workers and let you work with a kind of proxy object in the main process, transferring data only when necessary).
> The latter concern about "what happens if `self` changed value in the parent" is the same concern as "what happens if `func` changes in the parent?" given the current implementation. This is an assumption that is currently made with Pool.map_async(func, ls). If "func" changes in the parent, there is no communication with the child. So one just needs to be aware that calling "map_async(self.func, ls)" while the state of "self" is changing will not communicate changes to each worker. The state is frozen when Pool.map is called, just as is the case now.
If you cache "self" between pool.map calls, then the question is not "what happens if self changes during a map() call" but "what happens if self changes between two map() calls"? While the former is intuitively undefined, current users would expect the latter to have a clear answer, which is: the latest version of self when map() is called is taken into account.
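To make the distinction concrete, here is a minimal sketch that emulates both behaviours with pickle directly rather than with a real Pool. The names Scaler, run_uncached and CachingWorker are hypothetical and exist only for illustration; in particular, CachingWorker is an assumption about how a cached "self" might behave, not the proposed implementation.

```python
import pickle


class Scaler:
    """Toy object whose bound method is used as the task function."""

    def __init__(self, factor):
        self.factor = factor

    def scale(self, x):
        return x * self.factor


def run_uncached(bound_method, items):
    # Emulates the current behaviour: the callable (and therefore `self`)
    # is pickled again for every call, so the copy reflects the state of
    # `self` at the time of this call.
    worker_copy = pickle.loads(pickle.dumps(bound_method))
    return [worker_copy(x) for x in items]


class CachingWorker:
    # Hypothetical worker that unpickles the callable once and reuses it
    # for all later calls.
    def __init__(self):
        self._cached = None

    def run(self, bound_method, items):
        if self._cached is None:
            self._cached = pickle.loads(pickle.dumps(bound_method))
        return [self._cached(x) for x in items]


scaler = Scaler(2)
worker = CachingWorker()

first_uncached = run_uncached(scaler.scale, [1, 2, 3])
first_cached = worker.run(scaler.scale, [1, 2, 3])

scaler.factor = 10  # `self` changes between the two calls

second_uncached = run_uncached(scaler.scale, [1, 2, 3])
second_cached = worker.run(scaler.scale, [1, 2, 3])

print(first_uncached, second_uncached)
print(first_cached, second_cached)
```

In this sketch the uncached path picks up the new factor on the second call, while the caching worker keeps computing with the stale copy, which is the case users would likely find surprising.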
Regards
Antoine.