[Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals
Sean Harrington seanharr11 at gmail.com
Sat Sep 29 08:23:49 EDT 2018
On Fri, Sep 28, 2018 at 9:27 PM Michael Selik <mike at selik.org> wrote:
> On Fri, Sep 28, 2018 at 2:11 PM Sean Harrington <seanharr11 at gmail.com> wrote:
> > kwarg on Pool.__init__ called expect_initret, that defaults to False. When set to True:
> >    Capture the return value of the initializer kwarg of Pool
> >    Pass this value to the function being applied, as a kwarg.
>
> The parameter name you chose, "initret", is awkward, because nowhere else in Python does an initializer return a value. Initializers mutate an encapsulated scope. For a class __init__, that scope is an instance's attributes. For a subprocess managed by Pool, that encapsulated scope is its "globals". I'm using quotes to emphasize that these "globals" aren't shared.
Yes - if you bucket the "initializer" arg of Pool in with the other "Python initializers", then I see your point here. And yes, the initializer mutates the global scope of the worker subprocess. Again, my gripe is not with globals. I am looking for the ability to have a clear, explicit flow of data from parent -> child process, without being constrained to using globals.
> On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington <seanharr11 at gmail.com> wrote:
> > On Fri, Sep 28, 2018 at 6:45 PM Antoine Pitrou <solipsis at pitrou.net> wrote:
> >> 3. If you don't like globals, you could probably do something like
> >> lazily-initialize the resource when a function needing it is executed
> >
> > if initializing the resource is expensive, we only want to do this ONE time per worker process.
>
> We must have a different concept of "lazily-initialize". I understood Antoine's suggestion to be a one-time initialize per worker process.
See my response to Antoine earlier. I missed the point he made. This is a valid solution to the problem of "initializing objects after a worker has been forked", but it fails to address the problem of "create big object in parent, pass to each worker".
> On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington <seanharr11 at gmail.com> wrote:
> > My simple argument is that the developer should not be constrained to make the objects passed globally available in the process, as this MAY break encapsulation for large projects.
>
> I could imagine someone switching from Pool to ThreadPool and getting into trouble, but in my mind using threads is caveat emptor. Are you worried about breaking encapsulation in a different scenario?
Without a specific example on hand, you could imagine a tree of function calls that occur in the worker process (even newly created objects) that should not necessarily have access to objects passed from parent -> worker. Under the current implementation, they always will.
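To make the contrast concrete, here is a toy, single-process model of the proposed behaviour. It is not the real Pool implementation and not a patch: run_worker, open_resource, helper and task are hypothetical names, and passing the value under the keyword initret is an assumption based on the proposal text quoted above.

```python
def run_worker(tasks, initializer=None, initargs=(), expect_initret=False):
    """Toy model of one pool worker, run in-process for illustration only."""
    # Call the initializer once and keep its return value in a local variable.
    initret = initializer(*initargs) if initializer is not None else None
    results = []
    for func, args in tasks:
        if expect_initret:
            # Proposed behaviour: hand the value to the task explicitly.
            results.append(func(*args, initret=initret))
        else:
            # Current behaviour: the return value is ignored.
            results.append(func(*args))
    return results


def open_resource(name):
    return {"name": name}  # stand-in for an expensive object


def helper(x):
    # Deeper in the call tree: it sees the resource only if task passes it on.
    return x * 10


def task(x, initret=None):
    return (initret["name"], helper(x))


if __name__ == "__main__":
    jobs = [(task, (1,)), (task, (2,))]
    print(run_worker(jobs, initializer=open_resource, initargs=("db",), expect_initret=True))
```

In this model only task receives the object; helper and anything below it get access only if task chooses to forward it.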