On Fri, Sep 28, 2018 at 9:27 PM Michael Selik <mike@selik.org> wrote:
On Fri, Sep 28, 2018 at 2:11 PM Sean Harrington <seanharr11@gmail.com> wrote:
> kwarg on Pool.__init__ called `expect_initret`, that defaults to False. When set to True:
> Capture the return value of the initializer kwarg of Pool
> Pass this value to the function being applied, as a kwarg.
The parameter name you chose, "initret", is awkward, because nowhere
else in Python does an initializer return a value. Initializers mutate
an encapsulated scope. For a class \_\_init\_\_, that scope is an
instance's attributes. For a subprocess managed by Pool, that
encapsulated scope is its "globals". I'm using quotes to emphasize
that these "globals" aren't shared.
>> Yes - if you bucket the "initializer" arg of Pool into the "Python initializers" category, then I see your point here. And yes, initializer mutates the global scope of the worker subprocess. Again, my gripe is not with globals. I am looking for the ability to have a clear, explicit flow of data from parent -> child process, without being constrained to using globals.
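
One way to get an explicit parent -> child data flow with today's API, sketched here as an illustration rather than as anything proposed in this thread, is to bind the object to the task function with functools.partial. The names `lookup_in`, `run_explicit`, and `table` are hypothetical. The trade-off is that the bound object is pickled along with the function each time a batch of tasks is sent to a worker, instead of once per worker, which can be costly for a large object.

```python
import functools
import multiprocessing as mp


def lookup_in(key, table):
    """Task function that receives the resource as an ordinary argument."""
    return table[key]


def run_explicit(table, keys, processes=2):
    """Bind `table` to the task function; no worker globals are involved."""
    task = functools.partial(lookup_in, table=table)
    with mp.Pool(processes) as pool:
        # The partial (including `table`) is pickled with each task batch.
        return pool.map(task, keys)


if __name__ == "__main__":
    print(run_explicit({"a": 1, "b": 2}, ["a", "b", "a"]))
```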
On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington <seanharr11@gmail.com> wrote:
> On Fri, Sep 28, 2018 at 6:45 PM Antoine Pitrou <solipsis@pitrou.net> wrote:
>> 3. If you don't like globals, you could probably do something like
>> lazily-initialize the resource when a function needing it is executed
>
> if initializing the resource is expensive, we only want to do this ONE time per worker process.
We must have a different concept of "lazily-initialize". I understood
Antoine's suggestion to be a one-time initialize per worker process.
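
A minimal sketch of one reading of "lazily-initialize", assuming the resource can be built inside the worker: the first task that needs it pays the setup cost, and later tasks in the same process reuse the cached result. The names `get_resource` and `square` are hypothetical, and the dictionary stands in for whatever expensive setup is actually needed.

```python
import functools
import multiprocessing as mp
import os


@functools.lru_cache(maxsize=None)
def get_resource():
    """Built on the first call in a process, then cached for that process."""
    # Stand-in for expensive setup (opening a connection, loading a model, ...).
    return {"pid": os.getpid(), "table": {n: n * n for n in range(100)}}


def square(n):
    """Task function: fetch the cached resource instead of reading a global."""
    return get_resource()["table"][n]


if __name__ == "__main__":
    with mp.Pool(2) as pool:
        print(pool.map(square, range(5)))
```

This covers resources created after the worker starts; it does not by itself move an object that already exists in the parent into the workers.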
>> See my response to Antoine earlier. I missed the point he made. This is a valid solution to the problem of "initializing objects after a worker has been forked", but it fails to address the "create big object in parent, pass to each worker" case.
On Fri, Sep 28, 2018 at 4:39 PM Sean Harrington <seanharr11@gmail.com> wrote:
> My simple argument is that the developer should not be constrained to make the objects passed globally available in the process, as this MAY break encapsulation for large projects.
I could imagine someone switching from Pool to ThreadPool and getting
into trouble, but in my mind using threads is caveat emptor. Are you
worried about breaking encapsulation in a different scenario?
>> Without a specific example on hand, you could imagine a tree of function calls that occur in the worker process (even newly created objects) that should not necessarily have access to objects passed from parent -> worker. Under the current implementation, they always will.