msg158049 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-11 16:32 |
When running test_multiprocessing on Linux I occasionally see a stream of errors caused by ignored weakref callbacks: Exception AssertionError: AssertionError() in <Finalize object, dead> ignored These do not cause the unittests to fail. Finalizers from the parent process are supposed to be cleared after the fork. But if a garbage collection before that then Finalizer callbacks can be run in the "wrong" process. Disabling gc during fork seems to prevent the errors. Or maybe the Finalizer should record the pid of the process which created it and only invoke the callback if it matches the current pid. (Compare Issure 1336 conscerning subprocess.) |
|
|
msg158052 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-11 16:40 |
> Disabling gc during fork seems to prevent the errors. Sounds reasonable to me. |
|
|
msg158064 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-11 17:50 |
Patch to disable gc. |
|
|
msg158071 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-11 20:27 |
Shouldn't there be a try..finally, in case os.fork() fails? |
|
|
msg158080 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-04-11 21:40 |
Hmm... I don't really like disabling GC, because it has a process-wide side effect, and hence isn't thread-safe: if another thread forks() or creates a subprocess right at the wrong time, it could end up with the GC disabled for good... |
|
|
msg158081 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-11 21:46 |
> Hmm... > I don't really like disabling GC, because it has a process-wide side > effect, and hence isn't thread-safe: if another thread forks() or > creates a subprocess right at the wrong time, it could end up with the > GC disabled for good... That's a problem indeed. Perhaps we need a global "fork lock" shared between subprocess and multiprocessing? |
|
|
msg158087 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-11 23:07 |
> That's a problem indeed. Perhaps we need a global "fork lock" shared > between subprocess and multiprocessing? I did an atfork patch which included a (recursive) fork lock. See http://bugs.python.org/review/6721/show The patch included changes to multiprocessing and subprocess. (Being able to acquire the lock when doing fd manipulation is quite useful. For instance, the creation of Process.sentinel currently has a race which can mean than another process inherits the write end of the pipe. That would cause Process.join() to wait till both processes terminate.) Actually, for Finalizers I think it would be easier to just record and check the pid. |
|
|
msg158113 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-04-12 08:01 |
>> That's a problem indeed. Perhaps we need a global "fork lock" shared >> between subprocess and multiprocessing? > > I did an atfork patch which included a (recursive) fork lock. See > > http://bugs.python.org/review/6721/show > > The patch included changes to multiprocessing and subprocess. (Being able to acquire the lock when doing fd manipulation is quite useful. For instance, the creation of Process.sentinel currently has a race which can mean than another process inherits the write end of the pipe. That would cause Process.join() to wait till both processes terminate.) Indeed, I had a look and it looked good. I just had a couple minor comments, I'll try to get back to this later today, or by the end of the week. > Actually, for Finalizers I think it would be easier to just record and check the pid. I'd prefer this too. |
|
|
msg158159 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-12 18:22 |
Alternative patch which records pid when Finalize object is created. The callback does nothing if recorded pid does not match os.getpid(). |
|
|
msg158164 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-12 19:15 |
But what if Finalize is used to cleanup a resource that gets duplicated in children, like a file descriptor? See e.g. forking.py, line 137 (in Popen.__init__()) or heap.py, line 244 (BufferWrapper.__init__()). |
|
|
msg158180 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-12 22:30 |
> But what if Finalize is used to cleanup a resource that gets > duplicated in children, like a file descriptor? > See e.g. forking.py, line 137 (in Popen.__init__()) > or heap.py, line 244 (BufferWrapper.__init__()). This was how Finalize objects already acted (or were supposed to). In the case of BufferWrapper this is intended. BufferWrapper objects do not have reference counting semantics. Instead the memory is deallocated when the object is garbage collected in the process that created it. (Garbage collection in a child process should *not* invalidate memory owned by the parent process.) You can prevent the parent process from garbage collecting the object too early by following the advice below from the documentation: Explicitly pass resources to child processes On Unix a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process. Apart from making the code (potentially) compatible with Windows this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process. This might be important if some resource is freed when the object is garbage collected in the parent process. In the case of the sentinel in Popen.__init__(), it is harmless if this end of the pipe gets accidentally inherited by another process. Since Process does not have a closefds argument like subprocess.Popen unintended leaking happens all the time. And even without the pid check, I think this finalizer would very rarely be triggered in a child process. (A Process object can only be garbage collected after it has been joined, and it can only be joined by it parent process.) |
|
|
msg158567 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-04-17 20:17 |
> Alternative patch which records pid when Finalize object is created. Looks good to me. Note that it could eventually be rewritten to use an atfork() module, when we finally merge your patch for this other issue :-) |
|
|
msg161580 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-05-25 12:58 |
New changeset 59567c117b0e by Richard Oudkerk in branch 'default': Issue #14548: Make multiprocessing finalizers check pid before running http://hg.python.org/cpython/rev/59567c117b0e |
|
|
msg208868 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-01-23 00:13 |
New changeset 751371dd4d1c by Richard Oudkerk in branch '2.7': Issue #14548: Make multiprocessing finalizers check pid before http://hg.python.org/cpython/rev/751371dd4d1c |
|
|