Issue 26633: multiprocessing behavior combining daemon with non-daemon children inconsistent with threading

It is unclear whether this is a documentation problem or incorrect behavior.

Per this Stack Overflow question ( https://stackoverflow.com/questions/36191447/why-doesnt-the-daemon-program-exit-without-join ), you get some rather odd behavior when you have both daemon and non-daemon child processes. In the case described, the following steps occur:

1. A daemon Process is launched which prints a message, waits two seconds, then prints a second message.
2. The main process sleeps one second.
3. A non-daemon process is launched which behaves the same as the daemon process, but sleeps six seconds before the second message.
4. The main process completes.
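
The steps above can be sketched roughly as follows. This is not the attached repro, only a minimal reconstruction from the description; the names worker and main and the message text are made up for illustration.

```python
import multiprocessing
import time


def worker(label, delay):
    """Print a message, sleep for `delay` seconds, then print a second message."""
    print(f"{label}: starting", flush=True)
    time.sleep(delay)
    print(f"{label}: exiting", flush=True)


def main():
    # Daemon child: wakes after two seconds.
    daemon = multiprocessing.Process(target=worker, args=("daemon", 2))
    daemon.daemon = True
    daemon.start()

    # The main process sleeps one second before launching the second child.
    time.sleep(1)

    # Non-daemon child: same behavior, but wakes after six seconds.
    non_daemon = multiprocessing.Process(target=worker, args=("non-daemon", 6))
    non_daemon.start()

    # main() returns without joining either child, so the main process
    # completes and multiprocessing's exit handler takes over.


if __name__ == "__main__":
    main()
```

Given the behavior reported here, a run of this sketch should show both "starting" lines and the non-daemon's "exiting" line, but not the daemon's.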

The expected behavior (to my mind, and to the questioner's on SO) is that since a non-daemon process is still running, the "process family" should stay alive until that process finishes. That would give the daemon process time to wake up and print its second message (five seconds before the non-daemon process wakes to finish its "work"). In fact, the atexit function that multiprocessing uses for cleanup first calls .terminate() on all daemon children and only then joins all children. So the moment the main process completes, the daemon child is terminated, even though the "process family" is still alive.

This seems counter-intuitive; in the threading case, which multiprocessing is supposed to emulate, all non-daemon threads are equivalent, so no daemon threads are cleaned up until the last non-daemon thread exits. To match the threading behavior, it seems like the cleanup code should first join all the non-daemon children, then terminate the daemon children, then join the daemon children.
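
For comparison, here is a minimal sketch of the threading case. It runs a small script in a fresh interpreter so that interpreter shutdown is actually exercised. The script text, the names THREADING_DEMO and run_threading_demo, and the shortened timings (0.2 s, 0.1 s, 0.6 s) are illustrative assumptions, not the values from the report.

```python
import subprocess
import sys

# Hypothetical demo script; timings are shortened stand-ins for the
# two-second / one-second / six-second delays described in the report.
THREADING_DEMO = '''
import threading
import time

def worker(label, delay):
    print(label + ": starting", flush=True)
    time.sleep(delay)
    print(label + ": exiting", flush=True)

daemon = threading.Thread(target=worker, args=("daemon", 0.2), daemon=True)
daemon.start()
time.sleep(0.1)
threading.Thread(target=worker, args=("non-daemon", 0.6)).start()
# The main thread ends here without joining anything.
'''


def run_threading_demo():
    """Run the demo in a fresh interpreter and return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", THREADING_DEMO],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_threading_demo():
        print(line)
```

Because the interpreter waits for the non-daemon thread before shutting down, the daemon thread gets to print its exit message here, which is the behavior the multiprocessing case is being compared against.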

This would change the code here ( https://hg.python.org/cpython/file/3.5/Lib/multiprocessing/util.py#l303 ) from:

        for p in active_children():
            if p.daemon:
                info('calling terminate() for daemon %s', p.name)
                p._popen.terminate()

        for p in active_children():
            info('calling join() for process %s', p.name)
            p.join()

to:

        # Wait on non-daemons first
        for p in active_children():
            if not p.daemon:
                info('calling join() for process %s', p.name)
                p.join()

        # Terminate and clean up daemons now that non-daemons are done
        for p in active_children():
            if p.daemon:
                info('calling terminate() for daemon %s', p.name)
                p._popen.terminate()
                info('calling join() for process %s', p.name)
                p.join()
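
The ordering of the proposed change can be checked in isolation with stand-in objects. The sketch below is a simulation only: FakeChild and proposed_cleanup are hypothetical names, and the fake terminate() stands in for the p._popen.terminate() call in the real code.

```python
class FakeChild:
    """Hypothetical stand-in for a child process that records calls made on it."""

    def __init__(self, name, daemon, log):
        self.name = name
        self.daemon = daemon
        self._log = log

    def terminate(self):
        self._log.append(("terminate", self.name))

    def join(self):
        self._log.append(("join", self.name))


def proposed_cleanup(children):
    """Apply the proposed order: join non-daemons, then stop daemons."""
    # Wait on non-daemons first.
    for child in children:
        if not child.daemon:
            child.join()
    # Terminate and reap daemons once the non-daemons are done.
    for child in children:
        if child.daemon:
            child.terminate()
            child.join()


if __name__ == "__main__":
    log = []
    children = [FakeChild("d", True, log), FakeChild("n", False, log)]
    proposed_cleanup(children)
    print(log)  # [('join', 'n'), ('terminate', 'd'), ('join', 'd')]
```

The recorded calls show the non-daemon being joined before the daemon is terminated, regardless of the order in which the children were started.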

I've attached repro code to demonstrate; using multiprocessing, the daemon never prints its exiting message, while switching to multiprocessing.dummy (backed by threading) correctly prints the exit message.