[Python-Dev] socketserver ForkingMixin waiting for child processes

Victor Stinner victor.stinner at gmail.com
Fri Aug 11 09:44:49 EDT 2017


Hi,

I'm working on reducing the failure rate of Python CIs (Travis CI, AppVeyor, buildbots). For that, I'm trying to reduce test side effects using "environment altered" warnings. This week, I worked on support.reap_children() which detects leaked child processes (usually created with os.fork()).

I found a bug in the socketserver module: it waits for child processes to complete, but only in non-blocking mode. If a child process takes too long, the server never reads its exit status, and so the server leaks "zombie processes". Leaked processes can increase memory usage, cause the creation of new processes to fail, etc.
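
To illustrate the non-blocking behaviour described above, here is a minimal sketch (UNIX only, and not the socketserver code itself). The function name demo_nonblocking_poll() is made up for this example. A pipe keeps the child alive until the parent has polled once, so the first waitpid() call deterministically finds nothing to reap:

```python
import os


def demo_nonblocking_poll():
    """Poll a still-running child with WNOHANG, then reap it for real."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block until the parent closes its end of the pipe.
        os.close(write_fd)
        os.read(read_fd, 1)
        os._exit(0)

    # Parent.
    os.close(read_fd)
    # Non-blocking: the child is still running, so this returns (0, 0)
    # and the exit status is NOT collected.
    polled_pid, _ = os.waitpid(pid, os.WNOHANG)

    # Let the child finish, then wait in blocking mode to reap it.
    os.close(write_fd)
    reaped_pid, status = os.waitpid(pid, 0)
    return polled_pid, reaped_pid == pid, os.WEXITSTATUS(status)
```

If the parent only ever made the first, non-blocking call and never retried, the child would stay a zombie after exiting.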

=> http://bugs.python.org/issue31151

I changed the code to call waitpid() in blocking mode on each child process in server_close(), to ensure that all children have completed by the time the server is closed:
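
The idea can be sketched roughly as follows. This is a simplified stand-in and not the code from the commit: the ChildTracker class and its spawn(), collect_nonblocking() and close() methods are invented for illustration, with close() playing the role of server_close().

```python
import os
import time


class ChildTracker:
    """Hypothetical helper that tracks forked children (UNIX only)."""

    def __init__(self):
        self.active_children = set()

    def spawn(self, target):
        """Fork a child that runs target() and then exits."""
        pid = os.fork()
        if pid == 0:
            try:
                target()
            finally:
                os._exit(0)
        self.active_children.add(pid)
        return pid

    def collect_nonblocking(self):
        """Reap only the children that have already exited."""
        for pid in list(self.active_children):
            try:
                reaped, _ = os.waitpid(pid, os.WNOHANG)
            except ChildProcessError:
                reaped = pid  # already reaped elsewhere
            if reaped == pid:
                self.active_children.discard(pid)

    def close(self):
        """Block until every remaining child has exited."""
        for pid in list(self.active_children):
            try:
                os.waitpid(pid, 0)
            except ChildProcessError:
                pass  # already reaped elsewhere
            self.active_children.discard(pid)


def demo_close():
    tracker = ChildTracker()
    for _ in range(3):
        tracker.spawn(lambda: time.sleep(0.1))
    tracker.close()
    return len(tracker.active_children)
```

Note that close() has no upper bound on how long it waits: it returns only once the slowest child has exited.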

https://github.com/python/cpython/commit/aa8ec34ad52bb3b274ce91169e1bc4a598655049

After pushing my change, I'm no longer sure that it's a good idea. There is a risk that server_close() blocks if a child is stuck in a socket recv() or send() for some reason.

Should we relax the code by waiting a few seconds (problem: hardcoded timeouts are always a bad idea), or terminate processes (SIGKILL on UNIX) if they don't complete fast enough?
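
For discussion, the "wait a bit, then kill" alternative could look something like the sketch below. It is only an assumption about how such a fallback might be written, not a proposed patch: reap_with_deadline(), spawn_sleeper() and the poll_interval parameter are hypothetical names, and the timeout value is exactly the kind of hardcoded number mentioned above.

```python
import os
import signal
import time


def spawn_sleeper(seconds):
    """Fork a child that sleeps for `seconds` and exits (test helper)."""
    pid = os.fork()
    if pid == 0:
        time.sleep(seconds)
        os._exit(0)
    return pid


def reap_with_deadline(pid, timeout, poll_interval=0.05):
    """Wait up to `timeout` seconds for `pid`, then SIGKILL it.

    Returns the wait status of the child in both cases.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            return status
        time.sleep(poll_interval)

    # The child did not finish in time: kill it, then reap it so that
    # no zombie is left behind.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return status
```

A caller can tell the two outcomes apart with os.WIFEXITED(status) and os.WIFSIGNALED(status).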

I don't know which applications use socketserver. How can I test whether it breaks code in the wild?

At least, I didn't notice any regression on Python CIs.

Well, maybe the change is ok for the master branch. But I would like your opinion, because I would now like to backport the fix to the 2.7 and 3.6 branches, and it might break some applications.

If we cannot backport such a change to 2.7 and 3.6 because it changes the behaviour, I will fix the bug in test_socketserver.py instead.

What do you think?

Victor


