Running python python_thread_bug.py -j4 often results in one of the threads failing to start until another thread finishes. The bug appears to be that subprocess's pipe_cloexec function is racy: if another thread forks between os.pipe() and _set_cloexec_flag, the child process created by that fork can inherit the write end of the pipe and keep it open. The Popen call that got rudely interrupted then never sees EOF on the pipe while that write end stays open, so it waits until the whole resulting process tree dies.
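For reference, the racy pattern looks roughly like the sketch below. This is a simplified, standalone reconstruction of the os.pipe() plus _set_cloexec_flag sequence described above, not the verbatim subprocess source; the function names here are written as module-level helpers purely for illustration. Note that on Python 3.4 and later os.pipe() already returns non-inheritable descriptors, so the window shown in the comment only matters on older versions.

```python
import fcntl
import os


def set_cloexec_flag(fd):
    """Set FD_CLOEXEC on fd using a separate fcntl call (POSIX only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def pipe_cloexec():
    """Create a pipe, then mark both ends close-on-exec.

    The two steps are not atomic. On interpreters where os.pipe()
    returns inheritable descriptors, a fork+exec in another thread
    between os.pipe() and the fcntl calls lets the new process
    inherit both ends of the pipe.
    """
    r, w = os.pipe()
    # <-- race window: another thread may fork and exec here
    set_cloexec_flag(r)
    set_cloexec_flag(w)
    return r, w
```

Closing the window properly needs the descriptors to be created with the flag already set (for example via the pipe2() system call), which is why a lock around the whole operation is only a workaround.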
Hmm. I thought someone had already reported this, but I can't find an issue. I suspect it is fixed in 3.4 and it may not be practical to fix it in earlier versions.
FWIW, sticking a mutex in Popen.__init__ (wrapping the whole thing) seems to work around this issue, at least for programs that don't fork by any other route (multiprocessing or os.fork, for example). This might be a good-enough fix and be safe enough to stick in the standard library.
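Until something like that lands, the same idea can be approximated from user code. The following is a minimal sketch, assuming every subprocess in the program is started through this one wrapper; locked_popen and _popen_lock are hypothetical names, not part of the subprocess module.

```python
import subprocess
import sys
import threading

# Hypothetical module-level lock shared by every thread that spawns processes.
_popen_lock = threading.Lock()


def locked_popen(*args, **kwargs):
    """Create a Popen object while holding a global lock.

    Popen's constructor creates its pipes and forks, so serializing the
    constructor keeps two threads from interleaving those steps. This
    does nothing for forks made elsewhere (os.fork, multiprocessing).
    """
    with _popen_lock:
        return subprocess.Popen(*args, **kwargs)


if __name__ == "__main__":
    proc = locked_popen([sys.executable, "-c", "print('ok')"],
                        stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    print(out)
```

The lock is held only while the process is being created, not while it runs, so children still execute in parallel.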
A pointer in the 2.7 subprocess docs to subprocess32 does seem like a good idea; it's what I tell everyone to do anyway. :) If you've got a patch for this in 2.7, feel free to add it here and I can take a look. Leaving this open as a reminder to me to update the docs at the very least.