msg159230 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-04-24 23:01 |
[233/364] test_multiprocessing
...
[265/364] test_typechecks
[266/364] test_socket
Timeout (1:00:00)!
Thread 0x0000000807235000:
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/socket.py", line 135 in accept
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/multiprocessing/connection.py", line 595 in accept
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/multiprocessing/connection.py", line 469 in accept
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/multiprocessing/reduction.py", line 256 in _serve
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/threading.py", line 592 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/threading.py", line 635 in _bootstrap_inner
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/threading.py", line 612 in _bootstrap

Thread 0x0000000801407400:
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_socket.py", line 1208 in check_sendall_interrupted
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_socket.py", line 1219 in test_sendall_interrupted
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 385 in _executeTestPart
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 440 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 492 in __call__
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/runner.py", line 168 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1333 in _run_suite
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1367 in run_unittest
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_socket.py", line 4813 in test_main
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/regrtest.py", line 1237 in runtest_inner
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/regrtest.py", line 907 in runtest
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/regrtest.py", line 710 in main
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/__main__.py", line 13 in
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/runpy.py", line 73 in _run_code
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/runpy.py", line 160 in _run_module_as_main
*** Error code 1

http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%209.0%203.x/builds/2339/steps/test/logs/stdio |
|
|
msg159231 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-04-24 23:26 |
There was a similar issue: #11753, but it was a bug in the faulthandler module. Here it looks like a bug in TestSocketSharing of test_socket which uses multiprocessing. |
|
|
msg159232 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-24 23:34 |
Ah, this is because of the new daemon thread in ResourceSharer. That thread is never stopped and could receive signals while tests expect them to be delivered to the main thread. Either we add a (private?) facility to stop that thread, or we block signal delivery in that thread using the signal module's pthread_sigmask. What do you think? |
|
|
msg159233 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-24 23:36 |
The pthread_sigmask() solution would allow the use of multiprocessing while keeping signal delivery deterministic. |
|
|
msg159241 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-25 00:18 |
This patch adds a ResourceSharer.stop() method. This is called from tearDownClass() in the unittest. |
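For readers who want to see the shape of that teardown hook, here is a minimal sketch. It assumes the private helper is reachable as multiprocessing.resource_sharer.stop(), which is where recent CPython versions appear to expose it; the patch discussed here imported it from multiprocessing.reduction instead. The test class name, the placeholder test, and the 5-second timeout are illustrative only.

```python
import unittest


class SocketSharingSketch(unittest.TestCase):
    """Hypothetical test case; only the tearDownClass hook matters here."""

    @classmethod
    def tearDownClass(cls):
        # Private stdlib API. The module location is an assumption based on
        # recent CPython; older versions kept it in multiprocessing.reduction.
        from multiprocessing import resource_sharer
        # Calling stop() is harmless if the background thread never started.
        resource_sharer.stop(timeout=5)

    def test_placeholder(self):
        self.assertTrue(True)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SocketSharingSketch)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Stopping the thread in tearDownClass keeps it from outliving the tests that started it, so later tests in the same process do not share a process with a leftover thread.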
|
|
msg159267 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-25 11:37 |
New version of patch which does signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG)) in the thread (is that right?). It also uses a timeout when trying to join the thread. |
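As a rough illustration of blocking signals in one thread only, the sketch below starts a daemon thread that masks all signals and records its own mask. It is not the patch itself: the helper name run_with_signals_blocked is made up, and it passes signal.valid_signals() (Python 3.8+) rather than range(1, signal.NSIG), since newer Python versions may warn about invalid signal numbers in that range. It assumes a POSIX platform where signal.pthread_sigmask is available.

```python
import signal
import threading


def run_with_signals_blocked(target):
    """Run target() in a daemon thread that has all signals blocked.

    Returns (thread, result); result is a dict filled in by the thread.
    Hypothetical helper for illustration only.
    """
    result = {}

    def worker():
        # The signal mask is per-thread, so the main thread is unaffected.
        signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
        # Blocking an empty set returns the current mask without changing it.
        result["mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])
        result["value"] = target()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, result


main_mask_before = signal.pthread_sigmask(signal.SIG_BLOCK, [])
thread, result = run_with_signals_blocked(lambda: "served")
thread.join()
main_mask_after = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

With the mask in place, the kernel has to pick another thread to receive a process-directed signal such as SIGINT, which is what keeps delivery predictable for code that expects signals in the main thread.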
|
|
msg159271 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-25 12:14 |
> in the thread (is that right?).

This looks like it.

> It also uses a timeout when trying to join the thread.

Perhaps some kind of warning can be printed if joining fails after the timeout? |
|
|
msg159279 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-25 13:03 |
Warning added to patch. |
|
|
msg159284 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-25 13:39 |
Hmm, I thought either multiprocessing's logging facilities or the warnings module could be used. That way, people have control over the verbosity of stderr messages. |
|
|
msg159286 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-04-25 13:57 |
mp_resource_sharer_stop.patch: this patch changes two different things, so it should be split: one patch to fix test_socket, and one patch to call pthread_sigmask().

I don't think that you should call pthread_sigmask(). It looks like a workaround for this issue, whereas resource_sharer.stop() is the correct fix. |
|
|
msg159297 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-25 15:16 |
> I don't think that you should call pthread_sigmask(). It looks like a
> workaround for this issue, whereas resource_sharer.stop() is the
> correct fix.

The problem is not only with test_multiprocessing and test_socket; any test which uses multiprocessing could have side effects on any subsequent test which uses signals. Also, applicative code could be affected. So I think pthread_sigmask() *is* the solution. |
|
|
msg159321 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-04-25 17:01 |
mp_resource_sharer_stop.patch: you should add a timeout argument to stop() instead of hardcoding a timeout of 5 seconds. It is maybe safer to block until the thread exits by default (so timeout=None by default).

For the new method: it may be nice to document it. Having to import resource_sharer from multiprocessing.reduction is maybe not the best possible API :-/

+ from multiprocessing.reduction import resource_sharer
+ resource_sharer.stop()

> Also, applicative code could be affected.

What is the effect of the patch? For example, on CTRL+c?

I don't know the multiprocessing module nor this "resource sharer" thread. |
|
|
msg159323 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-25 17:25 |
> For the new method: it may be nice to document it. Having to import
> resource_sharer from multiprocessing.reduction is maybe not the best
> possible API :-/

resource_sharer is a private API, it's not meant to be used by anyone outside of the stdlib.

> What is the effect of the patch? For example, on CTRL+c?

Why should it have an effect on CTRL+c? Please explain yourself better.

> I don't know the multiprocessing module nor this "resource sharer"
> thread.

Time to learn about them perhaps :) |
|
|
msg159382 - (view) |
Author: Richard Oudkerk (sbt) *  |
Date: 2012-04-26 15:27 |
New patch which adds a timeout argument to ResourceSharer.stop(), defaulting to 0. When stop() fails it now reports this through the logger.

pthread_sigmask() only stops this background thread from receiving signals. Signals will still be delivered to other threads, so it should not have any effect on the handling of Ctrl-C. |
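The stop-with-timeout behaviour can be sketched roughly as follows. This is a toy stand-in, not multiprocessing's ResourceSharer: the class BackgroundServer, the logger name, and the use of a queue sentinel are all assumptions made for illustration. The default here is timeout=None (block until the thread exits), as suggested earlier in this thread, whereas the patch described above defaults to 0.

```python
import logging
import queue
import threading

logger = logging.getLogger("sharer_sketch")  # hypothetical logger name


class BackgroundServer:
    """Toy stand-in for a resource-sharing thread (illustrative only)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        while True:
            item = self._requests.get()
            if item is None:  # sentinel: the thread was asked to stop
                return

    def stop(self, timeout=None):
        """Ask the thread to exit; return True if it stopped in time."""
        self._requests.put(None)
        self._thread.join(timeout)
        if self._thread.is_alive():
            # Going through logging lets callers control verbosity.
            logger.warning("background thread did not stop within %r seconds",
                           timeout)
            return False
        return True


server = BackgroundServer()
stopped = server.stop(timeout=5)
```

Reporting the failure through a logger instead of writing to stderr directly leaves the decision about verbosity to whoever configures logging.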
|
|
msg159499 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-04-27 21:52 |
New changeset f163c4731c58 by Antoine Pitrou in branch 'default': Issue #14666: stop multiprocessing's resource-sharing thread after the tests are done. http://hg.python.org/cpython/rev/f163c4731c58 |
|
|
msg159526 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-04-28 14:38 |
This should have fixed it. If not, someone please reopen the issue :) |
|
|
msg159537 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-04-28 20:33 |
> This should have fixed it. If not, someone reopen the issue :)

Thanks! |
|
|