Issue 12466: sporadic failures of test_close_fds and test_pass_fds in test_subprocess (original) (raw)
Issue12466
Created on 2011-07-01 12:56 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (17) | ||
---|---|---|
msg139575 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2011-07-01 12:56 |
====================================================================== FAIL: test_close_fds (test.test_subprocess.POSIXProcessTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_subprocess.py", line 1302, in test_close_fds "Some fds were left open") AssertionError: {9} is not false : Some fds were left open http://www.python.org/dev/buildbot/all/builders/x86%20Tiger%203.x/builds/2825/steps/test/logs/stdio test_pass_fds() is already skipped on Mac OS X Tiger because of an OS bug: see issue #12230. | ||
msg178889 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2013-01-03 01:47 |
Sporadic failure, I don't know if it was reproduced recently, and the issue has no activity since more than 6 months => I'm closing the issue. | ||
msg183756 - (view) | Author: Ismail Donmez (donmez) * | Date: 2013-03-08 19:29 |
We are now hitting this on openSUSE 12.3, here is the relevant log (ignore the timestamps) : [ 1041s] ====================================================================== [ 1041s] FAIL: test_close_fds (test.test_subprocess.POSIXProcessTestCase) [ 1041s] ---------------------------------------------------------------------- [ 1041s] Traceback (most recent call last): [ 1041s] File "/home/abuild/rpmbuild/BUILD/Python-3.Lib/test/test_subprocess.py", line 1533, in test_close_fds [ 1041s] "Some fds were left open") [ 1041s] AssertionError: {3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13} is not false : Some fds were left open [ 1041s] [ 1041s] ====================================================================== [ 1041s] FAIL: test_pass_fds (test.test_subprocess.POSIXProcessTestCase) [ 1041s] ---------------------------------------------------------------------- [ 1041s] Traceback (most recent call last): [ 1041s] File "/home/abuild/rpmbuild/BUILD/Python-3.Lib/test/support.py", line 457, in wrapper [ 1041s] return func(*args, **kw) [ 1041s] File "/home/abuild/rpmbuild/BUILD/Python-3.Lib/test/test_subprocess.py", line 1576, in test_pass_fds [ 1041s] "fd to be closed passed") [ 1041s] AssertionError: {4, 5, 6, 7, 8, 9, 10, 11, 12} is not false : fd to be closed passed This is Python 3.3.0 and Linux 3.7 on 64bit. | ||
msg184611 - (view) | Author: Ezio Melotti (ezio.melotti) * ![]() |
Date: 2013-03-19 07:14 |
Ismail, does it always fail or is it a sporadic failure? | ||
msg184625 - (view) | Author: Ismail Donmez (donmez) * | Date: 2013-03-19 10:05 |
I can reproduce the error locally. But I see something in the log might be related to this (possibly not but anyway): [352/368/1] test_multiprocessing test_multiprocessing skipped -- This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see issue 3770. This is weird because pyconfig.h has HAVE_SEM_OPEN set to 1. | ||
msg184735 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2013-03-20 04:24 |
fyi - sem_open is unrelated, nothing in subprocess needs that. | ||
msg184741 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2013-03-20 06:12 |
Some way for me to reproduce this is needed. | ||
msg184989 - (view) | Author: jan matejek (matejcik) * | Date: 2013-03-22 17:27 |
Ismail, the sem_open thing is because in autobuild you don't get /proc or somesuch - it's a problem of our buildsystem, otherwise the test passes. I can reproduce the failure in autobuild, but not separately yet. Will keep you posted, signs point to yet another misfeature of our buildsystem, but maybe it's something with test ordering... | ||
msg184991 - (view) | Author: Ismail Donmez (donmez) * | Date: 2013-03-22 17:37 |
Jan, thanks for the excellent explanation. We can close this bug I guess then? | ||
msg184994 - (view) | Author: jan matejek (matejcik) * | Date: 2013-03-22 17:50 |
Not just yet, please. I can now reproduce the test_subprocess failures outside autobuild too. Interestingly, it's the reverse here: test_subprocess passes when /proc is not present, but fails when it is. I'm pretty sure that we hijacked a completely different bug with a similar symptom :e) but whatever. I'll just go ahead and debug this. | ||
msg184999 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2013-03-22 19:29 |
fyi - subprocess uses /proc to get a list of open fd's in the child process before exec when possible. search for FD_DIR in http://hg.python.org/cpython/file/d674bbbed333/Modules/_posixsubprocess.c | ||
msg185386 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2013-03-27 22:16 |
Hummm... For those experiencing failures when /proc is mounted: do you have a grsecurity-patched kernel? If RBAC is enabled, /proc/self/fd is empty, so this approach won't work... | ||
msg185455 - (view) | Author: jan matejek (matejcik) * | Date: 2013-03-28 16:28 |
i don't think that's it, look at the result output in comment #183756: the list of fds is correct, except that they aren't closed as they should be. i can't reproduce the result when running code by hand, so it's either a race or an issue with test setup. so far i figured out that this goes away when patch from issue #16962 is reverted | ||
msg185457 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2013-03-28 16:52 |
> i don't think that's it, look at the result output in comment #183756: the list of fds is correct, except that they aren't closed as they should be. Of course the list is correct: the message is printed by the parent process, which opened the FDs ;-) > i can't reproduce the result when running code by hand, so it's either a race or an issue with test setup. If it's random, yes. One way to debug this would be to run the test with "strace -f". > so far i figured out that this goes away when patch from issue #16962 is reverted Note that the patch doesn't retry getdents64() on EINTR, and retries close() on EINTR while it's not advised, see this post by Linus: http://article.gmane.org/gmane.linux.kernel/330619 | ||
msg185460 - (view) | Author: jan matejek (matejcik) * | Date: 2013-03-28 18:53 |
so apparently the problem was that we're using the borked original patch for issue #16962, instead of the fixed version that's actually in repos. (The d_type field was missing, so every filename had an extra character prepended to it. AFAICT, all mysterious circumstances that I reported would trace back to workplace mess, like running the test in a terminal with a different build than i thought...) sorry for taking your time, folks, the bug can be closed now. | ||
msg185462 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2013-03-28 19:22 |
phew, yay. because i didn't see anything obviously wrong with the #16962 commit. also, regarding checking for EINTR on some of the close() calls. It isn't a big deal. This _posixsubprocess.c code that contains those checks is always executing single-threaded post-fork so there is no chance of a race condition even though the EINTR check and retry are pointless as Linus mentions. They will never cause errant behavior but I'll "fix" them in a future commit for consistency. | ||
msg185492 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2013-03-29 06:52 |
Greg, the original issue was about an OS X failure, and it's still a problem AFAICT. However it's definitely not a problem with your patch, but an OS X kernel bug (we've had another similar issue some time ago I think), so we might as well skip the offending tests on this OS X version. As for close() and EINTR, if you plan to do the change, it would probably be interesting to check if there are other such patterns in the stdlib. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:19 | admin | set | github: 56675 |
2013-03-29 06:52:41 | neologix | set | messages: + |
2013-03-28 19:22:46 | gregory.p.smith | set | status: open -> closedversions: + Python 3.4type: behaviormessages: + resolution: not a bug |
2013-03-28 18:53:12 | matejcik | set | messages: + |
2013-03-28 16:52:18 | neologix | set | messages: + |
2013-03-28 16:28:18 | matejcik | set | messages: + |
2013-03-27 22:16:38 | neologix | set | nosy: + neologixmessages: + |
2013-03-22 19:29:20 | gregory.p.smith | set | messages: + |
2013-03-22 17:50:42 | matejcik | set | messages: + |
2013-03-22 17:37:17 | donmez | set | messages: + |
2013-03-22 17:27:07 | matejcik | set | nosy: + matejcikmessages: + |
2013-03-20 06:12:19 | gregory.p.smith | set | messages: + stage: test needed |
2013-03-20 04:24:44 | gregory.p.smith | set | messages: + |
2013-03-19 10:05:03 | donmez | set | messages: + |
2013-03-19 08:09:24 | pitrou | set | nosy: + gregory.p.smith |
2013-03-19 07:14:43 | ezio.melotti | set | nosy: + ezio.melottimessages: + |
2013-03-09 04:21:59 | ned.deily | set | title: sporadic failures of test_close_fds() and test_pass_fds in test_subprocess -> sporadic failures of test_close_fds and test_pass_fds in test_subprocess |
2013-03-09 04:21:20 | ned.deily | set | status: closed -> opentitle: test_subprocess.test_close_fds() sporadic failures on Mac OS X Tiger -> sporadic failures of test_close_fds() and test_pass_fds in test_subprocessresolution: out of date -> (no value)versions: + Python 3.3 |
2013-03-08 19:29:34 | donmez | set | nosy: + donmezmessages: + |
2013-01-03 01:47:23 | vstinner | set | status: open -> closedresolution: out of datemessages: + |
2011-07-01 12:56:21 | vstinner | create |