Issue 12466: sporadic failures of test_close_fds and test_pass_fds in test_subprocess (original) (raw)

Issue12466

Created on 2011-07-01 12:56 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (17)
msg139575 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-01 12:56
====================================================================== FAIL: test_close_fds (test.test_subprocess.POSIXProcessTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_subprocess.py", line 1302, in test_close_fds "Some fds were left open") AssertionError: {9} is not false : Some fds were left open http://www.python.org/dev/buildbot/all/builders/x86%20Tiger%203.x/builds/2825/steps/test/logs/stdio test_pass_fds() is already skipped on Mac OS X Tiger because of an OS bug: see issue #12230.
msg178889 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-01-03 01:47
Sporadic failure, I don't know if it was reproduced recently, and the issue has no activity since more than 6 months => I'm closing the issue.
msg183756 - (view) Author: Ismail Donmez (donmez) * Date: 2013-03-08 19:29
We are now hitting this on openSUSE 12.3, here is the relevant log (ignore the timestamps) : [ 1041s] ====================================================================== [ 1041s] FAIL: test_close_fds (test.test_subprocess.POSIXProcessTestCase) [ 1041s] ---------------------------------------------------------------------- [ 1041s] Traceback (most recent call last): [ 1041s] File "/home/abuild/rpmbuild/BUILD/Python-3.Lib/test/test_subprocess.py", line 1533, in test_close_fds [ 1041s] "Some fds were left open") [ 1041s] AssertionError: {3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13} is not false : Some fds were left open [ 1041s] [ 1041s] ====================================================================== [ 1041s] FAIL: test_pass_fds (test.test_subprocess.POSIXProcessTestCase) [ 1041s] ---------------------------------------------------------------------- [ 1041s] Traceback (most recent call last): [ 1041s] File "/home/abuild/rpmbuild/BUILD/Python-3.Lib/test/support.py", line 457, in wrapper [ 1041s] return func(*args, **kw) [ 1041s] File "/home/abuild/rpmbuild/BUILD/Python-3.Lib/test/test_subprocess.py", line 1576, in test_pass_fds [ 1041s] "fd to be closed passed") [ 1041s] AssertionError: {4, 5, 6, 7, 8, 9, 10, 11, 12} is not false : fd to be closed passed This is Python 3.3.0 and Linux 3.7 on 64bit.
msg184611 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-03-19 07:14
Ismail, does it always fail or is it a sporadic failure?
msg184625 - (view) Author: Ismail Donmez (donmez) * Date: 2013-03-19 10:05
I can reproduce the error locally. But I see something in the log might be related to this (possibly not but anyway): [352/368/1] test_multiprocessing test_multiprocessing skipped -- This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see issue 3770. This is weird because pyconfig.h has HAVE_SEM_OPEN set to 1.
msg184735 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-20 04:24
fyi - sem_open is unrelated, nothing in subprocess needs that.
msg184741 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-20 06:12
Some way for me to reproduce this is needed.
msg184989 - (view) Author: jan matejek (matejcik) * Date: 2013-03-22 17:27
Ismail, the sem_open thing is because in autobuild you don't get /proc or somesuch - it's a problem of our buildsystem, otherwise the test passes. I can reproduce the failure in autobuild, but not separately yet. Will keep you posted, signs point to yet another misfeature of our buildsystem, but maybe it's something with test ordering...
msg184991 - (view) Author: Ismail Donmez (donmez) * Date: 2013-03-22 17:37
Jan, thanks for the excellent explanation. We can close this bug I guess then?
msg184994 - (view) Author: jan matejek (matejcik) * Date: 2013-03-22 17:50
Not just yet, please. I can now reproduce the test_subprocess failures outside autobuild too. Interestingly, it's the reverse here: test_subprocess passes when /proc is not present, but fails when it is. I'm pretty sure that we hijacked a completely different bug with a similar symptom :e) but whatever. I'll just go ahead and debug this.
msg184999 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-22 19:29
fyi - subprocess uses /proc to get a list of open fd's in the child process before exec when possible. search for FD_DIR in http://hg.python.org/cpython/file/d674bbbed333/Modules/_posixsubprocess.c
msg185386 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-03-27 22:16
Hummm... For those experiencing failures when /proc is mounted: do you have a grsecurity-patched kernel? If RBAC is enabled, /proc/self/fd is empty, so this approach won't work...
msg185455 - (view) Author: jan matejek (matejcik) * Date: 2013-03-28 16:28
i don't think that's it, look at the result output in comment #183756: the list of fds is correct, except that they aren't closed as they should be. i can't reproduce the result when running code by hand, so it's either a race or an issue with test setup. so far i figured out that this goes away when patch from issue #16962 is reverted
msg185457 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-03-28 16:52
> i don't think that's it, look at the result output in comment #183756: the list of fds is correct, except that they aren't closed as they should be. Of course the list is correct: the message is printed by the parent process, which opened the FDs ;-) > i can't reproduce the result when running code by hand, so it's either a race or an issue with test setup. If it's random, yes. One way to debug this would be to run the test with "strace -f". > so far i figured out that this goes away when patch from issue #16962 is reverted Note that the patch doesn't retry getdents64() on EINTR, and retries close() on EINTR while it's not advised, see this post by Linus: http://article.gmane.org/gmane.linux.kernel/330619
msg185460 - (view) Author: jan matejek (matejcik) * Date: 2013-03-28 18:53
so apparently the problem was that we're using the borked original patch for issue #16962, instead of the fixed version that's actually in repos. (The d_type field was missing, so every filename had an extra character prepended to it. AFAICT, all mysterious circumstances that I reported would trace back to workplace mess, like running the test in a terminal with a different build than i thought...) sorry for taking your time, folks, the bug can be closed now.
msg185462 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-28 19:22
phew, yay. because i didn't see anything obviously wrong with the #16962 commit. also, regarding checking for EINTR on some of the close() calls. It isn't a big deal. This _posixsubprocess.c code that contains those checks is always executing single-threaded post-fork so there is no chance of a race condition even though the EINTR check and retry are pointless as Linus mentions. They will never cause errant behavior but I'll "fix" them in a future commit for consistency.
msg185492 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-03-29 06:52
Greg, the original issue was about an OS X failure, and it's still a problem AFAICT. However it's definitely not a problem with your patch, but an OS X kernel bug (we've had another similar issue some time ago I think), so we might as well skip the offending tests on this OS X version. As for close() and EINTR, if you plan to do the change, it would probably be interesting to check if there are other such patterns in the stdlib.
History
Date User Action Args
2022-04-11 14:57:19 admin set github: 56675
2013-03-29 06:52:41 neologix set messages: +
2013-03-28 19:22:46 gregory.p.smith set status: open -> closedversions: + Python 3.4type: behaviormessages: + resolution: not a bug
2013-03-28 18:53:12 matejcik set messages: +
2013-03-28 16:52:18 neologix set messages: +
2013-03-28 16:28:18 matejcik set messages: +
2013-03-27 22:16:38 neologix set nosy: + neologixmessages: +
2013-03-22 19:29:20 gregory.p.smith set messages: +
2013-03-22 17:50:42 matejcik set messages: +
2013-03-22 17:37:17 donmez set messages: +
2013-03-22 17:27:07 matejcik set nosy: + matejcikmessages: +
2013-03-20 06:12:19 gregory.p.smith set messages: + stage: test needed
2013-03-20 04:24:44 gregory.p.smith set messages: +
2013-03-19 10:05:03 donmez set messages: +
2013-03-19 08:09:24 pitrou set nosy: + gregory.p.smith
2013-03-19 07:14:43 ezio.melotti set nosy: + ezio.melottimessages: +
2013-03-09 04:21:59 ned.deily set title: sporadic failures of test_close_fds() and test_pass_fds in test_subprocess -> sporadic failures of test_close_fds and test_pass_fds in test_subprocess
2013-03-09 04:21:20 ned.deily set status: closed -> opentitle: test_subprocess.test_close_fds() sporadic failures on Mac OS X Tiger -> sporadic failures of test_close_fds() and test_pass_fds in test_subprocessresolution: out of date -> (no value)versions: + Python 3.3
2013-03-08 19:29:34 donmez set nosy: + donmezmessages: +
2013-01-03 01:47:23 vstinner set status: open -> closedresolution: out of datemessages: +
2011-07-01 12:56:21 vstinner create