Issue 1488934: file.write + closed pipe = no error

Created on 2006-05-15 16:10 by edemaine, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
test.c edemaine,2006-05-15 16:10 C program illustrating fwrite behavior
blah.py edemaine,2006-07-02 12:35 Test case illustrating bug
Messages (12)
msg28534 - (view) Author: Erik Demaine (edemaine) Date: 2006-05-15 16:10
I am writing a Python script on Linux that gets called via ssh (ssh hostname script.py) and I would like it to know when its stdout gets closed because the ssh connection gets killed. I assumed that it would suffice to write to stdout, and that I would get an error if stdout was no longer connected to anything. This is not the case, however. I believe it is because of incorrect error checking in Objects/fileobject.c's file_write.

Consider this example:

    import time
    while True:
        print 'Hello'
        time.sleep (1)

If this program is run via ssh and then the ssh connection dies, the program continues running forever (or at least, over 10 hours). No exceptions are thrown.

In contrast, this example does die as soon as the ssh connection dies (within one second):

    import os, time
    while True:
        os.write (1, 'Hello')
        time.sleep (1)

I claim that this is because os.write does proper error checking, but file.write seems not to.

I was surprised to find this intricacy in fwrite(). Consider the attached C program, test.c. (Warning: If you run it, it will create a file /tmp/hello, and it will keep running until you kill it.) While the ssh connection remains open, fwrite() reports a length of 6 bytes written, ferror() reports no error, and errno remains 0. Once the ssh connection dies, fwrite() still reports a length of 6 bytes written (surprise!), but ferror(stdout) reports an error, and errno changes to 5 (EIO). So apparently one cannot tell from the return value of fwrite() alone whether the write actually succeeded; it seems necessary to call ferror() to determine whether the write caused an error.
I think the only change necessary is on line 2443 of file_write() in Objects/fileobject.c (in svn version 46003):

    2441    n2 = fwrite(s, 1, n, f->f_fp);
    2442    Py_END_ALLOW_THREADS
    2443    if (n2 != n) {
    2444        PyErr_SetFromErrno(PyExc_IOError);
    2445        clearerr(f->f_fp);

I am not totally sure whether the "n2 != n" condition should be changed to "n2 != n | ferror (f->f_fp)" or simply "ferror (f->f_fp)", but I believe that the condition should be changed to one of these possibilities. The current behavior is wrong.

Incidentally, you'll notice that the C code has to turn off signal SIGPIPE (like Python does) in order to not die right away. However, I could not get Python to die by re-enabling SIGPIPE. I tried "signal.signal (signal.SIGPIPE, signal.SIG_DFL)" and "signal.signal (signal.SIGPIPE, lambda x, y: sys.exit ())" and neither one caused death of the script when the ssh connection died. Perhaps I'm not using the signal module correctly?

I am on Linux 2.6.11 on a two-CPU Intel Pentium 4, and I am running the latest Subversion version of Python, but my guess is that this error transcends most if not all versions of Python.
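The difference between the os.write() loop and the print loop described in this report can be observed without ssh. The following is only a minimal sketch, assuming Python 3 on a POSIX system. It uses an os.pipe() whose read end has been closed as a stand-in for the dead ssh connection, so the error seen is EPIPE (BrokenPipeError) rather than the EIO reported above. The function name is illustrative and not part of any Python API.

```python
import os


def compare_raw_and_buffered_write():
    """Write to a pipe with no reader, once unbuffered and once buffered."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader is gone, standing in for a killed ssh client

    outcome = {}

    # os.write() hands the bytes straight to the kernel, so it fails at once.
    try:
        os.write(write_fd, b"Hello\n")
        outcome["os.write"] = "ok"
    except OSError as exc:
        outcome["os.write"] = type(exc).__name__

    # A file object only copies this small string into its own buffer.
    stream = os.fdopen(write_fd, "w")
    try:
        stream.write("Hello\n")
        outcome["file.write"] = "ok"
    except OSError as exc:
        outcome["file.write"] = type(exc).__name__

    # The error surfaces only when the buffer is pushed to the kernel.
    try:
        stream.flush()
        outcome["file.flush"] = "ok"
    except OSError as exc:
        outcome["file.flush"] = type(exc).__name__

    # close() retries the flush and fails again, so that error is ignored here.
    try:
        stream.close()
    except OSError:
        pass
    return outcome


if __name__ == "__main__":
    print(compare_raw_and_buffered_write())
```

Under these assumptions the buffered write itself reports success and only the flush reports the broken pipe, which matches the buffering explanation given later in this thread.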
msg28535 - (view) Author: Erik Demaine (edemaine) Date: 2006-05-15 16:26
One more thing: fwrite() is used in a couple of other places, and I think the same comment applies to them. They are:

- file_writelines() in Objects/fileobject.c
- w_string() in Python/marshal.c doesn't seem to have any error checking? (At least no ferror() call in marshal.c...)
- string_print() in Objects/stringobject.c doesn't seem to have any error checking (but I'm not quite sure what this means in Python land).
- flush_data() in Modules/_hotshot.c
- array_tofile() in Modules/arraymodule.c
- write_file() in Modules/cPickle.c
- putshort(), putlong(), writeheader(), writetab() [and the functions that call them] in Modules/rgbimgmodule.c
- svc_writefile() in Modules/svmodule.c
msg28536 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2006-06-03 20:16
I agree with your analysis, and think your suggested fixes are correct. However, I'm unable to construct a small test case that exercises this bug. I can't even replicate the problem with SSH; when I run a remote script with SSH and then kill SSH with Ctrl-C, the write() gets a -1. Are you terminating SSH in some other way? (I'd really like to have a test case for this problem before committing the fix.)
msg28537 - (view) Author: Erik Demaine (edemaine) Date: 2006-07-02 12:35
A simple test case is this Python script (fleshed out from previous example), also attached:

    import sys, time
    while True:
        print 'Hello'
        sys.stdout.flush ()
        time.sleep (1)

Save as blah.py on machine foo, run 'ssh foo python blah.py' on machine bar--you will see 'Hello' every second--then, in another shell on bar, kill the ssh process on bar. blah.py should still be running on foo. ('foo' and 'bar' can actually be the same machine.)

The example from the original bug report that uses os.write() instead of print was an example that *does* work.
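For an automated test case of the kind requested earlier in the thread, the manual ssh steps might be approximated with a local pipe. This is a sketch under stated assumptions: Python 3 on a POSIX system, with subprocess.PIPE standing in for the ssh connection, a shorter sleep than in blah.py, and a helper name that is purely hypothetical. It exercises only the case where the reader of a pipe goes away, not the EIO condition from the original report.

```python
import subprocess
import sys
import textwrap

# Child script modelled on blah.py, with a shorter sleep to keep the test quick.
CHILD_SOURCE = textwrap.dedent("""
    import sys, time
    while True:
        print('Hello')
        sys.stdout.flush()
        time.sleep(0.1)
""")


def child_exit_status_after_reader_closes(timeout=10):
    """Start the child, drop its stdout reader, and return its exit status.

    Returns None if the child is still running after `timeout` seconds,
    which is the behaviour the bug report describes.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    proc.stdout.readline()  # wait until the child has written once
    proc.stdout.close()     # the reader disappears, like a killed ssh process
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        return None


if __name__ == "__main__":
    print(child_exit_status_after_reader_closes())
```

A non-zero exit status indicates that the child noticed the closed pipe and stopped; None would indicate that it kept running.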
msg28538 - (view) Author: Erik Demaine (edemaine) Date: 2006-08-09 16:13
Just to clarify (as I reread your question): I'm killing the ssh via UNIX (or Cygwin) 'kill' command, not via CTRL-C. I didn't try, but it may be that CTRL-C works fine.
msg59630 - (view) Author: Ralf Schmitt (schmir) Date: 2008-01-09 22:29
The C program is broken as it does not check the error code of fflush. The real problem is buffering.

    while True:
        print 'Hello'
        time.sleep (1)

will not notice an error until the buffers are flushed. Running python t.py | head -n2 and killing head does not give me an error. With PYTHONUNBUFFERED=1 or when using sys.stdout.flush() the program breaks with:

    ~/ PYTHONUNBUFFERED=1 python t.py | head -n2        ralf@rat64 ok
    Hello
    Hello
    Traceback (most recent call last):
      File "t.py", line 5, in <module>
        print "Hello"
    IOError: [Errno 32] Broken pipe
msg59631 - (view) Author: Ralf Schmitt (schmir) Date: 2008-01-09 22:34
ahh.no. the c program does the fflush on the logfile...sorry.
msg126093 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-01-12 13:16
This is normal behaviour: stdout is normally line buffered (_IOLBF) only if connected to a tty. When it's not connected to a tty, it's fully buffered (_IOFBF). This is done on purpose for performance reasons. To convince yourself, run

    $ cat test.py
    for i in range(1, 1000000):
        print('hello world')
    $ time python test.py > /tmp/foo

With buffering off (-u option), the same command takes almost 10 times longer.

If the application wants to be sure to receive a SIGPIPE when the pipe's end is closed, it should just flush stdout explicitly (sys.stdout.flush()). Suggesting to close.
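The explicit-flush advice above could look roughly like the following in application code. This is a hedged sketch, assuming Python 3, where a write to a pipe without a reader surfaces as BrokenPipeError because the interpreter ignores SIGPIPE by default. The function name and its return value are illustrative choices, not an established API.

```python
import os


def write_lines_until_closed(stream, lines):
    """Write and flush each line; return how many were handed to the kernel
    before the other end of the pipe went away."""
    delivered = 0
    for line in lines:
        try:
            stream.write(line + "\n")
            stream.flush()  # without this, the error may be deferred for a long time
        except BrokenPipeError:
            break
        delivered += 1
    return delivered


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    out = os.fdopen(write_fd, "w")
    print(write_lines_until_closed(out, ["Hello", "Hello"]))  # reader still open
    os.close(read_fd)                                         # reader goes away
    print(write_lines_until_closed(out, ["Hello"]))
    try:
        out.close()  # close() retries the flush, so it may fail again
    except OSError:
        pass
```

Flushing after every line trades throughput for prompt error detection, which is the same trade-off as the -u timing comparison above.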
msg126109 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-12 16:10
Agreed with Charles-François, this is normal behaviour since the bytes written on stdout are buffered (up to a certain size). If calling flush() doesn't solve the issue, please reopen the issue.
msg126116 - (view) Author: Erik Demaine (edemaine) Date: 2011-01-12 17:36
shows a version with flush, and says that it fails. I haven't tested since 2006, though, so I can retry, in particular to see whether the patch suggested in the original post has been applied now.
msg126118 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-12 18:09
> shows a version with flush, and says that it fails

I cannot reproduce, either with Python 2.5.2 (!), 2.7 or 3.2, on a remote Debian system. Even using "kill -9" on the local ssh process does shut down the remote Python process. If I comment out the flush() call, it is clearly reproducible. I would suggest you did something wrong when testing the flush() version.
msg126119 - (view) Author: Erik Demaine (edemaine) Date: 2011-01-12 18:30
I just tested on Python 2.5.2, 2.6.2, and 3.0.1, and I could not reproduce the error (using the code in ). It would seem that file.flush is catching the problem, even though file.write is ignoring the error, but I can't see any changes since 1.5.2 that would have changed this behavior of file.flush. So I'm not sure what happened, but at least it seems to no longer be a bug. Closing.
History
Date User Action Args
2022-04-11 14:56:17 admin set github: 43362
2011-01-12 18:30:09 edemaine set status: pending -> closed; nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix; messages: +
2011-01-12 18:09:55 pitrou set status: open -> pending; nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix; messages: +; resolution: not a bug; stage: test needed ->
2011-01-12 17:36:02 edemaine set status: closed -> open; messages: +; resolution: not a bug -> (no value); nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix
2011-01-12 16:10:12 pitrou set status: open -> closed; messages: +; resolution: not a bug; nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix
2011-01-12 13:16:22 neologix set nosy: + pitrou, neologix; messages: +
2010-11-12 21:00:53 akuchling set assignee: akuchling ->
2010-08-03 22:51:14 terry.reedy set stage: test needed; versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 2.5
2009-10-24 16:10:58 naufraghi set nosy: + naufraghi; type: behavior
2009-10-07 18:23:12 forest_atq set nosy: + forest_atq; versions: + Python 2.6
2009-03-25 12:57:51 sascha_silbe set nosy: + sascha_silbe
2008-01-09 22:34:31 schmir set messages: +
2008-01-09 22:29:07 schmir set nosy: + schmir; messages: +
2006-05-15 16:10:06 edemaine create