Issue 10956: file.write and file.read don't handle EINTR

In both Python 2 and Python 3, EINTR is not handled properly in the file.write and file.read methods.

------------------------- file.write -------------------------

In Python 2, file.write can write fewer bytes than requested, and when it is interrupted there is no way to tell how many bytes it actually wrote. Python 2 raises an IOError with EINTR, whereas Python 3 simply stops writing and returns the number of bytes written.

Here is the output of fwrite.py with Python 2.7 (see attached files). Note also the inconsistency: for the same EINTR, file.write raises IOError while os.read raises OSError:

$ python2.7 fwrite.py
Writing 100000 bytes, interrupt me with SIGQUIT (^\)
^\^\(3, <frame object at 0x9535ab4>)
Traceback (most recent call last):
  File "fwrite.py", line 16, in <module>
    print(write_file.write(b'a' * 100000))
IOError: [Errno 4] Interrupted system call
read 65536 bytes
^\(3, <frame object at 0x9535ab4>)
Traceback (most recent call last):
  File "fwrite.py", line 21, in <module>
    print('read %d bytes' % len(os.read(r, 100000)))
OSError: [Errno 4] Interrupted system call

Because os.read blocks on the second call to read, we know that only 65536 of the 100000 bytes were written.

------------------------- file.read -------------------------

When file.read is interrupted in Python 3, it may already have read bytes that then become inaccessible. Python 2 returns the bytes read so far, whereas Python 3 raises an IOError with EINTR.

A demonstration:

$ python3.2 fread.py
Writing 7 bytes
Reading 20 bytes... interrupt me with SIGQUIT (^\)
^\(3, <frame object at 0x8e1d2d4>)
Traceback (most recent call last):
  File "fread.py", line 18, in <module>
    print('Read %d bytes using file.read' % len(read_file.read(20)))
IOError: [Errno 4] Interrupted system call
Reading any remaining bytes...
^\(3, <frame object at 0x8e1d2d4>)
Traceback (most recent call last):
  File "fread.py", line 23, in <module>
    print('reading: %r' % os.read(r, 4096))
OSError: [Errno 4] Interrupted system call

Note how in Python 2 it stops reading when interrupted and it returns our bytes, but in Python 3 it raises IOError while there is no way to access the bytes that it read.

So basically, this behaviour is just plain wrong as EINTR is not an error, and this behaviour makes it impossible for the caller to handle the situation correctly.

Here is how I think Python should behave. It should be possible to interrupt both read and write calls, but it should also be possible for the user to handle these cases.

file.write, on EINTR, could decide to continue writing if no Python signal handler raised an exception. Analogously, file.read could decide to keep on reading on EINTR if no Python signal handler raised an exception.

This way, it is possible for the programmer to write interruptible code while at the same time having proper file.write and file.read behaviour in case code should not be interrupted. KeyboardInterrupt would still interrupt read and write calls, because it raises an exception. If the programmer decided that writes should finish before allowing such an exception, the programmer could replace the default signal handler for SIGINT.
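As a rough illustration of replacing the default SIGINT handler, here is a minimal sketch. The helper name deferred_sigint is hypothetical and not part of Python. It relies only on signal.signal from the standard library and must run in the main thread. The handler records the signal and returns normally, so under the proposed behaviour an interrupted write would be resumed rather than aborted; the KeyboardInterrupt is raised once the block has finished.

```python
import contextlib
import signal


@contextlib.contextmanager
def deferred_sigint():
    """Hold back Ctrl-C until the block exits (hypothetical helper)."""
    pending = []

    def remember(signum, frame):
        # Returning without raising means the interrupted call is not
        # aborted by this handler; we only note that SIGINT arrived.
        pending.append(signum)

    previous = signal.signal(signal.SIGINT, remember)
    try:
        yield
    finally:
        # Always restore whatever handler was installed before.
        signal.signal(signal.SIGINT, previous)
    # Reached only if the block finished without its own exception.
    if pending:
        raise KeyboardInterrupt
```

Usage would look like "with deferred_sigint(): f.write(data)"; a Ctrl-C during the write then surfaces as KeyboardInterrupt only after the with block ends.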

So, in pseudo-code:

bytes_written = 0

while bytes_written < len(buf):
    result = write(buf)
    
    if result < 0:
        if errno == EINTR:
            if PyErr_CheckSignals() < 0:
                /* Propagate exception from signal handler */
                return NULL
            continue
        else:
            PyErr_SetFromErrno(PyExc_IOError)
            return NULL
    
    buf += result
    bytes_written += result

return bytes_written

Similar code could be used for file.read with the obvious adjustments.
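For comparison, the same retry loop can be sketched at the Python level on top of os.write and os.read. The names write_all and read_exact are made up for this illustration. A Python signal handler that raises propagates out of the loop on its own, so the code only has to retry when EINTR is reported. Note that since Python 3.5 (PEP 475) os.write and os.read already retry on EINTR themselves, so the except branches matter only on older interpreters.

```python
import errno
import os


def write_all(fd, data):
    """Write all of data to fd, retrying on EINTR; return the byte count."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        try:
            written += os.write(fd, view[written:])
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
            # EINTR: fall through and retry. If a Python signal handler
            # raises, that exception leaves this loop by itself.
    return written


def read_exact(fd, size):
    """Read size bytes from fd, stopping early only at end of file."""
    chunks = []
    remaining = size
    while remaining > 0:
        try:
            chunk = os.read(fd, remaining)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
            continue  # interrupted before any data arrived: retry
        if not chunk:  # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Both helpers assume a blocking file descriptor; on a nonblocking one they would raise BlockingIOError instead of waiting.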

However, in case of an error (either from the write call or from a Python signal handler), it would still be unclear how many bytes were actually written. Maybe (I think this part would be bonus points) we could put the number of bytes written on the exception object in this case, or make it retrievable in some other thread-safe way.
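One way to picture the "number of bytes on the exception object" idea is the sketch below. The function name write_reporting_progress and the attribute name bytes_written are assumptions made for this example, not an existing API. (Python 3's BlockingIOError does carry a similar characters_written attribute for buffered nonblocking writes.) The count is attached to whatever exception escapes the loop, whether it comes from the write call or from a signal handler.

```python
import os


def write_reporting_progress(fd, data):
    """Write data to fd; on failure, record the progress on the exception."""
    view = memoryview(data)
    written = 0
    try:
        while written < len(view):
            written += os.write(fd, view[written:])
    except BaseException as exc:
        # Hypothetical attribute: tells the caller how many bytes were
        # written before the error or the signal handler's exception.
        exc.bytes_written = written
        raise
    return written
```

Because the value lives on the exception instance rather than in shared state, each caller sees only its own count, which keeps the approach thread-safe.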

For files whose file descriptors are in nonblocking mode (and maybe in other cases), file.write will still write fewer bytes than requested.

> file.write, on EINTR, could decide to continue writing if no Python signal handler raised an exception. Analogously, file.read could decide to keep on reading on EINTR if no Python signal handler raised an exception.

Ok. This would only be done in buffered mode, though, so your fwrite.py example would have to be changed slightly (drop the ",0" in fdopen()).
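To make the buffered versus unbuffered distinction concrete, here is a small sketch. The attached fwrite.py is not reproduced here, so the pipe setup is only an assumption about its general shape. In Python 3, passing 0 as the buffering argument of os.fdopen yields a raw io.FileIO object, while omitting it yields an io.BufferedWriter.

```python
import io
import os

# Assumed setup: a pipe whose write end is wrapped in a file object.
r, w = os.pipe()

# With ",0": unbuffered, each write() maps directly to one write(2) call.
unbuffered = os.fdopen(os.dup(w), "wb", 0)

# Without ",0": buffered, which is the mode the retry logic would apply to.
buffered = os.fdopen(w, "wb")

print(type(unbuffered).__name__, type(buffered).__name__)
```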