[Python-3000] Non-blocking I/O? (Draft PEP for New IO system)

Daniel Stutzbach daniel at stutzbachenterprises.com
Wed Mar 7 01:44:37 CET 2007


+1

I liked the original design at first, but after fleshing it out and seeing all of the corner cases raised here, the Buffered I/O interface for non-blocking would have to be so different that it would not make much sense to make it the same type.

The raw I/O layer semantics are unchanged, right? (.read() returns 0 bytes on EOF, returns None on an EAGAIN/EWOULDBLOCK condition, and raises IOError on any other problem.)
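A minimal sketch of those raw-layer semantics, assuming a POSIX file descriptor that has already been put into non-blocking mode. The helper name raw_read is hypothetical and is not part of the draft PEP; it only illustrates the three outcomes listed above.

```python
import errno
import os


def raw_read(fd, n):
    """Read up to n bytes from fd (hypothetical helper, not a PEP API).

    Returns b"" on EOF, None if the read would block, and lets any
    other OSError (IOError) propagate to the caller.
    """
    try:
        return os.read(fd, n)
    except OSError as e:
        # EAGAIN and EWOULDBLOCK may or may not be the same value,
        # depending on the platform, so check both.
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise
```

A caller can then tell "no data yet" (None) from EOF (an empty bytes object) without inspecting errno itself.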

On 3/6/07, Guido van Rossum <guido at python.org> wrote:

Reading this and all the other discussion on the proper semantics for non-blocking I/O, I think I may have overreached in trying to support non-blocking I/O at all levels of the new I/O stack. There probably aren't enough use cases for wanting to support readline() returning None if no full line of input is available yet to warrant the additional complexities -- and I haven't even looked very carefully at incremental codecs, which introduce another (small) buffer.

I think maybe a useful simplification would be to support special return values to capture EWOULDBLOCK (or equivalent) in the raw I/O interface only. I think it serves a purpose here, since without such support, code doing raw I/O would either require catching IOError all the time and inspecting it for EWOULDBLOCK (or other platform-specific values!), or not using the raw I/O interface at all, requiring yet another interface for raw non-blocking I/O.

The buffering layer could then raise IOError (or perhaps a special subclass of it) if the raw I/O layer ever returned one of these; e.g. if a buffered read needs to go to the raw layer to satisfy a request and the raw read returns None, then the buffered read needs to raise this error if no data has been taken out of the buffer yet; or it should return a short read if some data was already consumed (since it's hard to "unconsume" data, especially if the requested read length is larger than the buffer size, or if there's an incremental encoder involved). Thus, applications can assume that a short read means either EOF or non-blocking I/O; most apps can safely ignore the latter since it must be explicitly turned on by the app.

For writing, if the buffering layer receives a short write, it should try again; but if it receives an EWOULDBLOCK, it should likewise raise the above-mentioned error, since repeated attempts to write in this case would just end up spinning the CPU without making progress. (We should not raise an error if a single short write happens, since AFAIK this is possible for TCP sockets even in blocking mode, witness the addition of the sendall() method.)

This means that the buffering layer that sits directly on top of the raw layer must still be prepared to deal with the special return values from non-blocking I/O, but its API to the next layer up doesn't need special return values, since it turns these into IOErrors, and the next layer(s) up won't have to deal with it nor reflect it in their API.
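The buffering rules proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names WouldBlockError and BufferedSketch are hypothetical, and the raw object is assumed to follow the raw-layer convention (read() returns bytes, b"" on EOF, or None; write() returns a byte count or None).

```python
class WouldBlockError(IOError):
    """Hypothetical IOError subclass; the name is an assumption."""


class BufferedSketch:
    """Illustrative buffering layer over a raw object (not a PEP API)."""

    def __init__(self, raw, bufsize=8192):
        self.raw = raw
        self.bufsize = bufsize
        self.buf = b""  # read-ahead data not yet handed to the caller

    def read(self, n):
        # Serve as much as possible from the buffer first.
        out = self.buf[:n]
        self.buf = self.buf[n:]
        while len(out) < n:
            chunk = self.raw.read(max(n - len(out), self.bufsize))
            if chunk is None:
                # Raw layer would block: short read if data was already
                # consumed, otherwise raise the error.
                if out:
                    return out
                raise WouldBlockError("read would block")
            if not chunk:
                break  # EOF: return whatever has been collected
            need = n - len(out)
            out += chunk[:need]
            self.buf = chunk[need:]  # keep any surplus for later reads
        return out

    def write(self, data):
        view = memoryview(data)
        while len(view):
            written = self.raw.write(view)
            if written is None:
                # Retrying here would only spin the CPU, so raise.
                raise WouldBlockError("write would block")
            view = view[written:]  # short write: try again with the rest
```

Note that in this sketch a write that fails partway does not report how many bytes were already accepted; a real design would have to decide how to expose that.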
Would this satisfy the critics of the current design?

--Guido

On 3/4/07, Adam Olsen <rhamph at gmail.com> wrote:
> On 3/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > I'm having trouble seeing what the use case is for
> > the buffered non-blocking writes being discussed here.
> >
> > Doing asynchronous I/O usually doesn't involve
> > putting the file descriptor into non-blocking mode.
> > Instead you use select() or equivalent, and only
> > try to read or write when the file is reported as
> > being ready.
>
> I can't say which is more common, but non-blocking has a safer feel.
> Normal code would be select-driven in both, but if you screw up with
> non-blocking you get an error, whereas blocking you get a mysterious
> hang.
>
> accept() is the exception. It's possible for a connection to
> disappear between the time select() returns and the time you call
> accept(), so you need to be non-blocking to avoid hanging.
>
> >
> > For this to work properly, the select() needs to
> > operate at the bottom of the I/O stack. Any
> > buffering layers sit above that, with requests for
> > data propagating up the stack as the file becomes
> > ready.
> >
> > In other words, the whole thing has to have the
> > control flow inverted and work in "pull" mode
> > rather than "push" mode. It's hard to see how this
> > could fit into the model as a minor variation on
> > how writes are done.
>
> Meaning it needs to be a distinct interface and explicitly designed as such.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)



-- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC


