[Python-3000] io library/PEP 3116 bits
Guido van Rossum guido at python.org
Mon Jul 30 19:20:50 CEST 2007
On 7/30/07, skip at pobox.com <skip at pobox.com> wrote:
> I was looking at PEP 3116 to try and figure out what the newline keyword argument was for (it was mentioned in a couple replies to some checkin comments and I see it in io.py). It's not really mentioned in the PEP as far as I could tell other than this:
>
>     Some new features include universal newlines and character set encoding and decoding.
>
> The io.open() docstring has this to say:
>
>     newline: optional newlines specifier; must be None, '\n' or '\r\n'; specifies the line ending expected on input and written on output. If None, use universal newlines on input and use os.linesep on output.
>
> Shouldn't '\r' be provided as an option for Macs? Also, shouldn't the "U" mode flag be discarded (2to3 could maybe do this)? Is this particular bit of backwards compatibility all that necessary?
I don't think \r needs to be supported -- OSX uses \n; Python 3.0 isn't going to be ported to MacOS 9. We discussed this before; I promised I'd add \r support if anyone can find a current use case for it. So far none have been reported.
Regarding dropping 'U': agreed. But since the fixer hasn't been written yet it hasn't been dropped yet. We need help for little niggling details like this!
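As an illustration of the newline argument discussed above, here is a minimal sketch. It assumes the present-day Python 3 io module rather than the 2007 io.py, and uses io.TextIOWrapper over an in-memory buffer instead of a real file; the sample data and the helper names read_text and write_text are made up for this example.

```python
import io

# Mixed line endings, invented for this sketch.
RAW = b"one\r\ntwo\nthree\rfour"

def read_text(newline):
    """Decode RAW through a text wrapper using the given newline setting."""
    wrapper = io.TextIOWrapper(io.BytesIO(RAW), encoding="ascii", newline=newline)
    return wrapper.read()

def write_text(text, newline):
    """Encode text through a text wrapper and return the bytes written."""
    buf = io.BytesIO()
    wrapper = io.TextIOWrapper(buf, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    return buf.getvalue()

# newline=None: universal newlines on input, every ending becomes "\n".
universal = read_text(None)

# newline="\n": input is passed through untranslated.
untranslated = read_text("\n")

# On output, "\n" in the text is written as the requested line ending.
crlf_out = write_text("a\nb\n", "\r\n")
lf_out = write_text("a\nb\n", "\n")

print(repr(universal))
print(repr(untranslated))
print(crlf_out, lf_out)
```

With newline=None on output the wrapper would write os.linesep instead, so that case is left out to keep the result platform-independent.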
> The other thing I wanted to comment on is the default value for n in the various read methods. In some places it's -1 (why not zero? *), but in other places it's None, with presumably the same meaning. Shouldn't this be consistent across all read methods? The couple read methods mentioned in PEP 3116 only mention n=-1 as a default.
>
> Skip
>
> (*) A few days ago at work I saw someone check in a piece of code with
>
>     f.read(-1)
>
> That looked so strange to me I had to look up its meaning. I don't think I had ever seen someone explicitly call read with a -1 arg.
read(0) means to read zero bytes. It always returns an empty string (or byte array). There are plenty of end cases where this is useful.
read(), read(None) and read(-1) are all synonyms, meaning "read until EOF". The reason there are three spellings is mostly historic: there are so many different file-like objects, and not all of them implemented this consistently. Since the argument is an integer, -1 is the easiest default to use; but since some classes used None as the default instead, some people started passing None, and then the need was born to support both.
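The read() spellings above can be checked with a short sketch. It assumes a present-day io.BytesIO as the file-like object; the sample data and the helper name read_with are hypothetical.

```python
import io

DATA = b"spam and eggs"  # sample payload, made up for this sketch

def read_with(*args):
    """Open a fresh in-memory stream and call read() with the given arguments."""
    return io.BytesIO(DATA).read(*args)

# The three "read until EOF" spellings.
all_three = [read_with(), read_with(None), read_with(-1)]

# read(0) reads zero bytes: it returns an empty result and does not move
# the stream position, so a later read still sees everything.
stream = io.BytesIO(DATA)
empty = stream.read(0)
position = stream.tell()
rest = stream.read()

print(all_three, empty, position, rest)
```

One end case where read(0) is handy is a length-prefixed record whose length happens to be zero: the caller can pass the length straight to read() without special-casing it.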
Arguably this was a bad idea, and we should add a new API readall() (one of the implementations already has this, and read(-1) calls it). Then the 2to3 fixer will have to recognize this. I welcome patches!
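For comparison, the raw layer of the present-day io module does provide readall(), and io.RawIOBase.read(-1) delegates to it. The sketch below assumes current Python 3 behaviour, not the 2007 implementation; ChunkedRaw and its readall_calls counter are hypothetical names used only to make the delegation visible.

```python
import io

class ChunkedRaw(io.RawIOBase):
    """Hypothetical raw stream that hands out data a few bytes at a time."""

    def __init__(self, data, chunk=3):
        super().__init__()
        self._data = data
        self._pos = 0
        self._chunk = chunk
        self.readall_calls = 0  # counts how often readall() is invoked

    def readable(self):
        return True

    def readinto(self, b):
        # Copy at most `chunk` bytes into the caller's buffer; 0 means EOF.
        n = min(len(b), self._chunk, len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n

    def readall(self):
        # Record the call, then fall back to the inherited implementation,
        # which keeps reading chunks until EOF.
        self.readall_calls += 1
        return super().readall()

raw = ChunkedRaw(b"hello world")
everything = raw.read(-1)  # routed through readall()
print(everything, raw.readall_calls)
```

An explicit readall() gives "read until EOF" its own name, so callers no longer need a sentinel value such as -1 or None.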
But right now, getting the number of failing unit tests in the py3k-struni branch down to zero is more important. To help, see http://wiki.python.org/moin/Py3kStrUniTests.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)