Issue 10954: csv.reader/writer to raise exception if mode is binary or newline is not ''

Created on 2011-01-20 10:32 by lregebro, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
csv.rst.diff skip.montanaro, 2011-01-23 22:19
Messages (17)
msg126596 - (view) Author: Lennart Regebro (lregebro) Date: 2011-01-20 10:32
In Python 2 the file used for csv.writer() should be opened in binary mode, while in Python 3 it should be opened in text mode but with newline set to ''. This change is neither warned about by python -3, nor is there a fixer for it (and making a fixer would be tricky), thus it provides a surprising API change. I think that csv.writer() should warn or even fail if the file is opened in binary mode under Python 3. Failing is a good option, as a binary file is likely to be a port from Python 2, and you are likely to get the less useful message "must be bytes or buffer, not str". Even if you understand that message, you will then probably just change the file mode from binary to text, but you will not add the newline='' parameter, and thus you might cause subtle errors on Windows.
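The two idioms described in this report can be sketched as follows. This is a minimal illustration only: the file name `example.csv` and the temporary directory are assumptions made for the example, and the exact wording of the TypeError varies between Python versions, so it is not shown.

```python
import csv
import io
import os
import tempfile

# Python 3 idiom: text mode with newline='' so the csv module alone
# decides which line terminator reaches the file.
path = os.path.join(tempfile.mkdtemp(), "example.csv")  # hypothetical file name
with open(path, "w", newline="") as f:
    csv.writer(f).writerow(["a", "b"])

with open(path, "rb") as f:
    raw = f.read()  # the writer's default '\r\n' terminator, untranslated

# Python 2 idiom carried over unchanged: a binary stream is rejected with a
# TypeError whose text mentions neither csv nor newline=''.
try:
    csv.writer(io.BytesIO()).writerow(["a", "b"])
    binary_error = None
except TypeError as exc:
    binary_error = exc
```

Here `raw` holds the bytes actually written, and `binary_error` holds the exception raised for the binary stream.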
msg126600 - (view) Author: John Machin (sjmachin) Date: 2011-01-20 12:18
I believe that both csv.reader and csv.writer should fail with a meaningful message if mode is binary or newline is not ''
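A rough sketch of what such a check could look like for the binary-mode half of the request. `checked_writer` is a hypothetical helper written for illustration; it is not part of the csv module and not a proposed patch. It deliberately does not try to verify the newline setting, since this sketch assumes no reliable way to read that setting back from an arbitrary text stream.

```python
import csv
import io


def checked_writer(fileobj, *args, **kwargs):
    """Hypothetical wrapper around csv.writer (not a csv module API).

    Rejects binary streams up front with a message that names the fix.
    """
    # Files opened with 'wb' are buffered binary streams; unbuffered
    # binary files are raw streams. io.BytesIO is also a buffered stream.
    if isinstance(fileobj, (io.RawIOBase, io.BufferedIOBase)):
        raise TypeError(
            "csv.writer needs a text-mode file; open it with "
            "mode 'w' and newline='' instead of binary mode"
        )
    return csv.writer(fileobj, *args, **kwargs)


# Usage: a text stream passes through, a binary stream fails early.
text_buf = io.StringIO()
checked_writer(text_buf).writerow(["a", "b"])

try:
    checked_writer(io.BytesIO())
    helpful_message = None
except TypeError as exc:
    helpful_message = str(exc)
```

Duck-typed file-like objects that do not derive from the `io` base classes would pass this check unexamined, which is one reason to treat it as a sketch rather than a complete answer.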
msg126809 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-01-22 01:04
Failing when passed a bytesIO object seems reasonable. I question the bit about newlines though. The doc does not specify that newlines='' is needed on output. While it says it is needed for input, why? Why is a mix of '\n', '\r\n', and '\r' better than always '\n'? It is not clear to me that the information is available to csv anyway.
msg126810 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-01-22 01:15
Changing csv api is a feature request that could only happen in 3.3.
msg126812 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-22 02:55
Newline='' is indeed needed. It preserves the newlines so that the csv module can correctly parse them according to the weird csv quoting rules. And for output, the fact that it isn't documented there is an issue that was only noticed recently.
msg126873 - (view) Author: John Machin (sjmachin) Date: 2011-01-23 05:46
I don't understand "Changing csv api is a feature request that could only happen in 3.3". This is NOT a request for an API change. Lennart's point is that an API change was made in 3.0 as compared with 2.6 but there is no fixer in 2to3. What is requested is for csv.reader/writer to give more meaningful error messages for valid 2.x code that has been put through fixer-less 2to3.
The name of the arg is "newline". "newlines" is an attribute that stores what was actually found in "universal newlines" mode.
newline='' is needed on input for the same reason that binary mode is required in 2.x: \r and \n may quite validly appear in data, inside a quoted field, and must not be treated as part of a row separator.
newline='' is needed on output for the same reason that binary mode is required in 2.x: any \n in the data and any \n in the caller's chosen "line" terminator must be preserved from being changed to os.linesep (e.g. \r\n).
"newline" is not available as an attribute of the _io.TextIOWrapper object created by open('xxx.csv', 'w', newline=''); is exposing this possible?
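Both effects can be reproduced without touching the file system. In the sketch below, `io.TextIOWrapper` over `io.BytesIO` stands in for `open(..., newline=...)`, and `newline="\r\n"` on the write side is an assumption used to mimic the default translation on Windows in a platform-independent way. The sample data and helper names are illustrative.

```python
import csv
import io

# One header row, then a row whose quoted field contains a bare '\r'.
data = b'id,note\r\n1,"first\rsecond"\r\n'


def read_rows(newline):
    # Stand-in for open(path, newline=newline) on a file holding `data`.
    stream = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return list(csv.reader(stream))


rows_preserved = read_rows("")     # newline='': embedded '\r' reaches csv intact
rows_translated = read_rows(None)  # universal newlines: '\r' becomes '\n' first


def written_bytes(newline):
    # Stand-in for open(path, 'w', newline=newline).
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    csv.writer(stream).writerow(["a", "b"])  # default lineterminator is '\r\n'
    stream.flush()
    return raw.getvalue()


out_preserved = written_bytes("")       # terminator written as-is
out_translated = written_bytes("\r\n")  # each '\n' expanded again: '\r\r\n'
```

On input the row count is the same either way, which is what makes the problem subtle: only the field content changes. On output the extra `\r` is the doubled line ending that appears when `newline=''` is left out on Windows.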
msg126876 - (view) Author: Lennart Regebro (lregebro) Date: 2011-01-23 07:09
In the worst case, not checking for newline='' is not a big problem, as anyone moving from Python 2 to Python 3 will open the file in binary mode. That error message could tell the user to use text mode with newline=''. Using text mode without newline='' is only likely to happen with people writing new code for Python 3 and not reading the docs, which is a different problem. :-)
msg126893 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-23 16:06
The API change would be generating an error if newline='' wasn't specified. Amplifying the bytes-case error message would be fine, though. On the other hand, we are in RC phase, and I'm not at all sure this is important enough to go in to an RC. On the gripping hand it is also trivial enough that Georg might approve it.
msg126898 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-23 19:19
Can we have a concrete proposal in the form of a patch, please?
msg126903 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2011-01-23 22:17
Looking at the csv.rst file I see this statement early in the py3k docs:
If *csvfile* is a file object, it should be opened with ``newline=''``.
There is also a footnote about the consequences of leaving it out:
.. [#] If ``newline=''`` is not specified, newlines embedded inside quoted fields will not be interpreted correctly. It should always be safe to specify ``newline=''``, since the csv module does its own universal newline handling on input.
Finally, the examples all use "newline=''". I see two things to change in the docs:
* Replace "should" with "must" in the first quoted sentence above.
* Add that sentence to the documentation for the csv.writer() function.
msg126904 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2011-01-23 22:19
My suggestion attached.
msg126906 - (view) Author: John Machin (sjmachin) Date: 2011-01-23 23:00
Skip, the docs bug is #7198. This is the meaningful-exception bug.
msg131448 - (view) Author: John Machin (sjmachin) Date: 2011-03-19 21:55
The doc patch proposed by Skip on 2011-01-24 for this bug has NOT been reviewed, let alone applied. Sibling bug #7198 has been closed in error. Somebody please help.
msg131452 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-03-19 22:28
Since this is not a doc issue, doc people would not especially see it. That aside... What is *your* review? Does it satisfy you? Answer on #7198 if you want. And please be a bit patient as people are learning the new hg system.
msg131459 - (view) Author: John Machin (sjmachin) Date: 2011-03-19 23:03
Terry, I have already made the point """the docs bug is #7198. This is the meaningful-exception bug.""" My review is """changing 'should' to 'must' is not very useful without a consistent interpretation of what those two words mean and without any enforcement of use of newline=''.""" I was patient enough to wait 2 months for a review of my "doc patch" on #7198. My issues are that the 3.2 docs have NOT been changed (have a look at the csv.writer paragraph: do you see the word "newline" anywhere??), #7198 has been closed without any action, and BOTH of these two issues (which have in effect been lurking about since Python 3.0.0alpha) appear to have been abandoned.
msg236546 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-02-24 21:18
I've changed this issue to reflect what I think it should be saying.
msg236555 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-24 23:43
I suspect it may not be practical to check the newline translation mode of a TextIOWrapper or StringIO stream, and I don’t think newline translation is even required in general for text stream classes. Beware that the “newlines” attribute isn’t going to help; see Issue 8895.
History
Date User Action Args
2022-04-11 14:57:11 admin set github: 55163
2021-10-20 23:10:57 iritkatriel set status: open -> closed; superseder: Close 2to3 issues and list them here; resolution: wont fix; stage: resolved
2019-03-15 22:44:13 BreamoreBoy set nosy: - BreamoreBoy
2015-02-24 23:43:05 martin.panter set nosy: + martin.panter; messages: +
2015-02-24 23:15:44 skip.montanaro set nosy: - skip.montanaro
2015-02-24 21:18:30 BreamoreBoy set versions: + Python 3.4, Python 3.5, - Python 3.2; nosy: + BreamoreBoy; title: No warning for csv.writer API change -> csv.reader/writer to raise exception if mode is binary or newline is not ''; messages: +; type: enhancement -> behavior
2011-03-19 23:03:54 sjmachin set nosy: skip.montanaro, georg.brandl, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +
2011-03-19 22:28:32 terry.reedy set nosy: skip.montanaro, georg.brandl, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +
2011-03-19 21:55:54 sjmachin set nosy: + skip.montanaro; messages: +
2011-03-19 19:09:43 skip.montanaro set nosy: - skip.montanaro
2011-01-23 23:00:17 sjmachin set nosy: skip.montanaro, georg.brandl, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +
2011-01-23 22:19:39 skip.montanaro set files: + csv.rst.diff; messages: +; keywords: + patch; nosy: skip.montanaro, georg.brandl, terry.reedy, sjmachin, lregebro, r.david.murray
2011-01-23 22:17:33 skip.montanaro set nosy: skip.montanaro, georg.brandl, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +
2011-01-23 19:19:11 georg.brandl set nosy: skip.montanaro, georg.brandl, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +
2011-01-23 16:06:28 r.david.murray set nosy: + georg.brandl; messages: +
2011-01-23 07:09:44 lregebro set nosy: skip.montanaro, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +
2011-01-23 05:46:50 sjmachin set nosy: skip.montanaro, terry.reedy, sjmachin, lregebro, r.david.murray; messages: +; versions: + Python 3.2, - Python 3.3
2011-01-22 02:55:53 r.david.murray set nosy: + r.david.murray; messages: +
2011-01-22 01:15:13 terry.reedy set versions: + Python 3.3, - Python 3.1, Python 2.7, Python 3.2; nosy: + skip.montanaro; messages: +; type: behavior -> enhancement
2011-01-22 01:04:38 terry.reedy set nosy: + terry.reedy; messages: +; versions: - Python 2.6
2011-01-20 12:18:29 sjmachin set messages: +
2011-01-20 12:05:00 lregebro set nosy: + sjmachin
2011-01-20 10:32:25 lregebro create