msg94437 - (view) |
Author: Bob Cannon (zacktu) |
Date: 2009-10-24 22:08 |
I used csv.writer to open a file for writing with comma as separator and dialect='excel'. I used writerow to write each row into the file. When I execute under linux, each line is terminated by '\r\n'. When I execute under windows, each line is terminated by '\r\r\n'. Thus, under MS Windows, when I read the csv file, there is a blank line between each significant line. I have dropped csv.writer and now build each line manually and terminate it with '\n'. When the line is written in windows, it is terminated by '\r\n'. That's what should happen. As I see it, writerow with dialect='excel' should only terminate a line with '\n'. Windows will automatically place a '\r' in front of the '\n'. |
|
|
msg94438 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2009-10-24 22:52 |
Your output file should be opened in binary mode. Sounds like you opened it in text mode. |
|
|
msg94441 - (view) |
Author: Bob Cannon (zacktu) |
Date: 2009-10-24 23:10 |
Probably so. I'm sorry to report this as a bug if it's not. I asked about this on a Python group on IRC and got no suggestions. Thanks for taking a look. |
|
|
msg111694 - (view) |
Author: Andreas Balogh (baloan) |
Date: 2010-07-27 11:12 |
I encountered the same problem. It is not obvious that opening the file in binary mode solves the problem. I suggest adding a hint to the documentation. |
|
|
msg111773 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-07-28 08:07 |
Can you provide me with a concrete example which fails for you? I don't have ready access to a Windows machine with Python on it, but I should be able to arrange something at work. Before spending admin time to install Python, though, I would like to look at code which fails for you. |
|
|
msg111787 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2010-07-28 11:03 |
Bob, can you give us some code to reproduce the problem, in the form of a unit test or even just a regular function? It will help confirm the bug and fix it. |
|
|
msg111792 - (view) |
Author: Bob Cannon (zacktu) |
Date: 2010-07-28 11:46 |
Eric,

This issue was resolved for me by Skip Montanaro's response less than an hour after I posted it. I didn't understand why a text file had to be binary, but I no longer had a problem with extraneous newlines. In looking back at my message 94441, I think that it was ambiguous and that I should have made it clear that I no longer had a problem. Perhaps in my ignorance as a newbie I didn't close the issue properly.

I don't know that I can reproduce the problem any more. I think that I was writing snippets of code to try to isolate the problem, and when I used Skip's solution I changed the program and deleted the test code.

Please let me know what I can do to help you now.

Bob

Éric Araujo wrote:
> Éric Araujo <merwok@netwok.org> added the comment:
>
> Bob, can you give us some code to reproduce the problem, in the form of a unit test or even just a regular function? It will help confirm the bug and fix it.
>
> ----------
> nosy: +merwok
> stage: -> unit test needed
> title: csv.writer -> Extraneous newlines with csv.writer on Windows
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue7198>
> _______________________________________ |
|
|
msg111793 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2010-07-28 12:14 |
If the documentation is not clear enough about requiring binary, it is a doc bug. (P.S. Please strip unneeded quotes. Thanks) |
|
|
msg111832 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-07-28 17:19 |
I got access to Python 2.6.5 on Windows and ran this simple example:

Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
IDLE 2.6.5
>>> f = open("H:sample.csv", "wb")
>>> import csv
>>> writer = csv.writer(f)
>>> writer.writerow([1,2,3])
>>> writer.writerow(['a', 'b', 'c'])
>>> del writer
>>> f.close()
>>>

I then looked at the CSV file which it generated. Looked fine to me. Each of the two rows was terminated by a single CRLF pair. Then I repeated the "test", opening the file in text mode:

>>> f = open("H:sample2.csv", "w")
>>> writer = csv.writer(f)
>>> writer.writerow([1,2,3])
>>> writer.writerow(['a', 'b', 'c'])
>>> del writer
>>> f.close()
>>>

That output does indeed terminate each line with CRCRLF and, when viewed in a spreadsheet program such as OpenOffice Calc (probably Excel as well), displays a blank line between the 123 row and the abc row.

I've removed the "unit test needed" attribute from the ticket as there is a test_writerows test case in the Python test suite. Also closing again and marking invalid. If you still believe there is actually a problem, feel free to reopen this issue, but also please send me (skip@pobox.com) a short example and the erroneous output it produces for you (attach your two files - don't just embed them in your mail msg). |
|
|
msg111910 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-07-29 11:10 |
> If the documentation is not clear enough about requiring binary, it is
> a doc bug.

The documentation for both csv.reader and csv.writer states (this is from the Python 2.7 version):

    If *csvfile* is a file object, it must be opened with the 'b' flag on platforms where that makes a difference.

I suppose we could be explicit and mention Windows here, but the wording is quite clear. There is really no harm in always opening the file in binary mode, and I do that myself even though I only program on Unix or Mac platforms where it's safe to open the file in text mode.

This all changed in Python 3. There, the choice of line ending is up to the programmer, so file objects for use by the csv module are opened with newline='' and when writing CSV data the writer object takes complete control of proper line termination according to the programmer's stated choice of lineterminator.

Skip |
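The Python 3 behaviour described above can be sketched roughly as follows. This is only an illustration, not code from the csv documentation: the helper names write_csv and written_bytes and the file name sample.csv are made up for the example. With newline='' the file object does no newline translation, so the bytes on disk should end with exactly the lineterminator given to csv.writer.

```python
import csv
import os
import tempfile


def write_csv(path, rows, lineterminator="\r\n"):
    """Write rows to path; newline='' turns off newline translation."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        # The writer alone decides how each row is terminated.
        csv.writer(f, lineterminator=lineterminator).writerows(rows)


def written_bytes(rows, lineterminator="\r\n"):
    """Return the raw bytes that write_csv puts on disk (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        write_csv(path, rows, lineterminator)
        with open(path, "rb") as f:
            return f.read()


rows = [[1, 2, 3], ["a", "b", "c"]]
crlf_bytes = written_bytes(rows)                     # default 'excel' terminator
lf_bytes = written_bytes(rows, lineterminator="\n")  # programmer's own choice
```

Because the file is opened with newline='', the result should not depend on the platform the code runs on.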
|
|
msg124580 - (view) |
Author: John Machin (sjmachin) |
Date: 2010-12-24 00:52 |
Please re-open this. The binary/text mode problem still exists with Python 3.X on Windows. Quite simply, there is no option available to the caller to open the output file in binary mode, because the module is throwing str objects at the file. The module's idea of "taking control" in the default case appears to be to write \r\n which is then processed by the Windows runtime and becomes \r\r\n.

Python 3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> f = open('terminator31.csv', 'w')
>>> row = ['foo', None, 3.14159]
>>> writer = csv.writer(f)
>>> writer.writerow(row)
14
>>> writer.writerow(row)
14
>>> f.close()
>>> open('terminator31.csv', 'rb').read()
b'foo,,3.14159\r\r\nfoo,,3.14159\r\r\n'
>>>

And it's not just a row terminator problem; newlines embedded in fields are likewise expanded to \r\n by the Windows runtime. |
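The doubling reported above can be reproduced on any platform with a small sketch. The assumption here is that io.TextIOWrapper with newline='\r\n' stands in for a Windows text-mode file, since it translates every '\n' written to '\r\n'. The helper name csv_bytes is made up for this illustration.

```python
import csv
import io


def csv_bytes(rows, newline):
    """Write rows through a text layer and return the bytes that reach the file."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    csv.writer(text).writerows(rows)  # default dialect ends rows with '\r\n'
    text.flush()
    return raw.getvalue()


row = ["foo", None, 3.14159]

# newline='\r\n' emulates Windows text mode: each '\n' becomes '\r\n'.
translated = csv_bytes([row, row], newline="\r\n")

# newline='' passes the writer's output through untouched.
untranslated = csv_bytes([row, row], newline="")

# A newline embedded in a quoted field is translated too.
embedded = csv_bytes([["a\nb"]], newline="\r\n")
```

Under that assumption, translated matches the bytes in the session above, and embedded shows the second point: the '\n' inside the quoted field is also expanded to '\r\n'.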
|
|
msg124598 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-12-24 16:38 |
John,

The API for the open() builtin function has changed. You should open the output file with newline="" instead of using the default. Take a look at the documentation for open() and csv.reader:

http://docs.python.org/py3k/library/functions.html?highlight=open#open
http://docs.python.org/py3k/library/csv.html?highlight=csv.reader#csv.reader

Note the form of the open() call in the csv.reader example. This one snuck by me as well. Python 3 underwent a lot of change in the I/O subsystem. This was one of them.

If changing the form of the open() call doesn't fix the problem, let me know.

Skip |
|
|
msg124678 - (view) |
Author: John Machin (sjmachin) |
Date: 2010-12-26 20:52 |
Skip, I'm WRITING, not reading. Please read the 3.1 documentation for csv.writer. It does NOT mention newline='', and neither does the example. Please fix.

Other problems with the examples:
(1) They encourage a bad habit (open inside the call to reader/writer); good practice is to retain the reference to the file handle (preferably with a "with" statement) so that it can be closed properly.
(2) delimiter=' ' is very unrealistic.

The documentation for both 2.x and 3.x should be much more explicit about what is needed in open() for csv to work properly and portably:

2.x read: use mode='rb' -- otherwise fail on Windows
2.x write: use mode='wb' -- otherwise fail on Windows
3.x read: use newline='' -- otherwise fail unconditionally(?)
3.x write: use newline='' -- otherwise fail on Windows

The 2.7 documentation says """If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference""" ... in my experience, people are left asking "what platforms? what difference?"; Windows should be mentioned explicitly. |
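A rough sketch of the two 3.x rules in the list above, for a field that contains an embedded '\r\n'. The helper name round_trip and the file name roundtrip.csv are invented for the example. The file is always written with newline=''; only the read mode varies, which isolates what the reader loses when newline='' is left out.

```python
import csv
import os
import tempfile


def round_trip(rows, read_newline):
    """Write rows with newline='' and read them back using read_newline."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "roundtrip.csv")
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(rows)
        with open(path, "r", newline=read_newline, encoding="utf-8") as f:
            return list(csv.reader(f))


rows = [["line1\r\nline2", "x"]]

# newline='' on read: the embedded '\r\n' survives unchanged.
preserved = round_trip(rows, read_newline="")

# Default newline=None on read: universal newlines turn '\r\n' into '\n'
# before the csv reader sees it, so the field comes back altered.
altered = round_trip(rows, read_newline=None)
```

On the read side the difference shows up on every platform, not only Windows, which may be what the "unconditionally(?)" in the list refers to.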
|
|
msg124682 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2010-12-26 22:52 |
OK, I'm reopening this as a doc issue, since currently the Python3 writer docs do not mention newline='', and it is indeed required on Windows. John, would you care to suggest a doc patch? I agree with Skip that "where it makes a difference" is more precise than specifically mentioning Windows, even if less useful in this context. That is how the 'b' mode is documented in the open documentation. To fix the problem with the CSV docs, the recommendation to use 'b' can simply be made unconditional, as it is for newline='' in python3. |
|
|
msg126593 - (view) |
Author: John Machin (sjmachin) |
Date: 2011-01-20 06:43 |
"docpatch" for 3.x csv docs: In the csv.writer docs, insert the sentence "If csvfile is a file object, it should be opened with newline=''." immediately after the sentence "csvfile can be any object with a write() method." In the closely-following example, change the open call from "open('eggs.csv', 'w')" to "open('eggs.csv', 'w', newline='')". In section 13.1.5 Examples, there are 2 reader cases and 1 writer case that likewise need inserting ", newline=''" in the open call. |
|
|
msg131400 - (view) |
Author: John Machin (sjmachin) |
Date: 2011-03-19 07:55 |
Can somebody please review my "doc patch" submitted 2 months ago? |
|
|
msg131417 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2011-03-19 13:59 |
John> John Machin <sjmachin@lexicon.net> added the comment:
John> Can somebody please review my "doc patch" submitted 2 months ago?

My apologies. I have it in my sandbox, but a combination of the switch to Mercurial and lack of round tuits has conspired to keep me from checking it in.

Skip |
|
|
msg131418 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2011-03-19 14:06 |
Actually, I was thinking of another doc patch for the csv module. Your changes (or something very like them) made it into the 3.2 release, as you can see here: http://docs.python.org/py3k/library/csv.html S |
|
|
msg131443 - (view) |
Author: John Machin (sjmachin) |
Date: 2011-03-19 21:27 |
Skip, The changes that I suggested have NOT been made. Please re-read the doc page you pointed to. The "writer" paragraph does NOT mention that newline='' is required when writing. The "writer" examples do NOT include newline=''. The examples have NOT been enhanced by using a "with" statement and not using space as an example delimiter. PLEASE RE-OPEN THIS ISSUE. |
|
|
msg131468 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-03-20 02:36 |
New changeset ab27f16f707a by R David Murray in branch 'default':
#7198: add newlines='' to csv.writer docs.
http://hg.python.org/cpython/rev/ab27f16f707a

New changeset 959f666470cc by R David Murray in branch 'default':
Merge #7198 doc fix.
http://hg.python.org/cpython/rev/959f666470cc

New changeset 9d1b1a95bc8f by R David Murray in branch 'default':
Merge #7198 doc fix.
http://hg.python.org/cpython/rev/9d1b1a95bc8f |
|
|
msg131469 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-20 02:38 |
Fixed now. Thanks, and sorry for the delay, and the confusion. |
|
|
msg131475 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-20 04:02 |
Gah, I messed up the push. Now I have to backport the doc fix :( |
|
|
msg131495 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-03-20 14:30 |
New changeset 9201455f950b by R David Murray in branch '3.1':
#7198: really add newline='' to csv.writer docs.
http://hg.python.org/cpython/rev/9201455f950b

New changeset fa0563f3b7f7 by R David Murray in branch '3.2':
Really merge #7198
http://hg.python.org/cpython/rev/fa0563f3b7f7

New changeset ed0d1e07ce79 by R David Murray in branch 'default':
Dummy merge #7198
http://hg.python.org/cpython/rev/ed0d1e07ce79 |
|
|
msg131496 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-20 14:31 |
OK, now it's really done (I hope!). |
|
|
msg131498 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2011-03-20 14:55 |
John> Skip, The changes that I suggested have NOT been made. Please
John> re-read the doc page you pointed to. The "writer" paragraph does
John> NOT mention that newline='' is required when writing. The "writer"
John> examples do NOT include newline=''. The examples have NOT been
John> enhanced by using a "with" statement and not using space as an
John> example delimiter.

I copied the statement about using newline= from the reader() doc to the writer() doc. All the examples I see (I'm looking at the cpython repo - that is, what will be 3.3) use the with statement and open files using newline=''. I don't think more changes are necessary.

I will consult with other Python developers about merging these changes to other active branches. I simply don't understand the new Mercurial workflow well enough to do it properly.

Skip |
|
|
msg131501 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2011-03-20 15:40 |
New changeset 88876a264ebe by R David Murray in branch '3.1':
Markup fixes for #7198 patch.
http://hg.python.org/cpython/rev/88876a264ebe

New changeset d0d1235cb66e by R David Murray in branch '3.2':
Merge markup fixes for #7198 patch.
http://hg.python.org/cpython/rev/d0d1235cb66e

New changeset 2a8580f4897c by R David Murray in branch 'default':
Markup fixes for #7198 patch.
http://hg.python.org/cpython/rev/2a8580f4897c |
|
|