msg57902 - (view) |
Author: Bill Fenner (fenner) |
Date: 2007-11-28 05:15 |
When a field has internal line breaks, e.g.,

foo,"bar
baz
biff",boo

that is actually 3 lines, but one csv-file row. csv.reader() converts this to ['foo', 'bar\nbaz\nbiff', 'boo']. This is reasonable behavior. Unfortunately, csv.writer() does not use the dialect's lineterminator setting for values with such internal line breaks. This means that the resulting file will have a mix of line-termination styles:

foo,"bar\n
baz\n
biff",boo\r\n

If the reading csv implementation is strict about its line termination, these line breaks will not be read properly. |
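The mixed terminators described above can be reproduced with a minimal Python 3 sketch. It assumes the default "excel" dialect (lineterminator "\r\n") and writes to an in-memory buffer rather than a file; the variable names are illustrative only.

```python
import csv
import io

# newline="" keeps the in-memory buffer from translating line endings.
buf = io.StringIO(newline="")

# Default "excel" dialect: records are terminated with "\r\n".
writer = csv.writer(buf)

# The middle field carries bare "\n" line breaks.
writer.writerow(["foo", "bar\nbaz\nbiff", "boo"])

output = buf.getvalue()
print(repr(output))
```

The embedded "\n" characters are written through unchanged inside the quoted field, while only the end of the record receives the dialect's "\r\n".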
|
|
msg57903 - (view) |
Author: Bill Fenner (fenner) |
Date: 2007-11-28 05:19 |
I realized that my description was not crystal clear: the file being read has \r\n line terminators. In the format that I used later, the input file is

foo,"bar\r\n
baz\r\n
biff",boo\r\n |
|
|
msg57904 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2007-11-28 05:33 |
release25-maint and trunk (2.6) appear to do the correct thing when tested on my Ubuntu Gutsy Linux x86 box. Test script and file attached. The problem is reproducible in a release24-maint build compiled 2007-11-05. |
|
|
msg57905 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2007-11-28 05:36 |
Attaching the test input file. Use od -x or similar to compare the new.csv output with .csv to see if the problem happened. It's 2.4, which may be old enough to be considered dead. |
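As an alternative to reading a hex dump, the line endings in a file's raw bytes can be counted directly. This is only a sketch: `terminator_report` is a hypothetical helper, not part of the attached test script, and the sample bytes stand in for the contents of an output file.

```python
def terminator_report(data: bytes) -> dict:
    """Count CRLF pairs and unpaired LF / CR bytes in raw file contents."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "bare_lf": data.count(b"\n") - crlf,  # LF not preceded by CR
        "bare_cr": data.count(b"\r") - crlf,  # CR not followed by LF
    }


# Sample bytes with mixed terminators (illustrative, not the attachment).
mixed = b'foo,"bar\nbaz\nbiff",boo\r\n'
report = terminator_report(mixed)
print(report)
```

To check a real file, read it in binary mode (for example `open(path, "rb").read()`) so that no newline translation happens before counting. A file with consistent CRLF terminators should report zero bare LF and zero bare CR bytes.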
|
|
msg87624 - (view) |
Author: Daniel Diniz (ajaksu2) *  |
Date: 2009-05-12 13:23 |
I get different behavior in py3k compared to trunk:

~/trunk-py$ ./python issue1511_py3k.py
[['foo', 'bar\r\nbaz\r\nbiff', 'boo']]
'foo,"bar\r\nbaz\r\nbiff",boo\r\n'
~/trunk-py$ ../py3k/python issue1511_py3k.py
[['foo', 'bar\nbaz\nbiff', 'boo']]
'foo,"bar\nbaz\nbiff",boo\n' |
|
|
msg87631 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2009-05-12 13:59 |
Daniel> Daniel Diniz <ajaksu@gmail.com> added the comment:
Daniel>
Daniel> I get different behavior in py3k compared to trunk:
Daniel>
Daniel> ~/trunk-py$ ./python issue1511_py3k.py
Daniel> [['foo', 'bar\r\nbaz\r\nbiff', 'boo']]
Daniel> 'foo,"bar\r\nbaz\r\nbiff",boo\r\n'
Daniel> ~/trunk-py$ ../py3k/python issue1511_py3k.py
Daniel> [['foo', 'bar\nbaz\nbiff', 'boo']]
Daniel> 'foo,"bar\nbaz\nbiff",boo\n'

Try adding newline='' to your open calls. I believe that will preserve the CRLF pairs.

Skip |
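A minimal Python 3 sketch of the newline='' suggestion follows. It is not the attached issue1511_py3k.py script; the temporary file names and variable names are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Input with CRLF terminators, both inside the quoted field and at row end.
data = b'foo,"bar\r\nbaz\r\nbiff",boo\r\n'

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.csv")    # hypothetical file names
    dst = os.path.join(tmp, "out.csv")
    with open(src, "wb") as f:
        f.write(data)

    # Default text mode: "\r\n" is translated to "\n" before csv sees it.
    with open(src) as f:
        translated_rows = list(csv.reader(f))

    # newline="" disables translation, so the CRLF pairs reach the reader.
    with open(src, newline="") as f:
        rows = list(csv.reader(f))

    # Writing with newline="" likewise leaves the terminators untouched.
    with open(dst, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    with open(dst, "rb") as f:
        round_trip = f.read()

print(translated_rows)
print(rows)
print(round_trip == data)
```

Without newline='', the file object's universal-newline handling rewrites the embedded CRLF pairs before the csv module processes them, which matches the py3k output quoted above.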
|
|
msg87632 - (view) |
Author: Daniel Diniz (ajaksu2) *  |
Date: 2009-05-12 14:08 |
You're right, sorry about the noise. Closing as out of date. |
|
|