msg307851 - (view) |
Author: Licht Takeuchi (licht-t) * |
Date: 2017-12-08 14:43 |
Inconsistent behavior while reading a single-column CSV. I have a patch and am waiting for the CLA response.

# Case 1
## Input
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp)
w.writerow([''])
w.writerow(['1'])
fp.close()
```
## Output
```
""
1
```
# Case 2
## Input
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp)
w.writerow(['1'])
w.writerow([''])
fp.close()
```
## Output
```
1

```
|
|
|
msg307939 - (view) |
Author: Nitish (nitishch) * |
Date: 2017-12-10 03:18 |
Which scenario do you think is the wrong behaviour in this case: the first one or the second one? I don't know much about the csv module, but I thought it was a deliberate choice to quote all empty lines, and hence considered the second scenario buggy. But your pull request seems to fix the first case. Am I missing something here? |
|
|
msg307940 - (view) |
Author: Licht Takeuchi (licht-t) * |
Date: 2017-12-10 05:06 |
I think the first one is buggy, for two reasons.

1. Both are valid CSV. The double quoting is unnecessary. Some other applications, e.g. Excel, do not use the double quoting. Also, the current implementation quotes only if the string is '' and it is written on the first line.
2. '' is not quoted in the two-column case.

## Input:
```
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=None)
w.writerow(['', ''])
w.writerow(['3', 'a'])
fp.close()
```
## Output:
```
,
3,a
```
These seem inconsistent, and the quoting is unnecessary in this case.

# References
http://www.ietf.org/rfc/rfc4180.txt
|
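For context on the two-column case, here is a minimal in-memory sketch. Using `io.StringIO` in place of `test.csv` is an assumption made for illustration. It suggests why no quoting is needed there: the delimiter alone marks the row as having two fields, so the row should read back unchanged.

```python
import csv
import io

# In-memory buffer stands in for test.csv (illustrative assumption).
buf = io.StringIO(newline='')
w = csv.writer(buf)
w.writerow(['', ''])
w.writerow(['3', 'a'])
written = buf.getvalue()

# Parse the text back to see which rows the reader reconstructs.
rows = list(csv.reader(io.StringIO(written, newline='')))

print(repr(written))
print(rows)
```

The single-column case differs because a row with one empty field has no delimiter to mark it.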
|
|
msg307941 - (view) |
Author: Licht Takeuchi (licht-t) * |
Date: 2017-12-10 05:15 |
The current implementation does not quote in most cases. IOW, a patch that quotes every '' would be a breaking change (note that some applications do not use quoting). |
|
|
msg307984 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-12-10 20:29 |
The second case is indeed the bug, as can be seen by running the examples against python2.7. It looks like this was probably broken by 7901b48a1f89 from issue 23171. |
|
|
msg307986 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-12-10 20:31 |
Serhiy, since it was your patch that probably introduced this bug, can you take a look? Obviously it isn't a very high priority bug, since no one has reported a problem (even this issue isn't reporting the change in behavior as a *problem* :) |
|
|
msg307997 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-12-10 22:25 |
To restore the 3.4 behavior, a single empty field must be quoted. This makes it possible to distinguish a 1-element row containing a single empty field from an empty row. |
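The distinction can be sketched as a round trip through `csv.writer` and `csv.reader`. The helpers `write_row` and `read_rows` are hypothetical names introduced here, not part of the `csv` module, and the sketch assumes an interpreter where a single empty field is quoted.

```python
import csv
import io

def write_row(row):
    """Return the text csv.writer emits for one row (default dialect)."""
    buf = io.StringIO(newline='')
    csv.writer(buf).writerow(row)
    return buf.getvalue()

def read_rows(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline='')))

empty_row = write_row([])          # a row with zero fields
one_empty_field = write_row([''])  # a row with one field, the empty string

# Show the written text next to what the reader reconstructs from it.
print(repr(empty_row), read_rows(empty_row))
print(repr(one_empty_field), read_rows(one_empty_field))
```

If the single empty field were written unquoted, both rows would be a bare line terminator, and the reader could not tell them apart.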
|
|
msg308009 - (view) |
Author: Licht Takeuchi (licht-t) * |
Date: 2017-12-11 00:20 |
Thanks for your investigation! Would you mind if I create a new patch? |
|
|
msg308050 - (view) |
Author: Licht Takeuchi (licht-t) * |
Date: 2017-12-11 15:05 |
The PR is now updated to follow the Python 2.7 behavior! |
|
|
msg308102 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-12-12 09:57 |
New changeset 2001900b0c02a397d8cf1d776a7cc7fcb2a463e3 by Serhiy Storchaka (Licht Takeuchi) in branch 'master': bpo-32255: Always quote a single empty field when write into a CSV file. (#4769) https://github.com/python/cpython/commit/2001900b0c02a397d8cf1d776a7cc7fcb2a463e3 |
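One way to check whether an interpreter includes this change might be to rerun both cases from the original report in memory. `dump` is a hypothetical helper, not part of the `csv` module, and `io.StringIO` replaces the file used in the report. With the change applied, the empty field should be written as `""` regardless of row order.

```python
import csv
import io

def dump(rows):
    """Write rows with the default dialect and return the resulting text."""
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    return buf.getvalue()

case1 = dump([[''], ['1']])  # empty field on the first line
case2 = dump([['1'], ['']])  # empty field on the second line

# Both outputs should contain a quoted empty field on a fixed interpreter.
for name, text in (('case 1', case1), ('case 2', case2)):
    print(name, repr(text), '""' in text.split('\r\n'))
```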
|
|
msg308103 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-12-12 09:58 |
Thank you for your contribution Licht! |
|
|
msg308109 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-12-12 10:56 |
New changeset ce5a3cd9b15c9379753aefabd696bff11495cbbb by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6': bpo-32255: Always quote a single empty field when write into a CSV file. (GH-4769) (#4810) https://github.com/python/cpython/commit/ce5a3cd9b15c9379753aefabd696bff11495cbbb |
|
|