[Python-Dev] CSV, bytes and encodings
Antoine Pitrou solipsis at pitrou.net
Wed Apr 1 12:07:15 CEST 2009
- Previous message: [Python-Dev] 3.1a2
- Next message: [Python-Dev] CSV, bytes and encodings
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
R. David Murray <rdmurray at bitdance.com> writes:
> Having read through the ticket, it seems that a CSV file must be (and 2.6 was) treated as a binary file, and part of the CSV module's job is to convert that binary data to and from strings.
IMO this interpretation is flawed. In 2.6 there is no tangible difference between "binary" and "text" files, except for newline handling. Also, as a matter of fact, if you want the 2.x CSV module to read a file with Windows line endings, you have to open the file in "rU" mode (that is, the closest we have to a moral equivalent of the 3.x text files).
Therefore, I don't think 2.x is of any guidance to us for what 3.x should do.
I see three possible practical cases that, ideally, the 3.x CSV module should be able to handle:
1. be handed a binary file (yielding bytes) without an encoding: in this case, the CSV module should return lists of bytes objects
2. be handed a text file (yielding str) without an encoding: in this case, the CSV module should return lists of str objects
3. be handed a binary file (yielding bytes) with an encoding: in this case, the CSV module should also return lists of str objects
I think 2 and 3 both /should/ be supported (for 3, it's probably enough to wrap the binary file in a TextIOWrapper ;-)). 1 would be convenient too, but perhaps more work than it deserves (since it means the CSV module must be able to deal internally with two different datatypes: bytes and str).
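For illustration, here is a minimal sketch of the text-file case and the binary-file-with-encoding case, using in-memory streams in place of real files. The sample data and variable names are made up for the example, and UTF-8 is an assumed encoding; the only point is that wrapping the binary stream in a TextIOWrapper gives the csv module the same str rows as a text file would.

```python
import csv
import io

# Hypothetical sample data: a small CSV encoded as UTF-8 bytes.
raw = "name,city\r\nZoë,Kraków\r\n".encode("utf-8")

# Text file (yielding str), no encoding passed to csv.
# newline="" leaves line endings untouched so csv can handle them itself.
text_file = io.StringIO(raw.decode("utf-8"), newline="")
rows_from_text = list(csv.reader(text_file))

# Binary file (yielding bytes) plus an encoding: wrap it in a
# TextIOWrapper so that csv.reader only ever sees str.
binary_file = io.BytesIO(raw)
wrapped = io.TextIOWrapper(binary_file, encoding="utf-8", newline="")
rows_from_binary = list(csv.reader(wrapped))

# Both paths produce lists of str objects.
print(rows_from_text)
print(rows_from_binary)
```

The bytes-in, bytes-out case is left out here because it would need the module itself to work on bytes internally.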
> The documentation says "If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference."
The documentation is, IMO, wrong even in 2.x. Just yesterday I had to open a CSV file in 'rU' mode because it had Windows line endings and I'm on Linux.
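As a rough 3.x counterpart to the 'rU' situation, the sketch below reads data with Windows line endings through a text layer with universal newlines enabled (newline=None, the default for text mode in 3.x). The data is invented for the example and ASCII is an assumed encoding. Note that the 3.x csv documentation recommends newline='' instead, so that newlines embedded in quoted fields are preserved; this only shows that the platform no longer matters.

```python
import csv
import io

# Hypothetical file contents with Windows (CRLF) line endings.
crlf_bytes = b"a,b\r\n1,2\r\n"

# newline=None turns on universal newlines: "\r\n" is translated to "\n"
# on read, whatever platform the code runs on.
stream = io.TextIOWrapper(io.BytesIO(crlf_bytes), encoding="ascii", newline=None)
lines = stream.readlines()

# The translated lines parse cleanly, with no stray "\r" in the last field.
rows = list(csv.reader(lines))
print(lines)
print(rows)
```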
Regards
Antoine.