[Python-Dev] file() vs open(), round 7
"Martin v. Löwis" martin at v.loewis.de
Tue Dec 27 18:54:30 CET 2005
- Previous message: [Python-Dev] file() vs open(), round 7
- Next message: [Python-Dev] file() vs open(), round 7
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
M.-A. Lemburg wrote:
>> Here's a rough draft:
>>
>> def textopen(name, mode="r", encoding=None):
>>     if "U" not in mode:
>>         mode += "U"
>
> The "U" is not needed when opening files using codecs - these always break lines using .splitlines() which breaks lines according to the Unicode rules and also knows about the various line break variants on different platforms.
Still, codecs typically don't implement universal newlines correctly. If you specify 'U', then do .read(), you deserve to get \n (U+000A) as the line separator; with most codecs, you get whatever line breaks were in the file.
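To illustrate the .read() point, here is a minimal sketch using a codec stream reader over an in-memory buffer. It is written for current Python 3 rather than the interpreter of the time, and the sample bytes are made up for the example. The reader hands back the line breaks exactly as stored, so any translation to \n has to be done separately on the decoded text.

```python
import codecs
import io

# Sample data (an assumption for this sketch): three line-break styles in one file.
raw = b"one\r\ntwo\rthree\n"

# Wrap the byte stream in the UTF-8 codec's StreamReader.
reader = codecs.getreader("utf-8")(io.BytesIO(raw))

# .read() decodes the bytes but leaves the line breaks untouched.
text = reader.read()

# Universal-newline behaviour has to be applied by hand after decoding.
translated = text.replace("\r\n", "\n").replace("\r", "\n")

print(repr(text))
print(repr(translated))
```

The translation step works on decoded characters, not on bytes, which is why it is safe here.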
Passing 'U' to the underlying stream is wrong, as well: if the stream is double-byte oriented (e.g. UTF-16), the 'U' filtering will rarely do anything, but if it does something, it will be wrong.
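A small sketch of how byte-level filtering can go wrong on UTF-16 data. The helper naive_universal_newlines is hypothetical: it only imitates newline translation applied to raw bytes and is not the actual implementation of 'U' mode. The character U+0D0A encodes in UTF-16-BE as the two bytes 0x0D 0x0A, which a byte-level filter mistakes for a CR LF pair.

```python
def naive_universal_newlines(data):
    """Hypothetical stand-in for newline translation done on raw bytes."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

# U+0D0A (MALAYALAM LETTER UU) contains no line break at all,
# but its UTF-16-BE encoding is the byte pair 0x0D 0x0A.
original = "a\u0d0ab"
raw = original.encode("utf-16-be")

# The filter collapses the two bytes into one, leaving an odd byte count.
filtered = naive_universal_newlines(raw)

try:
    decoded = filtered.decode("utf-16-be")
except UnicodeDecodeError:
    # The stream is no longer valid UTF-16.
    decoded = None

print(raw, filtered, decoded)
```

Here the filter did something, and the result can no longer be decoded as UTF-16.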
I agree that it would be desirable to have textopen always default to universal newlines; however, this is difficult to implement.
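For comparison, a sketch of the decode-first approach, using io.TextIOWrapper from current Python 3. That class postdates this message, so this is an illustration of the idea and not what was available then. With newline=None in read mode, the translation to \n is applied to the decoded text, so a double-byte encoding is not a problem.

```python
import io

# Sample data (an assumption for this sketch): mixed line breaks, stored as UTF-16.
raw = "one\r\ntwo\rthree\n".encode("utf-16")

# newline=None enables universal newlines on the decoded characters.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-16", newline=None)

text = stream.read()
print(repr(text))
```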
Regards, Martin