[Python-Dev] Python 3.0.1 (io-in-c)
"Martin v. Löwis" martin at v.loewis.de
Wed Jan 28 19:29:07 CET 2009
Thanks for the explanation. It might be clearer to document this a little more explicitly in the docs for open() (on the basis that people using open() are the most likely to be naive about encodings). I'll see if I can come up with an appropriate doc patch.
Notice that the determination of the specific encoding used is fairly elaborate:
- if IO is to a terminal, Python tries to determine the encoding of the terminal. This is mostly relevant for Windows (which uses, by default, the "OEM code page" in the terminal).
- if IO is to a file, Python tries to guess the "common" encoding for the system. On Unix, it queries the locale, and falls back to "ascii" if no locale is set. On Windows, it uses the "ANSI code page". On OS X, it uses the "system encoding".
- if IO is binary, then (clearly) no encoding is used. Network IO is always binary.
- for file names, different algorithms apply yet again. On Windows, Python uses the Unicode API, so there is no need for an encoding. On Unix, it (again) uses the locale encoding. On OS X, it uses UTF-8. (Just to be clear: this applies to the first argument of open(), not to the resulting file object.)
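To see what these rules produce on a given machine, something like the following sketch can be used. It only inspects what the interpreter reports; the function name describe_encodings is made up for illustration, and the values (and even the defaults) may differ between Python versions and platforms, so treat it as a way to look, not as a description of the exact algorithm.

```python
import locale
import os
import sys
import tempfile


def describe_encodings():
    """Collect the encodings the interpreter reports for each kind of IO.

    Hypothetical helper for illustration; it is not part of Python.
    """
    info = {
        # Terminal IO: may be None if sys.stdout has been replaced by an
        # object that carries no encoding attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        # The locale-derived "common" encoding for the system.
        "preferred": locale.getpreferredencoding(False),
        # The encoding used to convert file names (first argument of open()).
        "filesystem": sys.getfilesystemencoding(),
    }

    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Text mode without an explicit encoding: Python picks one.
        with open(path, "w") as text_file:
            info["text_file"] = text_file.encoding
        # Binary mode: the file object has no encoding at all.
        with open(path, "rb") as binary_file:
            info["binary_has_encoding"] = hasattr(binary_file, "encoding")
    finally:
        os.remove(path)
    return info


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(name, "=", value)
```

Passing an explicit encoding argument to open() sidesteps the guessing for file contents; it has no effect on how the file name itself is encoded.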
Regards, Martin