[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

"Martin v. Löwis" martin at v.loewis.de
Sat Apr 25 14:17:14 CEST 2009


Simon Cross wrote:

Unfortunately, for Windows, the situation would be exactly the opposite: the byte-oriented interface cannot represent all data; only the character-oriented API can.

Is the second part of this actually true? My understanding may be flawed, but surely all Unicode data can be converted to and from bytes using UTF-8?

[I hope, by "second part", you refer to the part that I left]

It's true that UTF-8 could represent all Windows file names. However, the byte-oriented APIs of Windows do not use UTF-8; they use the Windows ANSI code page, which varies with the installation.
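As a rough illustration of the point above, here is a minimal sketch using only Python's built-in codecs. The file name and the helper `representable` are hypothetical, and cp1252 and cp1251 merely stand in for two possible ANSI code pages; which one is actually in effect depends on the installation.

```python
# Hypothetical file name containing Cyrillic characters ("файл.txt").
name = "\u0444\u0430\u0439\u043b.txt"

# UTF-8 can always round-trip the name.
assert name.encode("utf-8").decode("utf-8") == name


def representable(filename: str, codepage: str) -> bool:
    """Return True if filename can be encoded in the given code page."""
    try:
        filename.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


# cp1251 (Cyrillic) can encode the name; cp1252 (Western European) cannot.
print(representable(name, "cp1251"))
print(representable(name, "cp1252"))
```

On an installation whose ANSI code page is cp1252, such a name has no byte representation that the byte-oriented API would accept, even though UTF-8 could encode it.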

Given this, can't people who must have access to all files / environment data just use the bytes interface?

No, because the Windows API would interpret the bytes differently, and not find the right file.
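To make the misinterpretation concrete, here is a small sketch. It assumes, purely for illustration, that the ANSI code page is cp1252; the helper `as_seen_by_ansi_api` is hypothetical and only models the decoding step, it does not call any Windows API.

```python
def as_seen_by_ansi_api(raw_name: bytes, codepage: str) -> str:
    """Model how a byte-oriented API would interpret raw_name.

    The bytes are decoded with the ANSI code page, not with UTF-8.
    """
    return raw_name.decode(codepage, errors="replace")


intended = "caf\u00e9.txt"               # the name the caller means
utf8_bytes = intended.encode("utf-8")    # what the caller passes as bytes

# Assumed code page for this example: cp1252.
seen = as_seen_by_ansi_api(utf8_bytes, "cp1252")

print(intended)
print(seen)
print(seen == intended)
```

The two-byte UTF-8 sequence for the accented letter is read as two separate cp1252 characters, so the API would look for a differently named file.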

Regards, Martin


