[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

James Y Knight foom at fuhm.net
Tue Sep 30 23:59:10 CEST 2008


On Sep 30, 2008, at 5:40 PM, Martin v. Löwis wrote:

>>> On Windows, we might reject bytes filenames for all file operations: open(), unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError)
>>
>> Since I've seen no objections to this yet: please no. If we offer a "lower-level" bytes filename API, it should work for all platforms.
>
> Unfortunately, it can't. You cannot represent all possible file names in a byte string in Windows (just as you can't do so in a Unicode string on Unix).

As you mention in the parenthetical below, of course it can.

> So using byte strings on Windows would work for some files, but fail for others. In particular, listdir might give you a list of file names which you then can't open/stat/recurse into.
>
> (of course, you could use UTF-8 as the file system encoding on Windows, but then you will have to rewrite a lot of C code first)

Yes! If there is a byte-string access method for Windows, pretty
please make it decode from UTF-8 internally and call the Unicode
version of the Windows APIs. The non-Unicode Windows APIs are pretty
much just broken; ideally, Python should never call them.
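
To make that concrete, here is a rough sketch of the idea in Python rather than in the C layer where it would really live. The helper names open_utf8 and ansi_representable are made up for illustration, and cp1252 is only an example of an ANSI code page; none of this is an existing API.

```python
def ansi_representable(name: str, codepage: str = "cp1252") -> bool:
    """Return True if `name` survives encoding to the given ANSI code page.

    cp1252 is only an example code page (an assumption for illustration).
    """
    try:
        name.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def open_utf8(name: bytes, mode: str = "r"):
    """Hypothetical bytes-filename entry point.

    Decodes the bytes as UTF-8 and passes a str to open(), so that on
    Windows the call goes through the Unicode (wide-character) API.
    """
    return open(name.decode("utf-8"), mode)


name = "日本語.txt"

# The name cannot be expressed in a Western ANSI code page...
fits_in_ansi = ansi_representable(name)

# ...but UTF-8 bytes carry it without loss in both directions.
as_bytes = name.encode("utf-8")
restored = as_bytes.decode("utf-8")
```

With UTF-8 as the byte-level form, every name the Unicode API can return has a byte representation, which is exactly what the ANSI code pages fail to provide.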

But I still don't like the idea of propagating the "sometimes a
string, sometimes bytes" APIs. One or the other, please: either
always strings (if and only if there is a method for ensuring that
decoding always succeeds), or always bytes.
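
As a small illustration of what "decoding always succeeds" would have to mean on Unix: strict UTF-8 decoding rejects some byte sequences that are perfectly legal file names. The sketch below uses the surrogateescape error handler, which exists in later Python 3 releases and postdates this message, purely as one example of a decoding that cannot fail and still round-trips; the sample file name is invented.

```python
# A Latin-1 encoded name; the byte 0xE9 makes it invalid as UTF-8.
raw = b"caf\xe9.txt"

# Strict decoding fails, so an "always str" API built on it would
# be unable to return this name at all.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate
# code point, so decoding never raises...
lossless = raw.decode("utf-8", "surrogateescape")

# ...and encoding with the same handler restores the original bytes.
round_trip = lossless.encode("utf-8", "surrogateescape")
```

The design point is that the str result is not necessarily printable, but it can always be handed back to the OS unchanged, which is the property an "always strings" API needs.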

James


