[Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

Victor Stinner victor.stinner at haypocalc.com
Wed Oct 26 10:52:16 CEST 2011


On Tuesday, 25 October 2011 at 10:31:56, Victor Stinner wrote:

Basically, all functions processing filenames, so most functions of posixmodule.c. Some examples:

- os.listdir(): FindFirstFileA, FindNextFileA, FindCloseA
- os.lstat(): CreateFileA
- os.getcwdb(): getcwd()
- os.mkdir(): CreateDirectoryA
- os.chmod(): SetFileAttributesA
- ...

This seems way too broad.

I changed my mind about this list: I only want to change how filenames are encoded, not how filenames are decoded. So only os.listdir() and os.getcwdb() should be changed, as I wrote in another email in this thread and in issue #13247.
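As a rough illustration of the encoding direction discussed here (Unicode filename to bytes), the sketch below contrasts a strict encode with a replacing one. It uses cp1252 as a stand-in for the ANSI code page, because the mbcs codec only exists on Windows. The helper name encode_filename is hypothetical; it is not part of Python or of the patch in issue #13247.

```python
def encode_filename(name, errors="strict", codepage="cp1252"):
    """Encode a Unicode filename to bytes (cp1252 is an assumed stand-in)."""
    return name.encode(codepage, errors)

# U+0100 (LATIN CAPITAL LETTER A WITH MACRON) has no cp1252 representation
name = "\u0100.txt"

# Replacing mode silently substitutes "?" for the unencodable character
lossy = encode_filename(name, errors="replace")

# Strict mode refuses to produce a lossy name
try:
    encode_filename(name)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With a strict codec, the caller gets an error instead of a bytes filename that no longer identifies the file.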

- os.getcwdb(): This you might change.

Issue #13247 combines os.getcwdb() and os.listdir(). Read the issue for more information.

It ('?') is a bad choice of signal though, given the other uses of '?' in paths.

If I understood correctly, '?' is a pattern that matches any single character in FindFirstFile/FindNextFile. Python cannot configure the replacement character; it is hardcoded to "?" (U+003F).
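To make the concern concrete, here is a small sketch of why "?" is an ambiguous marker. Python's fnmatch module is used only as an analogy for wildcard matching; it is not how FindFirstFile is implemented. The cp1252 codec is an assumed stand-in for the ANSI code page, and the filenames are made up for the example.

```python
import fnmatch

# Two distinct names that cp1252 cannot represent, plus one that it can
names = ["\u0100.txt", "\u0101.txt", "a.txt"]

# Lossy encoding: both unencodable names collapse to the same bytes name
encoded = [n.encode("cp1252", "replace") for n in names]

# Read as a wildcard pattern, "?.txt" matches every one-character stem
matches = fnmatch.filter(names, "?.txt")
```

Here two different files end up with the identical bytes name b"?.txt", and that name, read as a pattern, also matches the unrelated a.txt.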

it's just standard Windows behavior, which results in pathnames that are perfectly acceptable to Windows APIs, but unreliable in use because they have different semantics in different Windows APIs.

I think that such filenames cannot be used with any Windows function that accesses the filesystem. Excerpt from the issue:

"Such filenames cannot be used, open() fails with OSError(22, "invalid argument: '?'") for example."

Such filenames can only be used to display the content of a directory; don't expect to be able to read the file content.

--

Anyway, you must use Unicode on Windows! The bytes API was just kept for backward compatibility.
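A minimal sketch of the two flavours of the API: os.listdir() returns str names when given a str path and bytes names when given a bytes path. The temporary directory and the file name data.txt are only there to make the example self-contained, and an ASCII name is used so that it behaves the same on any filesystem.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file to list
    open(os.path.join(tmp, "data.txt"), "w").close()

    # Unicode API: str path in, str names out
    str_names = os.listdir(tmp)

    # Bytes API: bytes path in, bytes names out
    bytes_names = os.listdir(os.fsencode(tmp))
```

Passing str paths keeps filenames as Unicode from end to end, so no lossy conversion to the ANSI code page is involved.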

Victor


