During our development we have experienced the following: if you have a user on your Windows machine with a name that uses Japanese characters, like “雄鳥お人好し”, you will have the following on your system:

* The Windows Shell will show the path correctly, that is: “C:\Users\雄鳥お人好し”
* cmd.exe will show: “C:\Users\??????”
* All the env variables will be wrong, which means they will be similar to the info shown in cmd.exe

The above is a problem because the implementation of expanduser in ntpath.py uses the env variables to expand the path, which means that in this case the returned path will be wrong. I have attached a small example of how to get the user profile path (~) on Windows using SHGetFolderPathW or SHGetKnownFolderPath to fix the issue.

PS: I don't know if this issue also occurs on Python 3.
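The attachment itself is not reproduced in this thread. As a rough sketch of the approach described, assuming ctypes is acceptable in application code, the profile directory can be requested through the wide-character shell API instead of the environment. The helper name get_profile_dir is hypothetical, and the constants are the documented Windows values for CSIDL_PROFILE and SHGFP_TYPE_CURRENT. On non-Windows platforms the sketch simply falls back to os.path.expanduser.

```python
import ctypes
import os
import sys

CSIDL_PROFILE = 0x0028   # the user's profile folder, e.g. C:\Users\<name>
SHGFP_TYPE_CURRENT = 0   # ask for the current path, not the default one
MAX_PATH = 260


def get_profile_dir():
    """Return the user profile directory (a Unicode string on Windows).

    Hypothetical helper for illustration; not part of ntpath.
    """
    if sys.platform != "win32":
        # Not Windows: the ANSI code page problem does not apply here.
        return os.path.expanduser("~")

    buf = ctypes.create_unicode_buffer(MAX_PATH)
    # Wide-character (W) API, so the result never passes through
    # the ANSI code page and no characters are replaced by "?".
    hresult = ctypes.windll.shell32.SHGetFolderPathW(
        None, CSIDL_PROFILE, None, SHGFP_TYPE_CURRENT, buf)
    if hresult != 0:  # S_OK is 0
        raise OSError("SHGetFolderPathW failed with HRESULT 0x%08X"
                      % (hresult & 0xFFFFFFFF))
    return buf.value


print(get_profile_dir())
```

Because the buffer is a wchar_t array, buf.value is a Unicode string on both Python 2 and Python 3.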
On POSIX, Python 3 works correctly if my home dir is /tmp/éric, and Python 2.7 returns a UTF-8-encoded (not locale-encoded!) byte string. For Windows, a patch would probably need to add a private function to the _nt module (in C), because ctypes is too dangerous to be used in the standard library.
Unicode environment vars work properly in Python 3.x on Windows, too, because the convertenviron() function in posixmodule.c uses extern _wenviron and PyUnicode_FromWideChar() in Python 3.x. In Python 2.7, convertenviron() uses extern environ and PyString_FromString*().
Python 2 uses byte strings. If characters are not encodable to the ANSI code page, Windows replaces them with question marks. See issue #13247 for another example (in Python 3, when explicitly using the bytes API). To be able to support characters not encodable to the ANSI code page, you have to use Unicode *everywhere*. Because Python 2 doesn't have access to the Unicode environment and uses bytes in most cases, I don't think that we can fix this issue in Python 2. I am closing this issue because it would require too much work to fix in Python 2, whereas it already works in Python 3. Moving to Python 3 is the best solution to this issue.
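The question-mark replacement described above can be imitated with Python's own codecs. This is only an approximation: it assumes cp1252 as the ANSI code page and uses the codec's "replace" error handler rather than the actual Windows conversion, and the helper name to_ansi is hypothetical.

```python
# -*- coding: utf-8 -*-

def to_ansi(path, codepage="cp1252"):
    """Encode a Unicode path to a byte string, replacing every
    character the code page cannot represent with "?".

    Hypothetical helper; cp1252 is an assumed ANSI code page.
    """
    return path.encode(codepage, "replace")


profile = u"C:\\Users\\雄鳥お人好し"
ansi = to_ansi(profile)

# Each of the six Japanese characters becomes a single "?".
assert ansi == b"C:\\Users\\??????"

# Decoding again cannot recover the original name: the data is lost.
assert ansi.decode("cp1252") != profile
```

Once the name has been reduced to question marks, no later decoding step can restore it, which is why the byte-string environment in Python 2 cannot yield the correct path.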
nosy: + flox
title: os.path.expanduser brakes when using unicode character in the username -> os.path.expanduser breaks when using unicode character in the username