[Python-3000] Unicode strings, identifiers, and import
James Y Knight foom at fuhm.net
Fri May 18 01:24:21 CEST 2007
On May 17, 2007, at 7:04 PM, Giovanni Bajo wrote:
> On 13/05/2007 21.31, Guido van Rossum wrote:
>>> The answer to all of this is the filesystem encoding, which is already
>>> supported. Doesn't appear particularly difficult to me.
>>
>> sys.getfilesystemencoding() is None on most Linux computers I have
>> access to. How is the problem solved there?
>
> In fact, I have a question about this. Can anybody show me a valid
> multi-platform Python code snippet that, given a filename as a unicode
> string, creates a file with that name, possibly adjusting the name so as
> to ignore an encoding problem (so that the function always succeeds)?
>
> def dumptofile(unicode_filename):
>     ...

unicode_filename.encode(sys.getfilesystemencoding() or 'ascii',
'xmlcharrefreplace') would work.
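
A minimal sketch of the expression above wrapped in a helper, written for Python 3 (where `str` plays the role of the `unicode` type discussed in this thread). The name `encode_filename` and its optional `encoding` parameter are assumptions made for illustration; they do not appear in the original message.

```python
import sys


def encode_filename(unicode_filename, encoding=None):
    """Encode a text filename to bytes, never raising on odd characters.

    `encode_filename` and `encoding` are hypothetical names used only
    for this illustration.
    """
    # Use the reported filesystem encoding, falling back to ASCII if
    # none is reported (the `or 'ascii'` from the expression above).
    encoding = encoding or sys.getfilesystemencoding() or 'ascii'
    # 'xmlcharrefreplace' swaps each unencodable character for a
    # numeric reference such as '&#233;' instead of raising an error.
    return unicode_filename.encode(encoding, 'xmlcharrefreplace')
```

With an explicit ASCII target, `encode_filename('caf\u00e9.txt', 'ascii')` gives `b'caf&#233;.txt'`. The substitution is lossy in the sense that two different inputs could in principle map to the same on-disk name, so this only addresses the "always succeeds" part of the question.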
Although I don't think I've seen a platform where
sys.getfilesystemencoding() is None.
If I unset LANG/LANGUAGE/LC_*, python reports 'ANSI_X3.4-1968'. But
normally on my system it reports 'UTF-8', since I have LANG=en_US.UTF-8.
The really tricky thing is that on Unix systems, if you want to be
able to access all the files on the disk, you have to use the
byte-string API, as not all filenames are convertible to unicode. But on
Windows, if you want to be able to access all the files on the disk,
you CANNOT use the byte-string API, because not all filenames
(which are unicode on disk) are convertible to byte strings via the
"mbcs" encoding (which is what getfilesystemencoding() reports). It's
quite a pain in the ass really.
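
The asymmetry described above can be illustrated with plain codec round-trips, without touching the disk. This is only a sketch: the helper names `decodes_as` and `encodes_as` are hypothetical, and `cp1252` is used as a stand-in for "mbcs", on the assumption that it is the active ANSI code page (the "mbcs" codec itself exists only on Windows).

```python
def decodes_as(raw_name, encoding):
    """Return True if a byte-string filename is valid in `encoding`."""
    try:
        raw_name.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def encodes_as(text_name, encoding):
    """Return True if a unicode filename is representable in `encoding`."""
    try:
        text_name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Unix side: a filename is just bytes, so a Latin-1 name can sit on a
# system whose locale says UTF-8 and then has no unicode equivalent.
latin1_name = b'caf\xe9.txt'

# Windows side: names are unicode on disk, so a name outside the ANSI
# code page has no byte-string equivalent (cp1252 stands in for "mbcs").
japanese_name = '\u65e5\u672c\u8a9e.txt'
```

Here `decodes_as(latin1_name, 'utf-8')` and `encodes_as(japanese_name, 'cp1252')` both return `False`, which is the case where each platform's "other" API cannot name the file.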
James