[Python-3000] Unicode strings, identifiers, and import

"Martin v. Löwis" martin at v.loewis.de
Fri May 18 07:26:09 CEST 2007


The answer to all of this is the filesystem encoding, which is already supported. Doesn't appear particularly difficult to me.

sys.getfilesystemencoding() is None on most Linux computers I have access to.

That's strange. Is LANG not set?

How is the problem solved there?

A default needs to be applied. In 2.x, the default is the system encoding. Not sure whether the notion of a Python system encoding will be preserved for 3.x, but it should be safe, on Unix, to default to UTF-8 for the file system encoding unless LANG specifies something different.
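A minimal sketch of such a fallback, written for Python 3: the helper name effective_fs_encoding and its default parameter are hypothetical, and the check for a missing value only matters on interpreters where sys.getfilesystemencoding() can return None, as described above for 2.x.

```python
import codecs
import sys


def effective_fs_encoding(default="utf-8"):
    """Return the file system encoding, or `default` if none is reported."""
    # On 2.x, sys.getfilesystemencoding() could return None on Unix;
    # the `or` applies the fallback in that case.
    encoding = sys.getfilesystemencoding() or default
    # Look the name up so that aliases such as "UTF8" are normalized
    # and an unknown encoding name fails early with LookupError.
    return codecs.lookup(encoding).name


print(effective_fs_encoding())
```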

In fact, I have a question about this. Can anybody show me a valid multi-platform Python code snippet that, given a filename as a unicode string, creates a file with that name, possibly adjusting the name so as to ignore an encoding problem (so that the function always succeeds)?

That's not really a python-dev or py3k question. If you want to support arbitrary Unicode strings, you clearly cannot map them to file names directly: what if the Unicode string contains the directory separator, or other characters not allowed in file names (such as : or * on Windows)?
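To illustrate the point, here is a rough Python 3 sketch that reports which characters in a string could not appear in a single file name. The function name problem_characters and the WINDOWS_FORBIDDEN set are assumptions made for this example; the set is illustrative and not a complete statement of the Windows rules (control characters and reserved device names are not covered).

```python
import os

# Characters that Windows rejects in file names; illustrative, not exhaustive.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def problem_characters(name):
    """Return the sorted characters in `name` that cannot be used in one file name."""
    # The directory separator(s) and NUL are a problem on every platform.
    blocked = {os.sep, "\0"} | WINDOWS_FORBIDDEN
    if os.altsep:
        blocked.add(os.altsep)
    return sorted(ch for ch in set(name) if ch in blocked)


print(problem_characters("notes.txt"))
print(problem_characters("draft: v2*.txt"))
```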

If you need to guarantee that any Unicode string can map to a file name, I suggest

f = open(filename.encode("utf-8").encode("hex"), "w")
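The line above relies on the "hex" codec of 2.x byte strings. A rough Python 3 equivalent, assuming bytes.hex() and bytes.fromhex() are available (Python 3.5 or later), might look like the following; the function names hex_file_name, original_name, and create_file are made up for this sketch.

```python
import os
import tempfile


def hex_file_name(filename):
    """Map a Unicode string to a file name made only of hex digits."""
    return filename.encode("utf-8").hex()


def original_name(hex_name):
    """Recover the Unicode string from a name produced by hex_file_name."""
    return bytes.fromhex(hex_name).decode("utf-8")


def create_file(directory, filename):
    """Create an empty file for `filename` inside `directory`; return its path."""
    path = os.path.join(directory, hex_file_name(filename))
    with open(path, "w"):
        pass
    return path


with tempfile.TemporaryDirectory() as tmp:
    path = create_file(tmp, "Löwis/notes:*.txt")
    print(os.path.basename(path))
    print(original_name(os.path.basename(path)))
```

The hex form is twice as long as the UTF-8 bytes, so very long strings can still exceed a file system's name length limit.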

I attempted this a couple of times without being satisfied at all by the solutions.

That's probably because you failed to specify all requirements that you need for satisfaction. If you would explicitly specify them, you would likely find that they conflict, and that no solution can possibly exist satisfying all your requirements, and that this has nothing to do with Unicode.

Notice that my above solution meets the specified needs: it supports all unicode strings, succeeds always, and possibly adjusts the file name to ignore an encoding problem. Of course, interpreting the file name in a file explorer is somewhat tedious...

Regards, Martin


