[Python-Dev] Import and unicode: part two

Victor Stinner victor.stinner at haypocalc.com
Wed Jan 26 10:40:34 CET 2011


On Monday 24 January 2011 at 19:26 -0800, Toshio Kuratomi wrote:

> Why not locale:
> * Relying on locale is simply not portable. (...)
> * Mixing of modules from different locales won't work. (...)

I don't understand what you are talking about.

When you import a module, the module name becomes a filename. On Windows, you can reuse the Unicode name directly as a filename. On the other OSes, you have to encode the name to the filesystem encoding. During Python 3.2 development, we tried to support a filesystem encoding different from the locale encoding (PYTHONFSENCODING environment variable), but it doesn't work, simply because Python is not alone in the OS. Apart from Python, all programs speak the same "language": the locale encoding. Let me give you an example: if you create a module with a name encoded to UTF-8 while your locale encoding is something else, your file browser will display mojibake.
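The mojibake case can be reproduced without touching the disk. This is only a small sketch: the module name "café", the helper displayed_name() and the choice of ISO-8859-1 as the locale encoding are assumptions made for the illustration, not something taken from the import machinery.

```python
import sys


def displayed_name(name, stored_encoding, viewer_encoding):
    """Return what a program assuming viewer_encoding would show for a
    file whose name was written to disk using stored_encoding."""
    raw = name.encode(stored_encoding)
    return raw.decode(viewer_encoding, errors="replace")


module_name = "café"  # hypothetical module name

# The encoding Python itself uses to turn module names into filenames.
print("filesystem encoding:", sys.getfilesystemencoding())

# Python and the file browser agree on the encoding: the name is intact.
consistent = displayed_name(module_name, "iso-8859-1", "iso-8859-1")

# Python writes UTF-8, the file browser assumes ISO-8859-1: mojibake.
mismatched = displayed_name(module_name, "utf-8", "iso-8859-1")

print(consistent)
print(mismatched)
```

The second name comes out as "cafÃ©" because the two UTF-8 bytes of "é" are each read as a separate ISO-8859-1 character.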

I don't understand the relation between the local filesystem encoding and portability. I suppose that you are talking about distributing a module to other computers. In that case the question is how the filenames are stored during the transfer. The user is free to use any tool, and to try to find one that handles Unicode correctly :-) But that is no longer Python's problem.

Each computer can use a different locale encoding. You have to use it to cooperate with other programs and avoid mojibake. But I don't understand why you write that "Mixing of modules from different locales won't work". If you use a tool that stores filenames in your locale encoding (e.g. the TAR file format... and sometimes the ZIP format), the problem comes from your tool and you should use another tool.

I created http://bugs.python.org/issue10972 to work around ZIP tools that assume ZIP files use the locale encoding instead of cp437: this issue adds an option to force the use of the Unicode flag (and so store filenames in UTF-8). Initially, though, I created the issue to work around a bootstrap issue (#10955).
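For reference, here is a sketch of how the Unicode flag shows up with the standard zipfile module. It shows only the default behaviour as I understand it (the flag is set when a name is not pure ASCII), not the option proposed in the issue. The archive is built in memory, and the member names and the UTF8_FLAG constant are chosen for the example.

```python
import io
import zipfile

# General purpose bit 11 of a ZIP entry: the filename is stored as UTF-8.
UTF8_FLAG = 0x800

# Build a small archive in memory with one ASCII and one non-ASCII name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("plain.py", "")
    archive.writestr("café.py", "")  # hypothetical module file

# Read the archive back and check which entries carry the Unicode flag.
with zipfile.ZipFile(buf) as archive:
    flags = {
        info.filename: bool(info.flag_bits & UTF8_FLAG)
        for info in archive.infolist()
    }

for name, has_flag in flags.items():
    print(name, "UTF-8 flag set" if has_flag else "no flag (read as cp437)")
```

A tool that ignores the flag, or an archive written without it, leaves the reader to guess the encoding of the name, which is where the locale encoding versus cp437 confusion comes from.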

Victor


