[Python-3000] Unicode strings, identifiers, and import

Michael Urman murman at gmail.com
Sun May 13 21:04:48 CEST 2007


This occurred to me while reading the PEP 3131 discussion, and while it's not limited to PEP 3131 concerns, I don't believe I've seen it discussed elsewhere yet. What is the interaction between import or __import__ and Unicode module names (or at least Unicode strings describing them)? Currently in Python 2.5, __import__ appears to coerce its argument to str, leading to the following error case:

>>> __import__(unicodedata.lookup('GREEK SMALL LETTER EPSILON'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u03b5' in position 0: ordinal not in range(128)

With str being the Unicode type in py3k, this branch of the potential problem needs to be addressed clearly, whether by defining __import__ as converting through ASCII, or by defining a useful semantic. If PEP 3131 is to be accepted, then it should probably address whether import will work on non-ASCII identifiers, and if so what the semantics are (if __import__ would otherwise limit to ASCII).
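To make the two options concrete, here is a minimal Python 3 sketch contrasting an import that converts names through ASCII with one that treats the name as a plain Unicode string. The helper `ascii_only_name` is hypothetical and only mimics the coercion; the module is registered directly in `sys.modules` so the example does not depend on the filesystem. It is an illustration under those assumptions, not a description of how the import machinery is specified.

```python
import importlib
import sys
import types
import unicodedata

name = unicodedata.lookup('GREEK SMALL LETTER EPSILON')

def ascii_only_name(module_name):
    """Hypothetical helper: mimic an import that converts names through ASCII."""
    return module_name.encode('ascii').decode('ascii')

# Option 1: converting through ASCII rejects the name outright.
try:
    ascii_only_name(name)
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Option 2: treat the name as a Unicode string. Registering an in-memory
# module (an assumption made for this sketch) lets the lookup succeed
# without touching any file on disk.
module = types.ModuleType(name)
sys.modules[name] = module
imported = importlib.import_module(name)
```

In this sketch the ASCII route fails for the name, while the Unicode route returns the registered module object unchanged.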

I'm a little worried on the implementation side, because while on Windows it should be easy to use the Unicode file APIs, on Linux the filenames may or may not be UTF-8 friendly.
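One way to picture the filename concern is to ask whether a module name can be encoded at all in the encoding the platform uses for filenames. The sketch below is Python 3 and assumes a hypothetical helper, `encodable_as_filename`; it checks only encodability, not whether a matching file exists or how a real importer would search for it.

```python
import sys

def encodable_as_filename(name, encoding=None):
    """Hypothetical helper: can `name` be encoded in the filename encoding?

    When no encoding is given, the interpreter's filesystem encoding is used.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        name.encode(encoding)  # strict error handling: fail on any gap
    except UnicodeEncodeError:
        return False
    return True

epsilon = '\u03b5'  # GREEK SMALL LETTER EPSILON
utf8_friendly = encodable_as_filename(epsilon, 'utf-8')
ascii_friendly = encodable_as_filename(epsilon, 'ascii')
```

Passing the encoding explicitly makes the difference between a UTF-8 filesystem and an ASCII-only one visible without depending on the machine running the example.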

Michael

Michael Urman


