[Python-3000] Unicode strings, identifiers, and import

"Martin v. Löwis" martin at v.loewis.de
Thu May 17 15:49:31 CEST 2007


> Does the tokenizer do this for all string literals, too? Otherwise you could still get surprises with things like x.foo vs. getattr(x, "foo"), if the name foo were normalized but the string "foo" were not.

No. If you use a string literal, chances are very high that you put NFC into your source code file (if it's not UTF-8, most codecs will produce NFC naturally; if it is UTF-8, it depends on your editor).
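As a rough illustration of the codec point, the sketch below decodes the same word from Latin-1 and from UTF-8 and checks whether the result is already in NFC. The sample word and the variable names are made up for the example; only the standard unicodedata module is used.

```python
import unicodedata

# Latin-1 has only the precomposed e-acute, so decoding yields NFC.
decoded = b"caf\xe9".decode("latin-1")
latin1_is_nfc = unicodedata.normalize("NFC", decoded) == decoded

# UTF-8 can carry either form; a decomposed sequence survives the
# round trip unchanged, so the result depends on what the editor wrote.
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent
roundtrip = decomposed.encode("utf-8").decode("utf-8")
utf8_is_nfc = unicodedata.normalize("NFC", roundtrip) == roundtrip

print(latin1_is_nfc, utf8_is_nfc)
```

The comparison against the normalized copy is just one way to test for NFC; it was chosen here because it works on any Python 3 version.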

If you get the attribute name from elsewhere, it's a design choice of who should perform the normalization. One could specify that builtin getattr does that, or one could require that the application does it in cases where the strings aren't guaranteed to be in NFC.
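The second option (the application normalizes) might look roughly like the sketch below. The helper getattr_nfc and the Config class are hypothetical names invented for this example, not an existing or proposed API; the builtin getattr is used as it stands.

```python
import unicodedata

def getattr_nfc(obj, name, *default):
    """Hypothetical helper: normalize the name to NFC, then look it up."""
    return getattr(obj, unicodedata.normalize("NFC", name), *default)

class Config:
    pass

cfg = Config()
nfc_name = "caf\u00e9"    # precomposed e-acute (NFC)
nfd_name = "cafe\u0301"   # 'e' + combining acute accent (NFD)
setattr(cfg, nfc_name, 1)

# The two spellings render identically but are different strings,
# so a plain lookup with the NFD spelling misses the attribute.
found_plain = hasattr(cfg, nfd_name)
found_normalized = getattr_nfc(cfg, nfd_name, None)
```

Putting the normalization in a wrapper keeps the cost out of lookups whose names are already known to be in NFC.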

The only case I know of where software explicitly changes the normalization, and not to NFC, is OS X, which uses NFD for file names on disk.
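To show what that means for a name read back from such a file system, here is a small sketch that simulates the decomposed form rather than touching a real disk. The file name and the nfc helper are assumptions made for illustration.

```python
import unicodedata

def nfc(name):
    """Hypothetical helper: bring a file name back to NFC."""
    return unicodedata.normalize("NFC", name)

typed = "L\u00f6wis.py"                         # as typed, NFC
on_disk = unicodedata.normalize("NFD", typed)   # simulated decomposed form

# The decomposed name has one extra code point (the combining
# diaeresis) and does not compare equal until it is normalized.
same_raw = on_disk == typed
same_after_nfc = nfc(on_disk) == typed
```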

Regards, Martin
