[Python-3000] Unicode strings, identifiers, and import

Jason Orendorff jason.orendorff at gmail.com
Mon May 14 17:42:24 CEST 2007


On 5/14/07, Guido van Rossum <guido at python.org> wrote:

Isn't normalization also going to be an issue with using non-ASCII in general? Does it mean that Python will have to use a normalization before comparing identifiers as equal? That's terrible, as it will vastly increase the amount needed to hash a string, too.

PEP 3131 addresses this. The tokenizer would normalize identifier tokens to NFC. Because this happens so early, the rest of Python would be unaffected.
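
As a rough illustration of the idea (a sketch only, not the tokenizer's actual code), here is what NFC normalization does to two spellings of the same identifier. The helper name normalize_identifier is hypothetical; unicodedata.normalize is the standard library call.

```python
import unicodedata


def normalize_identifier(name: str) -> str:
    """Hypothetical helper: NFC-normalize an identifier token.

    Stands in for what the tokenizer would do once, at tokenization time.
    """
    return unicodedata.normalize("NFC", name)


# Two spellings of the same visible identifier.
composed = "caf\u00e9"      # 'e with acute' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# As raw strings they are different...
assert composed != decomposed

# ...but after normalization they are the same string, so ordinary
# comparison and hashing work with no extra per-lookup cost.
a = normalize_identifier(composed)
b = normalize_identifier(decomposed)
assert a == b
assert hash(a) == hash(b)
```

Since the normalization would run once per identifier token, nothing downstream (dict lookups, string hashing) needs to know about it.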

-j


