[Python-Dev] str.ascii_lower

Guido van Rossum guido at python.org
Mon Dec 29 12:37:37 EST 2003


> Looking at python.org/sf/866982, I find it troubling that there are languages where "I".lower() != "i" (for those of you not familiar with Turkish: the lower-case letter of "I" is U+0131, LATIN SMALL LETTER DOTLESS I, which is \xfd in iso-8859-9).
>
> As a solution, I'd like to propose a new method asciilower, which is locale-unaware and only works for bytes 65..90 (returning the byte itself for all other characters). Similarly, asciiupper might be needed "for symmetry"; I don't know whether the symmetry should extend beyond those two. This, in turn, should be used inside the codecs library where encoding names are normalized to lower case. What do you think?
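
For illustration only, the semantics proposed above (map 65..90 to 97..122, leave everything else alone) might look roughly like the sketch below. It is written as a free function for a modern Python 3 str rather than as a method on the 8-bit str type under discussion, and the name ascii_lower is just a placeholder, not an existing API.

```python
def ascii_lower(s):
    """Lower-case only ASCII 'A'..'Z' (code points 65..90).

    Every other character is returned unchanged, so the result never
    depends on the current locale. Hypothetical helper, not a built-in.
    """
    return "".join(
        chr(ord(c) + 32) if 65 <= ord(c) <= 90 else c
        for c in s
    )


# Encoding names are expected to be plain ASCII.
print(ascii_lower("ISO-8859-9"))
# A non-ASCII capital such as U+0130 is passed through untouched.
print(ascii_lower("\u0130") == "\u0130")
```

Because the mapping is a fixed offset on a fixed range, "I" always becomes "i", whatever the locale says.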

I never thought there were locales possible that affected the mappings inside ASCII either.

But shouldn't this work just as well if it's only for encoding names (which I'd hope would be ASCII themselves):

def ascii_lower(s): return str(unicode(s).lower())

The unicode() call converts ASCII to Unicode, which should always work for encoding names, and the Unicode lower() is locale-independent.
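
As a rough sketch of the same idea in Python 3 terms, where unicode() no longer exists: decode the bytes as ASCII, lower-case the resulting text, and encode back. The function name ascii_lower_via_unicode is made up for this example, and the assumption is that encoding names arrive as bytes.

```python
def ascii_lower_via_unicode(name):
    """Lower-case an ASCII byte string via a round trip through text.

    Hypothetical helper. decode("ascii") raises UnicodeDecodeError for
    any byte above 127, and str.lower() on text does not consult the
    locale, so "I" always maps to "i".
    """
    return name.decode("ascii").lower().encode("ascii")


print(ascii_lower_via_unicode(b"ISO-8859-9"))

# A non-ASCII byte (here \xfd, dotless i in iso-8859-9) is rejected
# rather than silently mapped.
try:
    ascii_lower_via_unicode(b"\xfd")
except UnicodeDecodeError as exc:
    print("not ASCII:", exc.reason)
```

One difference from a pure range-based mapping: non-ASCII input raises an error here instead of passing through, which seems acceptable if encoding names really are ASCII.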

This seems more elegant than adding yet more methods to the str type.

--Guido van Rossum (home page: http://www.python.org/~guido/)


