[Python-Dev] str.ascii_lower
Jeff Epler jepler at unpythonic.net
Mon Dec 29 12:40:08 EST 2003
On Mon, Dec 29, 2003 at 06:24:57PM +0100, Martin v. Loewis wrote:
> Looking at python.org/sf/866982, I find it troubling that there are languages where "I".lower() != "i" (for those of you not familiar with Turkish: the lower-case letter of "I" is U+0131, LATIN SMALL LETTER DOTLESS I, which is \xfd in iso-8859-9).
This post caused me to notice the following behavior. Is it "right"?
>>> import locale
>>> locale.setlocale(locale.LC_CTYPE, "tr_TR")
'tr_TR'
>>> locale.getlocale()[1] # Expected charset
'ISO8859-9'
>>> "I".lower() # Expected behavior
'\xfd'
>>> u"I".lower() # Python bug? (should be u'\u0131')
u'i'
>>> locale.setlocale(locale.LC_CTYPE, "tr_TR.UTF-8")
'tr_TR.UTF-8'
>>> "I".lower() # C library bug? (should be "\xc4\xb1")*
'I'
>>> locale.setlocale(locale.LC_CTYPE, "en_US.UTF-8")
'en_US.UTF-8'
>>> "I".lower() # (UTF-8 locale works properly in english)
'i'
Jeff
* RedHat 9, glibc-2.3.2-11.9
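
For readers following the thread subject, here is a minimal sketch of what a locale-independent ASCII lowercasing helper might look like. The name ascii_lower is borrowed from the subject line, but this implementation is only an illustration and not a proposal taken from the thread. It assumes a current Python 3 interpreter, where str is Unicode and str.maketrans is available, rather than the 2003-era interpreter used in the session above.

```python
import string

# Translation table covering only the 26 ASCII capitals (A-Z -> a-z).
# Built once at import time; no locale setting is consulted.
_ASCII_LOWER_TABLE = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text):
    """Hypothetical helper: lowercase A-Z only, leave everything else alone."""
    return text.translate(_ASCII_LOWER_TABLE)


if __name__ == "__main__":
    # "I" always maps to "i", whatever LC_CTYPE is set to.
    print(ascii_lower("I"))
    # Non-ASCII capitals such as U+0130 (dotted capital I) pass through unchanged.
    print(ascii_lower("\u0130I") == "\u0130i")
```

Something along these lines would suit protocol keywords and header names, where "I" has to map to "i" no matter what the user's locale says. It deliberately does nothing for text that really is Turkish.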