msg68977 |
Author: Roger Upole (rupole) |
Date: 2008-06-30 00:10 |
The problem seems to stem from this line in IOBinding.py:

locale.setlocale(locale.LC_CTYPE, "")

From the command prompt:

Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import string, locale
>>> print repr(string.letters)
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> locale.setlocale(locale.LC_CTYPE, "")
'English_United States.1252'
>>> print repr(string.letters)
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>> |
|
|
msg68984 |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-06-30 01:15 |
Why do you think string.letters gets corrupted? AFAICT, it's still correct. |
|
|
msg68999 |
Author: Georg Brandl (georg.brandl) *  |
Date: 2008-06-30 08:26 |
Changing the locale changes string.letters -- that is expected behavior. |
|
|
msg69063 |
Author: Roger Upole (rupole) |
Date: 2008-07-01 20:06 |
It introduces high characters, so comparisons that succeed at the normal Python prompt fail under IDLE:

Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> u'a' in string.letters
True

IDLE 1.2.2
>>> import string
>>> u'a' in string.letters
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    u'a' in string.letters
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 52: ordinal not in range(128)

Or am I misunderstanding how the locale works with string comparisons? |
|
|
msg69066 |
Author: Georg Brandl (georg.brandl) *  |
Date: 2008-07-01 20:15 |
Well, that wouldn't be different if you had set the locale in your prompt. In short, ``u'a' in string.letters`` can never work with any string.letters except the default, English-only one, and therefore is wrong. |
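To illustrate why the check breaks once string.letters holds non-ASCII bytes, here is a minimal sketch. It is written for Python 3 (an assumption; the thread itself uses Python 2.5, where comparing a Unicode object against a byte string triggers an implicit ASCII decode). The byte string is a shortened, hypothetical stand-in for the locale-dependent string.letters, and decode_or_reason is a helper invented for this example.

```python
# Shortened stand-in for the locale-dependent string.letters (hypothetical sample).
# 0x83 is the byte named in the traceback above.
letters_bytes = b"abcABC\x83\x8a"


def decode_or_reason(data, encoding):
    """Decode data with the given codec; return the error reason if that fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


# Roughly what Python 2 attempts implicitly when a unicode object meets a str.
ascii_result = decode_or_reason(letters_bytes, "ascii")

# Decoding explicitly with the code page reported by setlocale() succeeds.
cp1252_result = decode_or_reason(letters_bytes, "cp1252")

print(ascii_result)
print("a" in cp1252_result)
```

Once both operands are text, the membership test is well defined; the failure comes from the implicit ASCII decode, not from the locale itself.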
|
|
msg69071 |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-07-01 20:37 |
As Georg says: you shouldn't be mixing Unicode objects and string objects. It's perfectly valid for string.letters to contain non-ASCII bytes, and it's no surprise that this fails for you. string.letters indeed *does* contain only letters. In any case, testing for letter-ness by using "in string.letters" is not a good idea, as it involves a linear search. I recommend using u"a".isalpha() instead. |
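The two ways of testing for letter-ness can be compared in a short sketch. It assumes Python 3, where every str is Unicode and string.ascii_letters is the fixed, locale-independent constant (string.letters no longer exists there). The function names and sample characters are made up for illustration.

```python
import string


def is_letter_by_search(ch, alphabet=string.ascii_letters):
    """Linear search: only recognizes the letters listed in alphabet."""
    return ch in alphabet


def is_letter(ch):
    """Ask the character itself; str.isalpha() consults the Unicode database."""
    return ch.isalpha()


# Hypothetical sample characters: ASCII letters, e-acute, a digit, a space.
samples = ["a", "Z", "\u00e9", "1", " "]
results = [(ch, is_letter_by_search(ch), is_letter(ch)) for ch in samples]

for ch, by_search, by_method in results:
    print(repr(ch), by_search, by_method)
```

The search only knows the characters in its alphabet, so it rejects the accented letter, while isalpha() accepts it without scanning a list.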
|
|