Issue 866982: Bad behavior of email.Charset.Charset when locale is tr_TR

Created on 2003-12-29 10:29 by doko, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
charset.diff doko, 2003-12-29 10:29 patch
Messages (4)
msg45079 - (view) Author: Matthias Klose (doko) * (Python committer) Date: 2003-12-29 10:29
The Charset class behaves incorrectly when the locale is set to tr_TR. The source of the problem is the statement input_charset = input_charset.lower() at line 393 of /usr/lib/python2.3/email/Charset.py. This example code reproduces the error:

    import locale
    from email.Charset import Charset
    locale.setlocale(locale.LC_ALL, ("tr_TR", "ISO-8859-9"))
    foo = Charset(locale.nl_langinfo(locale.CODESET))
    repr(foo)  # Returns \xfdso-8859-9, which is not a charset, instead of iso-8859-9

The problem exists because in the Turkish locale lower() maps I to dotless ı (byte \xfd in ISO-8859-9, shown as ý when read as Latin-1), not to i. I will try to create and submit a patch ASAP.
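The byte mentioned in the report can be checked without changing the locale. This is a minimal sketch written for Python 3 (the report itself concerns Python 2.3); it relies only on the standard iso-8859-9 and latin-1 codecs, and the variable names are illustrative:

```python
# Turkish lowercasing maps "I" to dotless i (U+0131), not to "i".
dotless_i = "\u0131"

# In ISO-8859-9 that character is encoded as the single byte 0xFD.
turkish_byte = dotless_i.encode("iso-8859-9")

# The same byte interpreted as Latin-1 is U+00FD, the character the report shows.
shown_as = turkish_byte.decode("latin-1")

print(turkish_byte, shown_as)
```

This is why the lowercased name starts with \xfd: the leading I of ISO-8859-9 was lowercased by the locale's rules rather than by ASCII rules.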
msg45080 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-12-29 17:29
See the python-dev discussion. My proposal is to add ascii_lower to <type 'str'>, and use that. Charset.py might then use your code as a fallback. Actually, it might be even more performant to do

    import string

    # Translation table that maps only A-Z to a-z
    lower_map = string.maketrans(string.ascii_uppercase, string.ascii_lowercase)

    def _ascii_lower(s):
        return s.translate(lower_map)
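For readers on Python 3, where string.maketrans no longer exists, the same translation-table idea might look like the sketch below. The function name ascii_lower and the table name _LOWER_MAP are illustrative, not part of any library:

```python
import string

# Translation table covering only A-Z -> a-z; every other character passes through.
_LOWER_MAP = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text):
    """Lowercase ASCII letters only, independent of the current locale."""
    return text.translate(_LOWER_MAP)


print(ascii_lower("ISO-8859-9"))
```

Because the table contains nothing but the 26 ASCII letters, a character such as İ (U+0130) is returned unchanged rather than being case-mapped.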
msg45081 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2004-10-09 21:05
We should probably do this instead, in Charset.__init__():

    input_charset = unicode(input_charset, 'ascii').lower()
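A rough Python 3 analogue of this idea is sketched below. It is an assumption about how the approach translates, not the code committed to Charset.py, and normalize_charset is a hypothetical helper name. Decoding as ASCII rejects any non-ASCII name up front, and str.lower() in Python 3 does not consult the locale:

```python
def normalize_charset(input_charset):
    """Return the charset name lowercased, rejecting non-ASCII names."""
    if isinstance(input_charset, bytes):
        # Raises UnicodeDecodeError on any byte >= 0x80.
        input_charset = input_charset.decode("ascii")
    else:
        # Raises UnicodeEncodeError if the text contains non-ASCII characters.
        input_charset.encode("ascii")
    # str.lower() uses Unicode case mappings, not the current locale.
    return input_charset.lower()


print(normalize_charset(b"ISO-8859-9"))
```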
msg45082 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2004-10-09 21:08
I made this change to Charset.py 1.17.
History
Date User Action Args
2022-04-11 14:56:01 admin set github: 39737
2003-12-29 10:29:45 doko create