[Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys
"Martin v. Löwis" martin at v.loewis.de
Mon Aug 7 15:00:20 CEST 2006
M.-A. Lemburg wrote:
> Python just doesn't know the encoding of the 8-bit string, so it can't make any assumptions about it. As a result, it raises an exception to inform the programmer.
Oh, Python does make an assumption about what the encoding is: it assumes it is the system encoding (i.e. "ascii"). Invoking the ascii codec then raises an exception, because the string clearly isn't ASCII.
> It is quite possible that the string uses an encoding under which the Unicode string is indeed equal to the string, assuming this encoding
So what? Python uses the system encoding for this operation. What does it matter that the result would be different if it had used a different encoding?
The strings are unequal under the system encoding; it's irrelevant that they might be equal under a different encoding.
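To make the point concrete, here is a minimal sketch in Python 3 syntax (not the Python 2 interpreter under discussion) of comparing text with an 8-bit string under one fixed codec. The helper name equal_under is hypothetical and only imitates the coercion step; it is not how the interpreter implements the comparison.

```python
def equal_under(text, raw, encoding="ascii"):
    """Hypothetical helper: decode the 8-bit string `raw` with one fixed
    codec and compare the result with the Unicode string `text`."""
    return raw.decode(encoding) == text

text = "\xe4"   # U+00E4, LATIN SMALL LETTER A WITH DIAERESIS
raw = b"\xe4"   # the same character as a single Latin-1 byte

# Under the assumed system encoding ("ascii") the byte cannot be decoded,
# so the comparison never gets as far as producing True or False.
try:
    equal_under(text, raw)
    outcome = "compared"
except UnicodeDecodeError:
    outcome = "UnicodeDecodeError"

# Under a different codec the two would compare equal, but that codec
# is not the one being used for the operation.
latin1_result = equal_under(text, raw, "latin-1")

print(outcome, latin1_result)
```

Which codec is passed in decides the outcome; the bytes themselves carry no answer.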
The same holds for the ASCII part (i.e. where you don't get an exception):
py> u"foo" == "sbb"
False
py> u"foo".encode("rot13") == "sbb"
True
So the strings compare as unequal, even though they compare equal if treated as rot13. That doesn't stop Python from considering them unequal.
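For readers on Python 3, where rot13 is no longer available through the encode method of strings, the same comparison can be sketched with codecs.encode. The variable names are only illustrative.

```python
import codecs

# Plain comparison: no codec is applied, so the strings differ.
plain_equal = "foo" == "sbb"

# Comparison after applying rot13 to one side: now they match.
rot13_equal = codecs.encode("foo", "rot_13") == "sbb"

print(plain_equal, rot13_equal)
```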
Regards, Martin