[Python-Dev] Re: [I18n-sig] Unicode strings: an alternative

Tom Emerson tree@basistech.com
Fri, 5 May 2000 08:34:35 -0400 (EDT)


Just van Rossum writes:

Good point. All this taken together still means to me that comparisons between wide and narrow strings should take place at the character level, which implies that coercion from narrow to wide is done at the character level, without looking at the encoding. (Which in my book in turn still implies that as long as we're talking about Unicode, narrow strings are effectively Latin-1.)

Only true if "wide" strings are encoded in UCS-2 or UCS-4. If "wide characters" are Unicode, but stored in UTF-8 encoding, then you lose.
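A minimal sketch of the point above, in present-day Python 3 rather than the interpreter under discussion. It assumes `bytes` stands in for a "narrow" string and `str` for a "wide" one; the helper `coerce_by_code_point` is a hypothetical name used only for illustration.

```python
def coerce_by_code_point(narrow: bytes) -> str:
    """Widen each byte to the code point with the same numeric value."""
    return "".join(chr(b) for b in narrow)


narrow = b"caf\xe9"   # "café" if the bytes are read as Latin-1
wide = "caf\u00e9"    # the same text as Unicode code points

# Character-level coercion gives the same result as a Latin-1 decode.
same_as_latin1 = coerce_by_code_point(narrow) == narrow.decode("latin-1")

# With a fixed-width (UCS-2/UCS-4 style) view, comparing per character works.
equal_by_character = coerce_by_code_point(narrow) == wide

# If the wide string is stored as UTF-8, the raw bytes no longer line up:
# U+00E9 becomes the two bytes 0xC3 0xA9.
equal_by_utf8_bytes = narrow == wide.encode("utf-8")
```

Here `same_as_latin1` and `equal_by_character` are true, while `equal_by_utf8_bytes` is false.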

Hmmmm... how often do you expect to compare narrow vs. wide strings using the default comparison (i.e. == or !=)? What if I'm using Latin-3 and use the byte comparison? I may very well have two strings (one narrow, one wide) that compare equal even though they represent different text. Not exactly what I would expect.
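The Latin-3 concern can be sketched the same way, again in present-day Python 3 with `bytes` as the narrow string and `str` as the wide one (an assumption, not the implementation being debated). The byte 0xA1 is used because ISO 8859-3 maps it to U+0126, while Latin-1 maps it to U+00A1.

```python
# A narrow string written by someone working in Latin-3 (ISO 8859-3).
narrow = b"\xa1"
intended = narrow.decode("iso8859_3")   # U+0126, Latin capital letter H with stroke

# A wide string holding a different character: U+00A1, inverted exclamation mark.
wide = "\u00a1"

# Coercing the narrow string as if it were Latin-1 makes the two compare equal...
coerced = narrow.decode("latin-1")
false_match = coerced == wide

# ...even though the text the author meant is not the same character.
real_match = intended == wide
```

`false_match` is true and `real_match` is false, which is the surprising result described above.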

 -tree

[I'm flying from Seattle to Boston today, so eventually I will disappear for a while]

-- 
Tom Emerson, Basis Technology Corp.
Language Hacker, http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"