[Python-Dev] [Python-checkins] r83893 - python/branches/release27-maint/Misc/ACKS

Alexander Belopolsky alexander.belopolsky at gmail.com
Tue Aug 10 17:47:15 CEST 2010


On Tue, Aug 10, 2010 at 1:53 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
..
> People need to recognize that any kind of reference is really irrelevant
> here. There is no "right" order that is better than any other "right"
> order. I'd personally object to any English language dictionary telling
> me how my name sorts in the alphabet.

Even when an English language dictionary follows German rules? :-) BTW, I did quietly bring Peter Åstrand back to the end of the list yesterday and I agree that this is rather unimportant.

> (and yes, I do think it's "wrong" that it got sorted after Lyngvig - in
> Germany, we put the ö as if it was "oe" - unlike the Swedes, which put
> the very same letter after the rest of the alphabet. So the ö in
> Chrigström sorts in a different way than the ö in Löwis. If I move to
> Sweden, the file would have to change :-)
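The two conventions described above can be sketched with a pair of toy sort keys. This is only an illustration, not a real collation implementation: the function names are made up for this example, "Lowe" and "Lutz" are hypothetical names added for contrast, and only the letters mentioned in the thread (plus the Swedish å and ä) are handled.

```python
import unicodedata

def german_key(name):
    """Toy key: treat o-umlaut as if it were spelled "oe" (the rule quoted above)."""
    s = unicodedata.normalize("NFC", name).casefold()
    return s.replace("\u00f6", "oe")  # U+00F6 is o with diaeresis

def swedish_key(name):
    """Toy key: a-ring, a-umlaut and o-umlaut come after z, in that order."""
    alphabet = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"
    s = unicodedata.normalize("NFC", name).casefold()
    # Characters outside the toy alphabet sort after it, by code point.
    return [alphabet.index(ch) if ch in alphabet else len(alphabet) + ord(ch)
            for ch in s]

# "Lowe" and "Lutz" are hypothetical entries, added only for contrast.
names = ["Löwis", "Lyngvig", "Lowe", "Lutz"]

german_order = sorted(names, key=german_key)
swedish_order = sorted(names, key=swedish_key)

print("German-style: ", german_order)
print("Swedish-style:", swedish_order)
```

With the German-style key "Löwis" lands before "Lyngvig"; with the Swedish-style key it lands after it, which is the disagreement described above.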

I did search the mail archives for the discussion of Å's sorting order and now I think that the reference to Swedish rules is an ex post rationalization. It looks like the original order was by Latin-1 code point, and that explains the positions of both Å and ö. (I actually believe that the Swedish rules are fairly modern as well. Unlike other nations, Swedes don't mind breaking with traditions for modern conveniences. As far as I know, Sweden is the only nation where polite "you" (plural) was abolished by a language reform.)
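The code-point explanation is easy to check. Below is a minimal sketch, assuming the names are stored in precomposed (NFC) form; "Zimmer" is a hypothetical entry added to show where Z falls. Python's default string comparison orders by code point, and for these characters that coincides with Latin-1 order.

```python
import unicodedata

# Names from this thread, plus the hypothetical "Zimmer" for contrast.
names = ["Åstrand", "Löwis", "Lyngvig", "Chrigström", "Zimmer"]

# Normalize so that each accented letter is a single precomposed code point.
names = [unicodedata.normalize("NFC", n) for n in names]

# Plain sorted() compares strings code point by code point.
by_code_point = sorted(names)

for n in by_code_point:
    # Show the code points of the first two characters of each name.
    print(n, [hex(ord(ch)) for ch in n[:2]])
```

Å is U+00C5, which is above Z (U+005A), so "Åstrand" ends up last; ö is U+00F6, which is above y (U+0079), so "Löwis" follows "Lyngvig".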

I raised this issue after one of my early check-ins got a response that acknowledgments should be alphabetized rather than added at the end of the list. [1] I pointed out that given that the file is encoded in UTF-8, it can potentially have last names starting with any Unicode character and I was not familiar with any formal procedure that would define an alphabetic order in this case. A short brainstorming session on IRC and the tracker resulted in an agreement that no formal rule exists and the best we can do is to define the order as "rough".

I am not 100% happy with this because I am sure people will keep discovering that the order in the file does not match the order suggested by their favorite sort program. I was also hoping to learn from this discussion what the state of the art in sorting Unicode words is. I believe this issue is addressed by some obscure parts of the Unicode standard, but I am not familiar with them.
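For reference, the relevant specification is the Unicode Collation Algorithm (Unicode Technical Standard #10), which defines a default order and leaves language-specific rules to "tailorings". The Python standard library does not implement it directly. What it does offer is locale.strxfrm(), which defers to the platform's collation rules. The sketch below shows that approach. The helper name sort_for_locale is made up for this example, and the locale names are assumptions that may not be installed on a given system, in which case the helper returns None. The resulting order depends on the platform, so no expected output is shown.

```python
import locale

def sort_for_locale(words, locale_name):
    """Sort words using the platform's collation rules for locale_name.

    Returns None if the requested locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm() turns a string into a key that compares in locale order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the old setting

names = ["Åstrand", "Löwis", "Lyngvig", "Chrigström"]

# Locale names are assumptions; availability varies between systems.
for loc in ("de_DE.UTF-8", "sv_SE.UTF-8"):
    print(loc, sort_for_locale(names, loc))
```

Because the result changes with the locale, this also illustrates why two people's "favorite sort programs" can disagree about the same file.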

[1] http://mail.python.org/pipermail/python-checkins/2010-May/093650.html


