[Python-Dev] PEP 393 Summer of Code Project
Guido van Rossum guido at python.org
Thu Sep 1 18:31:53 CEST 2011
On Thu, Sep 1, 2011 at 9:03 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
On Thursday, September 1, 2011 at 08:45 -0700, Guido van Rossum wrote:
This is definitely thought of as a separate mark added to the e; ë is not a new letter. I have a feeling it's the same way for the French and Germans, but I really don't know. (Antoine? Georg?)
Indeed, they are not separate "letters" (they are considered the same in lexicographic order, and the French alphabet has 26 letters). But I'm not sure how it's relevant, because you can't remove an accent without most likely making a spelling error, or at least changing the meaning. Accents are very much part of the language (while ligatures like "ff" are not, they are a rendering detail). So I would consider "é", "ê", "ù", etc. atomic characters for the purpose of processing French text. And I don't see how a decomposed form could help an application.
The example given was someone who didn't agree with how a particular font rendered those accented characters. I agree that's obscure though.
I recall long ago that when the French wrote words in all caps they would drop the accents, e.g. ECOLE. I even recall (through the mists of time) observing this in Paris on public signs. Is this still the convention? Maybe it was only a compromise in the time of Morse code?
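To make the composed/decomposed distinction concrete, here is a minimal Python sketch, assuming only the standard `unicodedata` module. The helper name `strip_accents` is hypothetical and not something proposed in this thread. It illustrates one way a decomposed form could be used to drop accents, as in the all-caps ECOLE case; it does not claim that doing so is correct French.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# "école" written with the precomposed character é (U+00E9)
composed = "\u00e9cole"
# NFD splits é into "e" followed by COMBINING ACUTE ACCENT (U+0301)
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed), len(decomposed))    # 5 6
print(composed.upper())                  # ÉCOLE (accent kept)
print(strip_accents(composed).upper())   # ECOLE (accent dropped)
```

Both strings render the same way, yet they compare unequal until they are normalized to the same form, which is why the choice of form matters to code that compares or indexes text.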
--
--Guido van Rossum (python.org/~guido)