Unicode Mail List Archive: Re: Apostrophes (was Re: Exemplar Characters)


Chris asked:

> That’s not what I was talking about at all. It should not matter what the
> value of ’ in Breton or Mohawk is, nor did I ever say that Breton has a
> glottal stop.
>
> If I may, I’d like to rephrase my question.
>
> Language X has the following alphabet:
>
> a h i k n p r t u x y ’
>
> Point 1: It doesn’t matter what the phonetic realisations of these are to
> assign a Unicode codepoint. We know that Latin Script a is U+0061
> regardless of how it’s pronounced.

Correct.

>
> Point 2: We have evidence from Breton that U+2019 is used as part of an
> alphabetic letter, instead of just punctuation.

Correct.

>
> a is U+0061
> h is U+0068
> ...
> ’ is what?
>
> We could choose U+2019 or we could choose U+02BC. Which one is best?
>
> I hope this question makes sense.

It makes sense, but it doesn't have a determinate answer. Either
one could be the best, depending on the orthographic tradition,
its use with other languages (with which it might need to share
letters and keyboards, for example, as in the French/Breton case),
or other concerns.

The correct answer might even be U+0027 APOSTROPHE.

The issue of which *character* an orthography should standardize
on is an issue for the standardizers of that orthography (if they
exist). The apostrophe in particular will always be an inherently
problematical edge case, because it has been used in so many ways,
has never graduated to bona fide Latin letter status, overlaps with
punctuation uses of similar signs, and now has at least 3 forms to
choose from in Unicode.

U+0027 is weighted towards ASCII compatibility
U+02BC is weighted towards ease of word selection
U+2019 avoids glyph ambiguity, and is more available for input than U+02BC

You just have to take some bad with the good for each, and make a
choice.
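
To make the tradeoff among the three candidates concrete, here is a
minimal Python sketch that prints each character's Unicode name and
General Category, then runs a regex word match over a sample string.
The sample word "ka?n" and the helper names describe() and words() are
made up for illustration; they do not come from any real orthography
or library. The \w+ match is only a rough stand-in for "word
selection"; real editors and text-segmentation libraries may behave
differently.

```python
import re
import unicodedata

# The three apostrophe candidates discussed above.
CANDIDATES = ["\u0027", "\u02BC", "\u2019"]

def describe(ch):
    """Return (code point label, Unicode name, General Category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

def words(text):
    """Split text into runs of word characters, as a rough proxy for word selection."""
    return re.findall(r"\w+", text)

for ch in CANDIDATES:
    code, name, category = describe(ch)
    sample = f"ka{ch}n"  # hypothetical word, for illustration only
    print(code, name, category, words(sample))
```

Only U+02BC has a letter category (Lm), so only it keeps the sample
word in one piece under \w+; the other two are punctuation (Po and Pf)
and split it in two.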

By the way, for Mohawk in particular, I think groups like these
are the appropriate ones to be deciding:

http://www.edu.gov.on.ca/eng/training/literacy/mohawk/mohawk.html
http://www.edu.gov.on.ca/eng/training/literacy/mohawk/mohawk1.html

(The second link is *in* Mohawk.)

As of 1993, the correct answer there was U+0027. I don't know where
things stand now, or if materials have changed. Clearly, the *easiest*
thing to do to get Mohawk materials online is use U+0027, regardless
of any ambiguities of form and function for that character.
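
Whichever character an orthography settles on, existing text will
probably contain a mix of all three. The sketch below folds them onto
one chosen target, defaulting to U+0027. The function name
fold_apostrophes() and the sample string are assumptions for
illustration. Note that this is a blunt tool: it also rewrites any
U+2019 that was serving as a genuine closing quotation mark, so it is
only safe on text where that ambiguity does not matter.

```python
# All three apostrophe forms: U+0027, U+02BC, U+2019.
APOSTROPHE_FORMS = "\u0027\u02BC\u2019"

def fold_apostrophes(text, target="\u0027"):
    """Replace every apostrophe form in text with the single target character."""
    table = str.maketrans({form: target for form in APOSTROPHE_FORMS})
    return text.translate(table)

# Hypothetical mixed input: the same made-up word typed three different ways.
mixed = "ka\u2019n ka\u02BCn ka\u0027n"
print(fold_apostrophes(mixed))                   # fold everything to U+0027
print(fold_apostrophes(mixed, target="\u02BC"))  # or to U+02BC instead
```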

--Ken



This archive was generated by hypermail 2.1.5: Tue Nov 15 2005 - 17:41:47 CST