Representing Unix filenames in Unicode


On 28 Nov 2005, at 20:49, Neil Harris wrote:

> The set of ASCII strings is a proper subset of the set of UTF-8
> strings, so no information would need to be stored about which of
> those codings was being used.

So it would seem, but I think that under some circumstances, though I
do not remember which, UNIX needs to know that a name is ASCII and
not anything else. But I guess one will see what works best when
making a UTF-8 enabled UNIX.
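
The subset property quoted above can be checked with a small sketch. The
helper names is_ascii and is_utf8 and the sample filenames are made up
for illustration; they are not taken from any UNIX implementation.

```python
def is_ascii(name: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in name)


def is_utf8(name: bytes) -> bool:
    """True if the byte string decodes as well-formed UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical filenames, used only as sample data.
ascii_name = b"report.txt"
utf8_name = "r\u00e9sum\u00e9.txt".encode("utf-8")  # contains U+00E9

# An ASCII name is also valid UTF-8 and decodes to the same text either way.
same_text = ascii_name.decode("ascii") == ascii_name.decode("utf-8")

# The reverse does not hold: a UTF-8 name need not be ASCII.
print(is_ascii(ascii_name), is_utf8(ascii_name), same_text)
print(is_ascii(utf8_name), is_utf8(utf8_name))
```

So a pure-ASCII name needs no tag to be read as UTF-8, but a program that
requires ASCII still has to test for it explicitly.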

> Now, ISO 8859-1, that's a different matter -- I suppose you could
> still use the property that _almost all_ non-pure-ASCII ISO 8859-1
> natural language strings are not also valid UTF-8 strings for
> backwards compatibility, and ditto for most other fixed 8-bit
> encodings, but I certainly wouldn't be willing to trust my
> filesystem to this sort of hack.

I'll pass on this one. There are different approaches, though: mixed
encodings or a single UTF-8 encoding.
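
The backwards-compatibility heuristic described in the quoted paragraph
might look roughly like the sketch below: try UTF-8 first and fall back
to ISO 8859-1. The function name guess_decode and the sample names are
assumptions for illustration only. The last case shows why the heuristic
is "almost all" rather than "all".

```python
from typing import Tuple


def guess_decode(name: bytes) -> Tuple[str, str]:
    """Decode as UTF-8 if well-formed, otherwise assume ISO 8859-1.

    Returns (text, encoding_guessed). This is a heuristic, not a
    reliable detector: some ISO 8859-1 strings are also valid UTF-8.
    """
    try:
        return name.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte value is defined in Python's iso-8859-1 codec,
        # so this fallback cannot fail.
        return name.decode("iso-8859-1"), "iso-8859-1"


# Typical case: a lone 0xE9 byte is not well-formed UTF-8,
# so the name is recognised as ISO 8859-1.
latin1_name = "r\u00e9sum\u00e9.txt".encode("iso-8859-1")
print(guess_decode(latin1_name))

# Ambiguous case: U+00C3 U+00A9 in ISO 8859-1 is the byte pair C3 A9,
# which is also the UTF-8 encoding of U+00E9, so the guess is wrong.
ambiguous = "\u00c3\u00a9".encode("iso-8859-1")
print(guess_decode(ambiguous))
```

The second call silently returns different text from what was written,
which is the kind of failure that makes this a risky basis for a
filesystem.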

Hans Aberg



This archive was generated by hypermail 2.1.5: Mon Nov 28 2005 - 16:36:25 CST