[Python-Dev] PEP 263 considered faulty (for some Japanese)

Martin v. Loewis martin@v.loewis.de
13 Mar 2002 18:05:04 +0100


"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> I would think that UTF-8 can be quite reliably detected without the "BOM".

There is a difference between auto-detection and declaration. Sure, you can auto-detect UTF-8; you might have to read the entire text for that, though. This is quite different from a declaration: The text either is declared as UTF-8, or it isn't.
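To illustrate the distinction, here is a minimal sketch rather than the actual PEP 263 implementation. The function names `declared_encoding` and `looks_like_utf8` are hypothetical, and the regular expression is a simplified stand-in for the declaration syntax. The declaration check inspects at most the first two lines; the detection check cannot answer until it has decoded all of the data.

```python
import re

# Simplified pattern for a PEP 263 style declaration (an assumption for
# illustration; the real tokenizer applies a stricter rule).
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def declared_encoding(source: bytes):
    """Return the encoding declared in the first two lines, or None."""
    for line in source.splitlines()[:2]:
        # Only comment lines may carry a declaration.
        if line.lstrip().startswith(b"#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1).decode("ascii")
    return None


def looks_like_utf8(source: bytes) -> bool:
    """Auto-detect by attempting a full decode; reads all of the data."""
    try:
        source.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Note that `looks_like_utf8` only reports that the bytes happen to be valid UTF-8; a single offending byte at the very end of a long file changes the answer. `declared_encoding` gives a yes-or-no result that does not depend on the rest of the content.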

> Microsoft software for Japanese apparently ignores Content-Type headers and the like in favor of autodetection (probably because the same MS software regularly relies on users to set things like charset parameters in MIME Content-Type).

Auto-detection is useful for displaying content to users. It is evil for a programming language.

Regards, Martin