[Python-Dev] PEP 263 considered faulty (for some Japanese)

Jason Orendorff jason@jorendorff.com
Wed, 13 Mar 2002 06:27:46 -0600


Fredrik Lundh wrote:

> which reminds me: the HTTP protocol says that a charset specified
> at the HTTP protocol level should override any encoding specified
> in the document itself.

I believe HTTP (RFC 2616) rather meekly asserts that the HTTP Content-Type header always defines the encoding of the body. If no charset is specified, the body is ISO-8859-1.
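
To make that rule concrete, here is a minimal sketch of a client reading the charset from a Content-Type value and falling back to ISO-8859-1 when none is given. The function name charset_from_content_type is hypothetical, and the fallback is applied to every media type for simplicity, although RFC 2616 only states the ISO-8859-1 default for text/* subtypes.

```python
from email.message import Message


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Return the charset named in a Content-Type value, lowercased.

    Hypothetical helper: falls back to `default` when the header
    carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns its argument when no charset
    # parameter is present in the Content-Type header.
    return msg.get_content_charset(default)


print(charset_from_content_type("text/html; charset=Shift_JIS"))
print(charset_from_content_type("text/html"))
```

The standard library's MIME header parser is used here only to avoid hand-rolling parameter parsing; it is not meant as a description of what any particular HTTP client does.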

I believe this requirement is ignored in practice. HTTP servers don't correctly label outgoing documents, and HTTP clients ignore whatever the HTTP server says.

Browsers usually search HTML documents for a <meta http-equiv="Content-Type" ...> tag and XML documents for an <?xml ... encoding="..."?> declaration, and I think they always prefer a document's internal mark to what the HTTP headers say. (Anyone know for sure?)
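
As a rough illustration of that suspected precedence, the sketch below looks for a charset in an HTML meta tag or the encoding attribute of an XML declaration near the start of the body, and only then falls back to the HTTP charset. The names sniff_internal_charset and pick_charset, the regular expressions, and the 1024-byte limit are all assumptions made for this example; it does not claim to match what any real browser does.

```python
import re

# Simplified patterns (assumptions): real documents can be messier.
XML_RE = re.compile(
    rb'<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9_.:-]+)["\']',
    re.IGNORECASE,
)
META_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def sniff_internal_charset(body, limit=1024):
    """Return the charset declared inside the document, or None."""
    head = body[:limit]  # only inspect the start of the document
    for pattern in (XML_RE, META_RE):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return None


def pick_charset(body, http_charset=None, default="iso-8859-1"):
    """Prefer the internal mark, then the HTTP header, then the default."""
    return sniff_internal_charset(body) or http_charset or default
```

Swapping the order of the first two operands in pick_charset gives the behaviour Fredrik describes, where the HTTP-level charset wins.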

Just another charset headache.

Jason Orendorff http://www.jorendorff.com/