[Python-Dev] Encoding detection in the standard library?

"Martin v. Löwis" martin at v.loewis.de
Tue Apr 22 20:06:16 CEST 2008

> When a web browser POSTs data, there is no standard way of communicating which encoding it's using.

That's just not true. Web browsers should and do use the encoding of the web page that originally contained the form.

> There are some hints which make it easier (accept-charset attributes, the encoding used to send the page to the browser), but no guarantees.

Not true. The latter is guaranteed (unless you assume bugs - but if you do, can you present a specific browser that has that bug?)

> Email is a smaller problem, because it usually has a helpful content-type header, but that's no guarantee.

Then assume windows-1252. Mailers that don't use MIME for non-ASCII characters mostly died out 10 years ago; people who continue to use them can likely accept occasional mojibake (or else they would have switched long ago).

> Now, at the moment, the only data I have to support this claim is my experience with DrProject in non-English locations. If I'm the only one who has had these sorts of problems, I'll go back to "Unicode for Dummies".

For web forms, I always encode the pages in UTF-8, and that always works.
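As an illustration of the web-form case (a minimal sketch, not code from any particular application): if the page containing the form is served as UTF-8, the POSTed body can be decoded with that same encoding. The function name `decode_form_body` and the sample field names are hypothetical; `urllib.parse.parse_qs` is the standard-library call doing the work.

```python
from urllib.parse import parse_qs

# Assumption: the page that contained the form was served as UTF-8,
# so the browser submits the form data in UTF-8 as well.
PAGE_ENCODING = "utf-8"


def decode_form_body(body: bytes, encoding: str = PAGE_ENCODING) -> dict:
    """Decode an application/x-www-form-urlencoded POST body.

    The body itself is ASCII (percent-escapes); the escaped bytes are
    interpreted in the encoding of the page that held the form.
    """
    return parse_qs(body.decode("ascii"), encoding=encoding, errors="replace")


# Hypothetical form submission with two non-ASCII values.
fields = decode_form_body(b"name=L%C3%B6wis&city=Z%C3%BCrich")
print(fields)
```

Passing a different `encoding` (for example `"latin-1"`) covers pages that were served in another charset; the point is only that the server already knows which one it sent.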

For email, I once added encoding processing to pipermail (the Mailman archiver), and that also always works.
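The email case can be sketched along the same lines. This is not the pipermail code, only an assumed minimal version of the idea: trust the charset declared in the Content-Type header, and fall back to windows-1252 when none is declared or the declared one does not work. The name `decode_body` is hypothetical, and multipart messages are deliberately left out.

```python
from email import message_from_bytes

# Fallback for messages that carry no usable charset declaration.
FALLBACK_CHARSET = "windows-1252"


def decode_body(raw_message: bytes) -> str:
    """Return the text body of a single-part message as str."""
    msg = message_from_bytes(raw_message)
    # decode=True undoes base64/quoted-printable and yields bytes;
    # it returns None for multipart messages, which this sketch skips.
    payload = msg.get_payload(decode=True)
    if payload is None:
        return ""
    charset = msg.get_content_charset() or FALLBACK_CHARSET
    try:
        return payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: accept occasional mojibake
        # rather than failing on the whole message.
        return payload.decode(FALLBACK_CHARSET, errors="replace")


# A MIME message that declares its charset.
mime_msg = (
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"Gr\xc3\xbc\xc3\x9fe\n"
)
# A non-MIME message with raw 8-bit bytes and no declaration.
legacy_msg = b"From: someone@example.com\n\nGr\xfc\xdfe\n"

print(decode_body(mime_msg).strip())
print(decode_body(legacy_msg).strip())
```

The `errors="replace"` in the fallback path is there because a few byte values are undefined in Python's windows-1252 codec and would otherwise raise.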

> I'll go back and take another look at the problem, then come back if new revelations appear.

Good luck!

Martin
