[Python-Dev] Python-3.0, unicode, and os.environ

André Malo nd at perlig.de
Sat Dec 13 05:47:47 CET 2008


On Fri, Dec 12, 2008 at 2:11 AM, André Malo <nd at perlig.de> wrote:
> * Adam Olsen wrote:
>> UTF-8 in percent encodings is becoming a de facto standard. Otherwise
>> the browser has to display the percent escapes in the address bar,
>> rather than the intended text.
>
> Duh! The address bar should contain the URL, which is the intended
> text. The escapes are there for a reason. If I pass some octets using
> percent escapes via the query string or request body, it's not text,
> not even intended. It's still a collection of octets. Translating them
> back (and forth when I press enter in the address bar) is a pretty
> ambiguous operation and therefore pretty wrong.
>
> The de facto standard does not exist. There's a real one instead: RFC
> 2396.

All the heaps of people using non-English Wikipedia sites might disagree with you. There's only, what, a few million pages that would be affected?

I'm not sure what you're trying to pull here. Is that supposed to be an argument? There's no page affected at all. It's a browser UI issue, not a page issue.

And even if it mattered at all how the URL escapes are displayed in the address bar, those millions of people would prefer KOI8-R or Big5 over UTF-8 if you asked them.

Which leads to the exact point: the browser cannot know the encoding, nor should it. The escapes are opaque. The only entity which needs to understand the encoding of URL percent escapes in the query or request body is the server selecting the resource.
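To make the ambiguity concrete, here is a minimal sketch using Python 3's `urllib.parse`. The sample word and the set of encodings tried are assumptions chosen only for illustration; nothing here is taken from a browser or server implementation.

```python
from urllib.parse import quote, unquote_to_bytes

# Sample text, chosen only for illustration.
word = "мир"

# The same text produces different percent escapes depending on the encoding.
utf8_escaped = quote(word, encoding="utf-8")    # '%D0%BC%D0%B8%D1%80'
koi8_escaped = quote(word, encoding="koi8-r")   # '%CD%C9%D2'

# Undoing the escapes gives back octets, not text.
octets = unquote_to_bytes(koi8_escaped)         # b'\xcd\xc9\xd2'

# Turning the octets into text requires knowing the encoding out of band.
as_koi8 = octets.decode("koi8-r")      # the original word
as_latin1 = octets.decode("latin-1")   # decodes without error, but is not the word
try:
    as_utf8 = octets.decode("utf-8")
except UnicodeDecodeError:
    as_utf8 = None                     # these octets are not valid UTF-8

print(utf8_escaped, koi8_escaped, octets, as_koi8, as_latin1, as_utf8)
```

A client that assumes UTF-8 either fails on these escapes or, with a more permissive guess such as Latin-1, shows text the author never wrote. Only whoever produced the escapes knows which decoding is right.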

But I'm sure I'm not telling you any news here.

nd

"Gates's behavior had proven to me that I need not count on him and his two companions" -- Karl May, "Winnetou III"

Something new in the West: <http://pub.perlig.de/books.html#apache2>


