[Python-Dev] Can the cgi module be made Unicode-aware?
Guido van Rossum guido@python.org
Thu, 11 Apr 2002 08:56:26 -0400
> I keep trying to handle various places in my code where I can get input in non-ASCII encodings. Today I realized the cgi module does nothing to translate Unicode data into unicode objects. I see in one instance that I am getting data that is clearly utf-8 encoded, but I see nothing in the CGI script's environment variables to suggest the client web browser told the server how the data was encoded other than the obvious "Content-Type: application/x-www-form-urlencoded". Is utf-8 implied for the data once the url encoding has been reversed?
I very much doubt it. You probably received that UTF-8 data from a non-standard-conforming browser.
> Should the cgi module be made Unicode-aware? If so, how? I can never remember the incantation to convert non-ASCII string objects to Unicode objects and nothing I've tried by trial-and-error so far works.
I must be misunderstanding your question, because the answer I'm thinking of is unicode(s,'utf8') and that can't possibly be what you can never remember.
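As a rough illustration of the conversion mentioned above: unicode(s, 'utf8') is the Python 2 spelling, and in Python 3 the same step is written as a decode call on a bytes object. The sketch below is written for Python 3 and assumes the form value really is UTF-8, which, as noted, the request itself does not guarantee. The helper name decode_form_value and the sample value are made up for the example.

```python
from urllib.parse import unquote_to_bytes


def decode_form_value(raw, encoding="utf-8"):
    """Hypothetical helper: undo URL encoding, then decode the bytes to text.

    The encoding is an assumption supplied by the caller; nothing in the
    form submission itself says which encoding the browser used.
    """
    # In application/x-www-form-urlencoded data, '+' stands for a space.
    # Percent escapes are turned back into raw bytes.
    raw_bytes = unquote_to_bytes(raw.replace("+", " "))
    # Python 3 counterpart of the Python 2 call unicode(s, 'utf8').
    return raw_bytes.decode(encoding)


# "%C3%A9" is the UTF-8 byte sequence for U+00E9 (e with acute accent).
value = decode_form_value("caf%C3%A9+au+lait")
print(value)
```

If the bytes are not valid in the assumed encoding, the decode call raises UnicodeDecodeError, so a caller that cannot trust the browser would want to catch that.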
> I don't want to adopt the workaround outlined in FAQ question 4.102 (change the default site-wide encoding). Perhaps that question should be extended with more appropriate information about converting raw strings with non-ASCII content to unicode.
(There's also an approach that compares the converted string with the unconverted one and catches the exception; if no exception is raised, the input string was pure ASCII and the Unicode conversion is unnecessary.)
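A related sketch of the "try it and catch the exception" idea, for Python 3. This is not the comparison trick itself, which relies on Python 2 string semantics; it swaps in a strict ASCII decode as the test, and falls back to another encoding only when that fails. The helper name to_text and the UTF-8 fallback are assumptions made for the example.

```python
def to_text(raw, fallback_encoding="utf-8"):
    """Hypothetical helper: return (text, needed_conversion) for a bytes value.

    Pure-ASCII input decodes without error, so no real conversion was
    needed. Anything else raises UnicodeDecodeError and is decoded with
    the fallback encoding, which is an assumption about the data.
    """
    try:
        return raw.decode("ascii"), False
    except UnicodeDecodeError:
        return raw.decode(fallback_encoding), True


print(to_text(b"plain"))
print(to_text(b"caf\xc3\xa9"))
```

The second element of the result tells the caller whether the input contained non-ASCII bytes, which is the piece of information the exception is being used to discover.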
--Guido van Rossum (home page: http://www.python.org/~guido/)