[Python-Dev] urllib.quote and unquote - Unicode issues

Bill Janssen janssen at parc.com
Wed Jul 30 18:52:26 CEST 2008

On Wed, Jul 30, 2008 at 8:09 AM, André Malo <nd at perlig.de> wrote:
> I'm actually in favour of encoding bytes only back and forth. A useful
> extension would be another function which wraps quote/unquote and encodes
> and decodes characters.

I'd reverse this. By all means, add a new pair of functions that is bytes in / bytes out. But keep the existing functions purely string in / string out, hardcoded to UTF-8. People wanting another encoding can use the bytes functions and explicit encode / decode calls.

Actually (as I pointed out before) the existing functions are not string-in/string-out. They are something-in and bytes-out. They just look like string-in/string-out because of the confusion between byte strings and Unicode strings in Python 1 and 2.
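
To make the two positions concrete, here is a minimal sketch using the functions found in current Python 3's urllib.parse. These names are an assumption relative to this thread, which predates them and does not say what the final API should be. It contrasts a string-in quote that defaults to UTF-8 with a bytes-in quote where the caller picks the encoding explicitly.

```python
from urllib.parse import quote, quote_from_bytes, unquote_to_bytes

text = "caf\u00e9"  # "café", one non-ASCII character

# String in, UTF-8 assumed: the character is encoded as two bytes.
utf8_quoted = quote(text)

# Bytes in: the caller encodes first, so any encoding can be used.
latin1_quoted = quote_from_bytes(text.encode("latin-1"))

# Unquoting to bytes leaves the decoding decision to the caller as well.
raw = unquote_to_bytes(latin1_quoted)
round_tripped = raw.decode("latin-1")

print(utf8_quoted)    # caf%C3%A9
print(latin1_quoted)  # caf%E9
print(raw)            # b'caf\xe9'
```

The percent-encoded form itself carries no record of which character encoding produced the bytes, which is why the bytes-level pair round-trips cleanly while the string-level function has to assume one.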

Look, Matt's suggestion is a degradation of the integrity of the stdlib, because it enthrones a broken understanding, a misreading of the RFC, in a very prominent place. I'd prefer not to have Python contribute to that breakage. Keep the functions the way they are now: bytes-in and bytes-out.

Bill
