[Python-Dev] urllib.quote and unquote - Unicode issues
Bill Janssen janssen at parc.com
Thu Jul 31 09:39:29 CEST 2008
Guido says:
> Actually, we'd need to look at the various other APIs in Py3k before we can
> decide whether these should be considered taking or returning bytes or text.
> It looks like all other APIs in the Py3k version of urllib treat URLs as
> text.
Yes, as I said in the bug tracker, I've groveled over the entire stdlib to see how my patch affects the behaviour of dependent code. Aside from a few minor bits which assumed octets and did their own encoding/decoding (which I fixed), all the code assumes strings and is very happy to go on assuming this, as long as the URIs are encoded with UTF-8, which they almost certainly are.
I'm not sure that's sufficient review, though I agree it's necessary. The major consumers of quote/unquote are not in the Python standard library.
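To make the UTF-8 assumption concrete, here is a minimal sketch. It assumes the `urllib.parse` functions as they exist in current Python 3, which may differ from the patch under discussion; the sample string `caf%E9` is purely illustrative.

```python
from urllib.parse import unquote, unquote_to_bytes

# A URI fragment percent-encoded from Latin-1 rather than UTF-8
# (illustrative value, not taken from the thread).
latin1_quoted = "caf%E9"

# With the default UTF-8 decoding, the invalid byte is replaced by U+FFFD.
lossy = unquote(latin1_quoted)

# Naming the encoding explicitly recovers the intended text.
explicit = unquote(latin1_quoted, encoding="latin-1")

# Staying at the octet level sidesteps the decoding question entirely.
raw = unquote_to_bytes(latin1_quoted)

print(repr(lossy), repr(explicit), repr(raw))
```

Code that treats URLs as text works as long as the UTF-8 guess holds. A consumer outside the standard library that handles other encodings has to either pass the encoding or work with bytes.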
(quote will accept either type, while unquote will output a str; there will be a new function, unquote_to_bytes, which outputs bytes. Is everyone happy with that?)
No, so don't ask.
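For reference, the behaviour proposed in the parenthetical can be sketched roughly as follows. This assumes the functions as they are spelled in today's `urllib.parse` (`quote`, `unquote`, `unquote_to_bytes`), not necessarily the exact patch being reviewed, and the sample text is made up.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

text = "caf\u00e9"  # illustrative sample: "café"

# quote accepts either str or bytes and returns a str in both cases.
quoted_from_str = quote(text)
quoted_from_bytes = quote(text.encode("utf-8"))

# unquote returns a str; unquote_to_bytes returns the raw octets.
as_text = unquote(quoted_from_str)
as_bytes = unquote_to_bytes(quoted_from_str)

print(quoted_from_str, quoted_from_bytes, repr(as_text), repr(as_bytes))
```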
Bill