[Python-Dev] More on Py3K urllib -- urlencode()

Dan Mahn dan.mahn at digidescorp.com
Sat Feb 28 21:28:43 CET 2009


Hi. I've been using Py3K successfully for a while now, and have some questions about urlencode().

  1. The docs say that items passed to urlencode() are quoted using quote_plus(). However, instances of type "bytes" are not handled the way quote_plus() handles them, because urlencode() converts the parameters to strings first (which puts a small "b" and single quotes around a textual representation of the bytes). It seems to me that instances of type "bytes" should be passed directly to quote_plus(). That would complicate the code slightly, but would end up being much more intuitive and useful.

  2. If urlencode() relies so heavily on quote_plus(), then why doesn't it include the extra encoding-related parameters that quote_plus() takes?

  3. Regarding the following code fragment in urlencode():

        k = quote_plus(str(k))
        if isinstance(v, str):
            v = quote_plus(v)
            l.append(k + '=' + v)
        elif isinstance(v, str):
            # is there a reasonable way to convert to ASCII?
            # encode generates a string, but "replace" or "ignore"
            # lose information and "strict" can raise UnicodeError
            v = quote_plus(v.encode("ASCII","replace"))
            l.append(k + '=' + v)

I don't understand how the "elif" branch can ever be reached, as it tests the same condition as the "if" branch.
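
To make point 1 concrete, here is a small illustration of the difference between converting bytes with str() before quoting and passing them to quote_plus() directly. It only exercises str() and quote_plus(); the variable names are mine, and it does not call urlencode() itself, whose behavior may differ between Python versions.

```python
from urllib.parse import quote_plus

raw = b"a b"

# Converting with str() first yields the repr of the bytes object,
# so the leading "b" and the single quotes end up in the result.
via_str = quote_plus(str(raw))

# Passing the bytes object directly quotes only the payload.
direct = quote_plus(raw)

print(via_str)  # b%27a+b%27
print(direct)   # a+b
```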

Thanks in advance for any thoughts on this issue. I could submit a patch for urlencode() to better explain my ideas if that would be useful.
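
As a rough sketch of the kind of change described in points 1 and 2 (not an actual patch): the function below is hypothetical, as are its name urlencode_sketch and its "encoding" and "errors" parameters, which are simply forwarded to quote_plus(). It assumes a sequence of (key, value) pairs and ignores the other features of urlencode().

```python
from urllib.parse import quote_plus


def urlencode_sketch(pairs, encoding="utf-8", errors="strict"):
    """Hypothetical variant of urlencode(): bytes go straight to quote_plus()."""

    def _quote(item):
        if isinstance(item, bytes):
            # Bytes are quoted as-is, without a str() round trip.
            return quote_plus(item)
        # Everything else is converted to str and quoted with the
        # caller-supplied encoding settings.
        return quote_plus(str(item), encoding=encoding, errors=errors)

    return "&".join(_quote(k) + "=" + _quote(v) for k, v in pairs)


utf8_result = urlencode_sketch([("q", b"a b"), ("n", 1), ("name", "caf\u00e9")])
latin1_result = urlencode_sketch([("name", "caf\u00e9")], encoding="latin-1")

print(utf8_result)    # q=a+b&n=1&name=caf%C3%A9
print(latin1_result)  # name=caf%E9
```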


