[Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)

Nick Coghlan ncoghlan at gmail.com
Tue Sep 21 23:35:40 CEST 2010


On Wed, Sep 22, 2010 at 1:57 AM, Ian Bicking <ianb at colorstudy.com> wrote:

> All this is unrelated to the question, though -- a separate byte-oriented function won't help any case I can think of.  If the programmer is implementing something like urlparse.urlsplit(userinput.encode(sys.getdefaultencoding())), it's because they want to get bytes out.  So if it's named urlparse.urlsplitbytes() they'll just use that, with the same corruption.  Since bytes and text don't interact well, the choice of bytes in and bytes out will be a deliberate one.  Or, bytes will unintentionally come through, but that will just delay the error a while when the bytes out don't work (e.g., urlparse.urljoin(texturl, urlparse.urlsplit(byteurl).path)).  Delaying the error is a little annoying, but a delayed error doesn't lead to mojibake.

Indeed, this line of thinking is what brought me back around to the polymorphic point of view.
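
For illustration, here is a minimal sketch of the delayed-error case Ian describes. It assumes the polymorphic behaviour that urllib.parse has in current Python 3 releases (bytes in, bytes out; str in, str out), and the URLs and variable names are made up for the example:

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical inputs: one text URL, one URL that arrived as bytes.
text_url = "http://example.com/base/"
byte_url = b"http://example.com/docs/page.html"

# Bytes in, bytes out: every component of the result is a bytes object.
byte_path = urlsplit(byte_url).path

# Mixing the bytes path with a text URL does not silently produce
# mojibake; it fails later, at the point where the two types meet.
try:
    urljoin(text_url, byte_path)
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

# Decoding explicitly is the deliberate choice that makes the join work.
joined = urljoin(text_url, byte_path.decode("ascii"))
```

The error shows up at the urljoin() call rather than at the urlsplit() call, which is the "delayed but not corrupting" outcome described above.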

Cheers, Nick.

-- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


