[Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)

Nick Coghlan ncoghlan at gmail.com
Wed Sep 22 12:48:21 CEST 2010

On Wed, Sep 22, 2010 at 9:37 AM, Andrew McNamara <andrewm at object-craft.com.au> wrote:

>> Yeah, that's the original reasoning that had me leaning towards the parallel API approach. If I seem to be changing my mind a lot in this thread it's because I'm genuinely torn between the desire to make it easier to port existing 2.x code to 3.x by making the current API polymorphic and the fear that doing so will reintroduce some of the exact same bytes/text confusion that the bytes/str split is trying to get rid of.
>
> I don't think polymorphic APIs do anyone any favours in the long run. My experience of the Py2 email API was that it would give the developer false comfort, only to blow up when the app was in the hands of users, and it didn't seem to matter how careful I was. Py3 has gone the pure/strict route in the core, and I think libs should be consistent with that choice. Developers will have to work a little harder, but there will be fewer surprises.

There's an important distinction here though. Either change I could make to urllib.parse will still result in two distinct APIs. The only question is whether the new bytes->bytes APIs need to have a different spelling or not.

Python 2.x is close to impossible to reliably test in this area because there's no programmatic way to tell the difference between encoded bytes and decoded text. In Python 3, while you can still get yourself in trouble by mixing encodings at the bytes level, you're almost never going to mistake bytes for text unless you go out of your way to support working that way.
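A minimal sketch of that distinction, assuming any current Python 3 interpreter (the names `encoded`, `decoded` and `try_mix` are purely illustrative):

```python
# Encoded bytes and decoded text are distinct types in Python 3,
# so a test can check programmatically which one it was handed.
encoded = "café".encode("utf-8")    # bytes
decoded = encoded.decode("utf-8")   # str

is_bytes = isinstance(encoded, bytes) and not isinstance(encoded, str)
is_text = isinstance(decoded, str) and not isinstance(decoded, bytes)


def try_mix(a, b):
    """Return the TypeError raised by a + b, or the result if none is raised."""
    try:
        return a + b
    except TypeError as exc:
        return exc


# Accidentally combining text with bytes fails immediately
# instead of silently producing mojibake later on.
mix_result = try_mix(decoded, encoded)
```

In Python 2.x the equivalent test has nothing to hook into, since both values would typically be plain `str` instances.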

The structure of quote/unquote may cause us problems in the long run, since they already contain implicit decode/encode steps that let them consume both bytes and strings with relative abandon (and have done so since 3.0). However, polymorphic APIs where the type of the input is the same as the type of the output shouldn't be any more dangerous than if those same APIs used a different spelling to operate on bytes.
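To make the contrast concrete, here is a rough sketch that assumes Python 3.2 or later, where the urllib.parse parsing functions accept ASCII-encoded bytes as well as str. The URL and variable names are illustrative only:

```python
from urllib.parse import quote, urljoin, urlsplit

# Type-preserving polymorphism: str in, str out; bytes in, bytes out.
text_parts = urlsplit("http://example.com/a%20b?x=1")
bytes_parts = urlsplit(b"http://example.com/a%20b?x=1")

# quote() is the other kind of API: it accepts either type but
# always returns str, so an implicit conversion happens inside it.
quoted_from_text = quote("a b")
quoted_from_bytes = quote(b"a b")

# Mixing the two types in one call is rejected rather than guessed at.
try:
    urljoin(b"http://example.com/", "a")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc
```

With the type-preserving functions, a caller who starts with bytes never ends up holding text by accident, which is the property that keeps the same-spelling approach no riskier than a parallel API would be.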

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
