[Python-Dev] Finally switch urllib.parse to RFC3986 semantics?

Guido van Rossum guido at python.org
Fri Mar 18 04:38:42 CET 2011


On Thu, Mar 17, 2011 at 8:19 PM, Senthil Kumaran <orsenthil at gmail.com> wrote:

> Nick Coghlan wrote:
>> The problem is that it is quite a lot of work to get fully general URI
>> parsing to work correctly, but the overlap with legacy URL parsing is
>> large enough that many (most?) use cases in practice work just fine
>> with the older RFC semantics.
>
> Yes. We can have an API which strictly conforms to the latest RFC by
> definition, but the problem is that there is code out there which
> 'expects' the parsing behavior to remain unchanged so that it does not
> break. And keeping the parsing behavior unchanged means conforming to
> the older RFC parsing rules. The solution seems to be an extra function
> or a flag in the urlparse method which exhibits the newer behavior.
>
> Guido wrote:
>> So would having two different API functions, one legacy and one
>> conforming, be a problem? Ideally the conforming API's name would not
>> be something lame like urllib2 but something timeless. :-)
>
> :-) Should blame Jeremy for that name! But urllib2 has long been
> replaced by urllib.parse, urllib.request and urllib.response.
> Considering how you remember urllib2, I think its name has stood the
> test of time.

It stood out like a sore thumb. :-)

> But seriously, I think an additional function or an additional flag in
> the current functions/methods in the parse module is sufficient, rather
> than going for another module.

I vote for a new function, not a flag. (Others can explain my rule of thumb against flag arguments whose values are nearly always constants.)
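To make the two shapes under discussion concrete, here is a rough sketch, not a proposal for the actual API. The names `split_with_flag` and `split_rfc3986` are hypothetical; the existing `urllib.parse.urlsplit` stands in for the legacy behavior, and the conforming split simply applies the regular expression from RFC 3986, Appendix B. As a simplification, missing components are collapsed to empty strings, which the RFC itself distinguishes.

```python
import re
from urllib.parse import urlsplit  # existing stdlib function, used here as the "legacy" API

# Regular expression given in RFC 3986, Appendix B.
_RFC3986 = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_rfc3986(url):
    """Function style (hypothetical name): the choice is visible in the name."""
    m = _RFC3986.match(url)
    # Groups 2, 4, 5, 7, 9 are scheme, authority, path, query, fragment.
    # Simplification: an undefined component becomes "" here.
    return tuple(m.group(i) or "" for i in (2, 4, 5, 7, 9))


def split_with_flag(url, rfc3986=False):
    """Flag style (hypothetical name): callers nearly always pass a literal."""
    if rfc3986:
        return split_rfc3986(url)
    return tuple(urlsplit(url))
```

With the flag, a typical call site reads `split_with_flag(url, rfc3986=True)`, where the argument is a constant known when the code is written; with the separate function the same decision is spelled `split_rfc3986(url)`.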

-- 
--Guido van Rossum (python.org/~guido)


