=====================================================
BUG:
run:
    urlparse.urlparse('http://google.com]')
then:
    raise ValueError("Invalid IPv6 URL")
=====================================================
SOURCE:
    if url[:2] == '//':
        netloc, url = _splitnetloc(url, 2)
        if (('[' in netloc and ']' not in netloc) or
                (']' in netloc and '[' not in netloc)):
            raise ValueError("Invalid IPv6 URL")
=====================================================
SOLUTION:
I THINK IT IS BETTER TO JUST REMOVE THE LAST 3 LINES ABOVE
Why? I think this is the right behaviour. According to the RFC[1], square brackets are used, and only used, to delimit an IPv6 address in a URI. Square brackets are reserved characters, so the URI you give is not valid.

[1] http://tools.ietf.org/html/rfc3986#section-3
I wish you would think twice if you haven't used urlparse.py in a practical project.

1. Do you want the module to raise an exception?
2. Is the href in a web page always in standard format?
3. Should the parsing module verify the IPv6 URL format? If so, does the module really do it?
4. Given a wrongly formatted URL, is it the responsibility of the module to correct it?
As a general-purpose library for URL parsing, I think conforming to the existing standard is a good choice. 'http://google.com]' is a malformed URI according to the standard, so I think raising an exception is quite suitable. Of course there are always malformed links in web pages, but how to correct them is quite subjective. I think catching the exception in your application and correcting the URL with your own logic is what you should do.
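The approach suggested above, catching the exception in the application, could look roughly like the following. This is only a sketch: it uses the Python 3 location of the function (`urllib.parse.urlparse` rather than the `urlparse` module from the report), and `parse_href` is a hypothetical helper name, not part of the standard library. What to do with a rejected URL (skip it, log it, try to repair it) is left to the application.

```python
from urllib.parse import urlparse


def parse_href(href):
    """Hypothetical helper: parse href, returning None if urlparse rejects it."""
    try:
        return urlparse(href)
    except ValueError:
        # Raised for unbalanced brackets such as 'http://google.com]'.
        # The application decides how to handle the bad link; here we skip it.
        return None


for href in ['http://[::1]:80/', 'http://google.com]']:
    result = parse_href(href)
    if result is None:
        print('skipping malformed URL:', href)
    else:
        print('hostname:', result.hostname)
```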
This behaviour exists exactly because the return value also contains the `.hostname`, which for IPv6 addresses is *without* brackets:

>>> urlparse('http://[::1]:80/').hostname
'::1'

There is no way to get a proper parsing result from such a broken URI.
> 4. Given a wrongly formatted URL, is it the responsibility of the module to correct it?

It's not the responsibility of the library to correct (or make a guess on) user input.
History
Date                 User           Action  Args
2022-04-11 14:58:31  admin          set     github: 71276
2016-05-23 09:30:40  berker.peksag  set     status: open -> closed; nosy: + berker.peksag; messages: +; resolution: not a bug; stage: resolved