Issue 8801: Inconsistency in behaviour of urllib and urllib2 with file:// URLs

I encountered what seems like an incompatibility between urllib and urllib2 in the way they handle file:// URLs. Here's a console session to illustrate:

vinay eta-karmic:/tmp$ echo Hello, world! >hello.txt
vinay eta-karmic:/tmp$ cat hello.txt
Hello, world!
vinay eta-karmic:/tmp$ python
Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import urllib,urllib2
>>> s = 'file:////tmp/hello.txt'
>>> f1 = urllib.urlopen(s)
>>> f1.read()
'Hello, world!\n'
>>> f2 = urllib2.urlopen(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 389, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 407, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 367, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1240, in file_open
    return self.parent.open(req)
  File "/usr/lib/python2.6/urllib2.py", line 389, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 407, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 367, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1287, in ftp_open
    raise URLError('ftp error: no host given')
urllib2.URLError: <urlopen error ftp error: no host given>

The problem appears to be that urllib allows a badly-formed file URL (with file://// rather than file:///) when it shouldn't (errors should not pass silently).
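
To make the difference between the two URL forms concrete, here is a minimal sketch of how they split into components. It uses Python 3's urllib.parse rather than the Python 2.6 modules from the session above. The helper looks_like_remote_file_url is hypothetical: it is only my reading of why the traceback goes from file_open into ftp_open, not code copied from urllib2.

```python
from urllib.parse import urlsplit


def looks_like_remote_file_url(selector):
    """Hypothetical helper, not part of urllib or urllib2.

    Returns True when the part of the URL after the (empty) authority
    still starts with exactly two slashes. This is an assumed version of
    the kind of check that could send a file:// URL down the FTP path.
    """
    return selector[:2] == '//' and selector[2:3] != '/'


for url in ('file:///tmp/hello.txt', 'file:////tmp/hello.txt'):
    parts = urlsplit(url)
    # With three slashes the path has one leading slash; with four
    # slashes the path keeps two, so it resembles "//host/...".
    print(url, repr(parts.netloc), parts.path,
          looks_like_remote_file_url(parts.path))
```

Both URLs have an empty host. With the four-slash form the remaining path begins with a double slash, which would be consistent with the 'ftp error: no host given' message in the traceback.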

urllib and urllib2 differed in how they handled certain kinds of file:// URLs, which led to this error. I have made them consistent with the fix in r82780 and merged it into the branches. Now this exception is no longer raised at the file-open stage; an addinfourl object is returned instead. Whether an invalid path can be opened is left to the OS, which will either allow it or reject it.