Issue 21970: Broken code for handling file://host in urllib.request.FileHandler.file_open (original) (raw)

Created on 2014-07-13 01:49 by martin.panter, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
file-handler.patch martin.panter,2015-05-27 05:31 review
Messages (6)
msg222901 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-07-13 01:49
This isn’t a particularly important problem for me but when reading the code I noticed some bit rot in this function, where a host name in a “file:” URL would be handled differently than intended. * The url[:2] == '//' check is probably wrong because it is comparing the URL’s path component (selector), not the prefix for a host name. Compare urlopen("file://host//") and urlopen("file://host/") error messages. * The req.host is self.get_names() should probably use “in”, not “is”. The code author presumably expected urlopen("file://127.0.0.1//dev/null") to work. * Also it seems odd that urlopen("file://remote/missing") immediately reports “No such file”, while urlopen("file://remote/") blocks for a host name lookup and then reports “not on local host”.
msg223095 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2014-07-15 08:43
Thanks for the report. Point 2 is definitely a bug (and an overlook by me), I will fix it. I think, the url[:2] == '//' check was present for ftp case which supported file:// protocol. I can't see a clear requirement to change here. The scenarios you encounter when giving a two different paths are interesting because, in one the path stat('/invalid/path') and in the other it is the path '/' that is stat and only late is the hostname verified if it coming from localhost (http://hg.python.org/cpython/file/ddfcaeb1c56b/Lib/urllib/request.py#l1353). Since both presence of file locally and then ensuring the host is localhost needs to be done, the current order gives faster failures for most common use-cases.
msg223634 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-07-22 07:16
New changeset 4b98961748f1 by Senthil Kumaran in branch '3.4': Fix localhost checking in FileHandler. Raised in #21970. http://hg.python.org/cpython/rev/4b98961748f1 New changeset 2c660948bb41 by Senthil Kumaran in branch 'default': Merge 3.4 http://hg.python.org/cpython/rev/2c660948bb41
msg223636 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2014-07-22 07:19
I have addressed the mistake where req.host is self.get_names() was done instead of req.host in self.get_names() in the first commit as it was an obvious problem. I will come up with patch/solution addressing the other behavior mentioned in this report.
msg244144 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-27 05:31
I do not believe the change fixes anything on its own. It essentially just changed the error message to something even worse, and the added test case already passes without the change. I am posting a patch which cleans up the code and related tests. It also fixes three minor theoretical bugs: >>> urlopen("file://127.0.0.1//dev/null") # Should succeed urllib.error.URLError: >>> urlopen("file://127.1//dev/null") # 127.1 is localhost; should succeed urllib.error.URLError: <urlopen error file:// scheme is supported only on localhost> >>> urlopen("file://remote/missing") # Should not try to access local file urllib.error.URLError: <urlopen error [Errno 2] No such file or directory: '/missing'> The localhost check in file_open() was added by revision 982c8ec73ae3 for Issue 10063. But notice how all the test cases involve strange URLs like file://ftp.example.com//foo.txt, with second extra double slash. I removed the check, because it is redundant with the more thorough and correct check that has existed since 1994 (revision 020d8c2d9d3c) in open_local_file(). For cases like file://remote/nonlocal-file, I think it is better to raise the correct error, even if that requires a DNS check, rather than raising a faster error with a misleading message. So I moved the check before the stat() call inside open_local_file().
msg354100 - (view) Author: Maarten ter Huurne (mthuurne) * Date: 2019-10-07 14:21
Another problem with the current code is that when passed a URL with a host name that is not empty or "localhost", but is one of the alternative names for localhost returned by get_names(), file_open() returns None implicitly instead of opening the file. I think this issue would be fixed by the proposed patch. So it would be good if the patch could be reviewed and adopted.
History
Date User Action Args
2022-04-11 14:58:05 admin set github: 66169
2019-10-07 14:21:55 mthuurne set nosy: + mthuurnemessages: + versions: + Python 3.7
2015-05-27 05:31:43 martin.panter set files: + file-handler.patchversions: + Python 3.5, Python 3.6messages: + keywords: + patchtype: behaviorstage: patch review
2014-07-22 07:19:37 orsenthil set messages: +
2014-07-22 07:16:28 python-dev set nosy: + python-devmessages: +
2014-07-18 16:14:49 r.david.murray set nosy: + r.david.murray
2014-07-15 08:43:46 orsenthil set nosy: + orsenthilmessages: +
2014-07-13 01:49:21 martin.panter create