msg222901 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2014-07-13 01:49 |
This isn’t a particularly important problem for me but when reading the code I noticed some bit rot in this function, where a host name in a “file:” URL would be handled differently than intended. * The url[:2] == '//' check is probably wrong because it is comparing the URL’s path component (selector), not the prefix for a host name. Compare urlopen("file://host//") and urlopen("file://host/") error messages. * The req.host is self.get_names() should probably use “in”, not “is”. The code author presumably expected urlopen("file://127.0.0.1//dev/null") to work. * Also it seems odd that urlopen("file://remote/missing") immediately reports “No such file”, while urlopen("file://remote/") blocks for a host name lookup and then reports “not on local host”. |
|
|
msg223095 - (view) |
Author: Senthil Kumaran (orsenthil) *  |
Date: 2014-07-15 08:43 |
Thanks for the report. Point 2 is definitely a bug (and an overlook by me), I will fix it. I think, the url[:2] == '//' check was present for ftp case which supported file:// protocol. I can't see a clear requirement to change here. The scenarios you encounter when giving a two different paths are interesting because, in one the path stat('/invalid/path') and in the other it is the path '/' that is stat and only late is the hostname verified if it coming from localhost (http://hg.python.org/cpython/file/ddfcaeb1c56b/Lib/urllib/request.py#l1353). Since both presence of file locally and then ensuring the host is localhost needs to be done, the current order gives faster failures for most common use-cases. |
|
|
msg223634 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-07-22 07:16 |
New changeset 4b98961748f1 by Senthil Kumaran in branch '3.4': Fix localhost checking in FileHandler. Raised in #21970. http://hg.python.org/cpython/rev/4b98961748f1 New changeset 2c660948bb41 by Senthil Kumaran in branch 'default': Merge 3.4 http://hg.python.org/cpython/rev/2c660948bb41 |
|
|
msg223636 - (view) |
Author: Senthil Kumaran (orsenthil) *  |
Date: 2014-07-22 07:19 |
I have addressed the mistake where req.host is self.get_names() was done instead of req.host in self.get_names() in the first commit as it was an obvious problem. I will come up with patch/solution addressing the other behavior mentioned in this report. |
|
|
msg244144 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2015-05-27 05:31 |
I do not believe the change fixes anything on its own. It essentially just changed the error message to something even worse, and the added test case already passes without the change. I am posting a patch which cleans up the code and related tests. It also fixes three minor theoretical bugs: >>> urlopen("file://127.0.0.1//dev/null") # Should succeed urllib.error.URLError: >>> urlopen("file://127.1//dev/null") # 127.1 is localhost; should succeed urllib.error.URLError: <urlopen error file:// scheme is supported only on localhost> >>> urlopen("file://remote/missing") # Should not try to access local file urllib.error.URLError: <urlopen error [Errno 2] No such file or directory: '/missing'> The localhost check in file_open() was added by revision 982c8ec73ae3 for Issue 10063. But notice how all the test cases involve strange URLs like file://ftp.example.com//foo.txt, with second extra double slash. I removed the check, because it is redundant with the more thorough and correct check that has existed since 1994 (revision 020d8c2d9d3c) in open_local_file(). For cases like file://remote/nonlocal-file, I think it is better to raise the correct error, even if that requires a DNS check, rather than raising a faster error with a misleading message. So I moved the check before the stat() call inside open_local_file(). |
|
|
msg354100 - (view) |
Author: Maarten ter Huurne (mthuurne) * |
Date: 2019-10-07 14:21 |
Another problem with the current code is that when passed a URL with a host name that is not empty or "localhost", but is one of the alternative names for localhost returned by get_names(), file_open() returns None implicitly instead of opening the file. I think this issue would be fixed by the proposed patch. So it would be good if the patch could be reviewed and adopted. |
|
|