Issue 1470846: urllib2 ProxyBasicAuthHandler broken
urllib2.ProxyBasicAuthHandler has been broken since revision 38092 back in December 2004 (unlike the alternative, using a userinfo URL component in the string passed to ProxyHandler, e.g. "joe:password@example.com", which works fine ATM).
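For reference, the two ways of supplying proxy credentials mentioned above look roughly like this. This is only a sketch: the proxy host `proxy.example.com:3128` and the credentials are placeholders, and the import fallback to `urllib.request` (the Python 3 successor of urllib2) is there only so the snippet runs on current interpreters.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 successor, same class names

# Workaround: credentials in the userinfo part of the proxy URL.
# Host, port, and credentials are placeholders.
proxy_with_userinfo = urllib2.ProxyHandler(
    {"http": "http://joe:password@proxy.example.com:3128/"})
opener_userinfo = urllib2.build_opener(proxy_with_userinfo)

# Handler-based setup (the route this issue is about): the password is
# registered against the proxy's authority, not the target URL.
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "proxy.example.com:3128", "joe", "password")
opener_handler = urllib2.build_opener(
    urllib2.ProxyHandler({"http": "http://proxy.example.com:3128/"}),
    urllib2.ProxyBasicAuthHandler(password_mgr))
```

Neither opener touches the network until `open()` is called on it.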
There are two problems: First, with a proxy, you're always authenticating yourself for the whole proxy, not just for a specific path. Second, you're authenticating yourself to the proxy, not to the HTTP server. The ProxyBasicAuthHandler subclass dutifully passes in the right thing for the host argument, but AbstractBasicAuthHandler ignores it, which means that it never finds the password -- e.g. if you're trying to connect to http://python.org/dev through myproxy.com, it'll be looking for a username/password for http://python.org/dev instead of the needed myproxy.com.
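The failed lookup can be illustrated with the password manager alone. The sketch below uses the hosts from the example above; the realm name and credentials are made up, and it shows the present-day `HTTPPasswordMgr` rather than the exact code in the broken revision.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 successor

mgr = urllib2.HTTPPasswordMgr()
# The user registers credentials for the proxy (realm name is made up).
mgr.add_password("proxy realm", "myproxy.com", "joe", "password")

# What the broken handler effectively asks for: the target URL.
wrong_lookup = mgr.find_user_password("proxy realm", "http://python.org/dev")

# What it should ask for: the proxy's authority.
right_lookup = mgr.find_user_password("proxy realm", "myproxy.com")
```

The first lookup comes back empty because the target host does not match the registered proxy host; the second one finds the credentials.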
Since fixing this entails the host argument to http_error_auth_reqed no longer being ignored, HTTPBasicAuthHandler must now pass the full URL, which means AbstractBasicAuthHandler must accept either an authority or a URL. ProxyBasicAuthHandler could also supply a full URL like "http://proxy.example.com/", but the 'host' argument prior to December 2004 was not ignored, and accepted a hostname (!), so we should keep that working rather than insisting on a full URL. I also documented this behaviour.
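Accepting either form could be done along these lines. `split_authority_and_path` is a hypothetical helper written for illustration, not the code in the patch: anything with both a scheme and a network location is treated as a URL, and everything else is treated as a bare authority.

```python
try:
    from urlparse import urlsplit  # Python 2
except ImportError:
    from urllib.parse import urlsplit  # Python 3


def split_authority_and_path(host_or_url):
    """Return (authority, path) for a full URL or a bare authority.

    Hypothetical helper: a bare "host" or "host:port" maps to the root path.
    """
    parts = urlsplit(host_or_url)
    if parts.scheme and parts.netloc:
        # Full URL, e.g. "http://python.org/dev"
        return parts.netloc, parts.path or "/"
    # Bare authority, e.g. "proxy.example.com:3128"
    return host_or_url, "/"
```

With this, `"http://proxy.example.com/"` and `"proxy.example.com"` both reduce to the same authority and root path, so old callers passing a hostname keep working.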
The patch fixes the bug, adds several new tests, corrects the documentation for http_error_auth_reqed (which was documented under the wrong method name), and fixes a typo in the examples.
Note that one of the tests in the attached patch relies on a URL at python.org that does not exist yet and that would have to require basic authorization (plain basic auth in this test, not proxy basic auth). So I guess the test_urllib2net.py hunk of the patch has to be commented out or something until somebody adds the necessary few lines of Apache config.
Would also be nice to add a functional test for Proxy auth itself. I'm sure python.org doesn't want to be the world's proxy, but something could be configured that does the basic auth dance then responds with 403. What would be most suitable: something like SimpleHTTPServer sitting behind Apache, some mod_python magic...??
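One option that needs no external configuration is a throwaway local server that plays the proxy's part. The following is a rough sketch, not part of the attached patch: `FakeProxy` and `fetch_status_through_fake_proxy` are made-up names, the realm and credentials are placeholders, and the import fallbacks exist only so it also runs on Python 3.

```python
import base64
import threading

try:  # Python 2
    import urllib2
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # Python 3
    import urllib.request as urllib2
    from http.server import BaseHTTPRequestHandler, HTTPServer

EXPECTED = "Basic " + base64.b64encode(b"joe:password").decode("ascii")


class FakeProxy(BaseHTTPRequestHandler):
    """Does the basic auth dance, then answers 403 for everything."""

    def do_GET(self):
        if self.headers.get("Proxy-Authorization") == EXPECTED:
            self.send_response(403)
        else:
            self.send_response(407)
            self.send_header("Proxy-Authenticate", 'Basic realm="test"')
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep test output quiet


def fetch_status_through_fake_proxy():
    server = HTTPServer(("127.0.0.1", 0), FakeProxy)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        proxy = "127.0.0.1:%d" % server.server_address[1]
        mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
        mgr.add_password(None, proxy, "joe", "password")
        opener = urllib2.build_opener(
            urllib2.ProxyHandler({"http": "http://" + proxy}),
            urllib2.ProxyBasicAuthHandler(mgr))
        try:
            # The target is never contacted; the request goes to the fake proxy.
            opener.open("http://python.org/dev", timeout=5)
        except urllib2.HTTPError as err:
            return err.code
        return None
    finally:
        server.shutdown()
        server.server_close()
```

If proxy auth works, the call ends in the 403 that the fake proxy sends once it has seen the right Proxy-Authorization header; if the password is never found, the caller sees the 407 instead.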