Issue 34698: urllib.request.Request.set_proxy doesn't (necessarily) replace type
Not sure if this is a documentation or behavior bug, but... the docs for urllib.request.Request.set_proxy (https://docs.python.org/3/library/urllib.request.html#urllib.request.Request.set_proxy) say:
"Prepare the request by connecting to a proxy server. The host and type will replace those of the instance, and the instance’s selector will be the original URL given in the constructor."
(Emphasis mine.) In practice, behavior is more nuanced than that:
>>> from urllib.request import Request
>>> req = Request('http://hostame:port/some/path')
>>> req.host, req.type
('hostame:port', 'http')
>>> req.set_proxy('proxy:other-port', 'https')
>>> req.host, req.type  # So far, so good...
('proxy:other-port', 'https')
>>> req = Request('https://hostame:port/some/path')
>>> req.host, req.type
('hostame:port', 'https')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.host, req.type  # Type doesn't change!
('proxy:other-port', 'https')
Looking at the source (https://github.com/python/cpython/blob/v3.7.0/Lib/urllib/request.py#L397) it's obvious why https is treated specially.
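For reference, the special case can be observed directly on Python 3. This is a minimal sketch rather than part of the original report: the host names are placeholders, and `_tunnel_host` is a private attribute of `Request` (the one the linked source checks), so it is an implementation detail that may change between versions.

```python
from urllib.request import Request

# Placeholder hosts; constructing a Request does not open a connection.
req = Request('https://example.com:443/some/path')
req.set_proxy('proxy.example:3128', 'http')

# https request: the original host is remembered as the tunnel target,
# the proxy becomes the host, and the type is left untouched.
https_state = (req.host, req.type, req._tunnel_host)

req = Request('http://example.com:80/some/path')
req.set_proxy('proxy.example:3128', 'https')

# Non-https request: host and type are both replaced, and the selector
# becomes the full original URL.
http_state = (req.host, req.type, req.selector)

print(https_state)
print(http_state)
```

In the https case the request is set up to be tunnelled through the proxy, which appears to be why the scheme is kept rather than replaced.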
The behavior is consistent with how things worked on py2...
>>> from urllib2 import Request
>>> req = Request('http://hostame:port/some/path')
>>> req.get_host(), req.get_type()
('hostame:port', 'http')
>>> req.set_proxy('proxy:other-port', 'https')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'https')
>>> req = Request('https://hostame:port/some/path')
>>> req.get_host(), req.get_type()
('hostame:port', 'https')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'https')
... but only if you're actually inspecting host/type along the way!
>>> from urllib2 import Request
>>> req = Request('https://hostame:port/some/path')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'http')
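One plausible explanation, which the report itself does not state, is that urllib2 parsed the scheme lazily: until get_type() is called the stored type is unset, so the https check in set_proxy does not fire. The class below is a hypothetical Python 3 model of that assumption (`LazyTypeRequest` and its URL parsing are invented for illustration and are not the urllib2 implementation); it reproduces the two py2 outcomes shown above.

```python
class LazyTypeRequest:
    """Hypothetical stand-in for illustration; not urllib2.Request."""

    def __init__(self, url):
        self._url = url
        self.type = None          # assumption: scheme is parsed only on demand
        self.host = None
        self._tunnel_host = None

    def get_type(self):
        if self.type is None:
            self.type = self._url.split(':', 1)[0]
        return self.type

    def get_host(self):
        if self.host is None:
            rest = self._url.split('://', 1)[1]
            self.host = rest.split('/', 1)[0]
        return self.host

    def set_proxy(self, host, type):
        # Reads the attribute directly, so an unparsed (None) type
        # never matches 'https' and falls through to the else branch.
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
        self.host = host


# Inspecting first resolves the type, so the https branch is taken.
inspected = LazyTypeRequest('https://hostame:port/some/path')
inspected.get_host(), inspected.get_type()
inspected.set_proxy('proxy:other-port', 'http')
inspected_state = (inspected.get_host(), inspected.get_type())

# Without inspection the type is still None when set_proxy runs.
untouched = LazyTypeRequest('https://hostame:port/some/path')
untouched.set_proxy('proxy:other-port', 'http')
untouched_state = (untouched.get_host(), untouched.get_type())

print(inspected_state)
print(untouched_state)
```

Under this model, Python 3 behaves like the "inspected" case every time, since the sessions earlier in the report show `type` already populated right after construction.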
(FWIW, this came up while porting an application from py2 to py3; there was a unit test expecting that last behavior of proxying an https connection through an http proxy.)