Issue 30904: Python 3 logging HTTPHandler sends duplicate Host header

The logging HTTPHandler sends two Host headers which confuses certain servers.

Tested versions: Python 3.6.1 lighttpd/1.4.45

Steps to reproduce (MWE):

  1. Set up a lighttpd server which is to act as the logging host (we do not actually implement anything that accepts the input here). Optionally enable the debug settings from https://redmine.lighttpd.net/projects/1/wiki/DebugVariables (specifically, add debug.log-condition-handling = "enable" to /etc/lighttpd/lighttpd.conf) to follow what is happening inside the server.
  2. In python3:

         import logging
         import logging.handlers
         handler = logging.handlers.HTTPHandler('localhost', '/')
         logging.getLogger().setLevel(logging.INFO)
         logging.getLogger().addHandler(handler)
         logging.info('hello world')

  3. Notice that the access logs in /var/log/lighttpd/access.log show a 400 response for the request (in Python, the response is ignored). If the debugging from 1) is enabled, then /var/log/lighttpd/error.log contains a line "duplicate Host-header -> 400".

This is not a bug in lighttpd. The server adheres to RFC 7230 (sec. 5.4, p. 44): "A server MUST respond with a 400 (Bad Request) status code [...] to any request message that contains more than one Host header field".

A workaround is to put the full URL in the second argument of HTTPHandler:

    handler = logging.handlers.HTTPHandler('localhost', 'http://localhost/')

Then lighttpd follows RFC 2616 (sec. 5.2, p. 37): "If Request-URI is an absoluteURI, the host is part of the Request-URI. Any Host header field value in the request MUST be ignored."

The origin of this issue is that the http.client.HTTPConnection.putrequest method (called by HTTPHandler.emit) already adds a Host header unless skip_host=True is given. The Host header that HTTPHandler.emit then adds manually is therefore a duplicate.
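
The putrequest behaviour described above can be checked without a server. The following is a minimal sketch, not the HTTPHandler code itself: CapturingConnection and count_host_headers are hypothetical helper names introduced here for illustration, and the sketch assumes that overriding send() is enough to stop http.client from opening a socket. It issues the same sequence of calls (putrequest, then a manual Host header) and counts the Host lines in the bytes that would have been sent.

```python
import http.client


class CapturingConnection(http.client.HTTPConnection):
    """Hypothetical helper: records outgoing bytes instead of using a socket."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sent = b""

    def send(self, data):
        # Overriding send() means no connection is ever opened.
        self.sent += data


def count_host_headers(skip_host):
    """Count Host header lines when a Host header is added by hand after putrequest."""
    conn = CapturingConnection("localhost")
    # putrequest adds its own Host header unless skip_host is true.
    conn.putrequest("GET", "/", skip_host=skip_host)
    # Manual addition of a second Host header, as described in this report.
    conn.putheader("Host", "localhost")
    conn.endheaders()
    lines = conn.sent.decode("iso-8859-1").split("\r\n")
    return sum(1 for line in lines if line.lower().startswith("host:"))


if __name__ == "__main__":
    print("skip_host=False:", count_host_headers(False))
    print("skip_host=True: ", count_host_headers(True))
```

With skip_host=False the captured request carries both the automatic and the manual Host header. Passing skip_host=True to putrequest, or dropping the manual putheader call, leaves a single one.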

Other versions, such as Python 2.7, might also be affected, but I did not test them.