[Python-Dev] Bug? http.client assumes iso-8859-1 encoding of HTTP headers (original) (raw)

martin at v.loewis.de martin at v.loewis.de
Sat Jan 4 21:43:57 CET 2014


Quoting Chris Angelico <rosuav at gmail.com>:

I'm pretty sure that server is in violation of the spec, so all bets are off as to what any other server will do. If you know you're dealing with this one server, you can probably hack around this, but I don't think it belongs in core code. Unless, of course, I'm completely wrong about the spec, or if there's a de facto spec that lots of servers follow, in which case maybe it would be worth doing.

It would be possible to support this better by using "ascii" with "surrogateescape" when receiving the redirect, and using the same for all URLs coming into http.client. This would implement a best-effort strategy at preserving the bogus URL, and still maintain the notion that URLs are text (with the other path being to also allow bytes as URLs, and always parsing Location as bytes).

Regards, Martin



More information about the Python-Dev mailing list