[Python-Dev] httplib and bad response chunking

Greg Ward gward-1337f07a94b43060ff5c1ea922ed93d6 at python.net
Wed Jul 26 04:32:13 CEST 2006


So I accidentally discovered the other day that httplib does not handle a particular type of mangled HTTP response very well. In particular, it tends to blow up with an undocumented ValueError when the server screws up "chunked" encoding. I'm not the first to discover this, either: see http://www.python.org/sf/1486335 .

HTTP 1.1 response chunking allows clients to know how many bytes of response to expect for dynamic content, i.e. when it's not possible to include a "Content-length" header. A chunked response might look like this:

0005\r\nabcd\n\r\n0004\r\nabc\n\r\n0\r\n\r\n

which means:
  0x0005 bytes in first chunk, which is "abcd\n"
  0x0004 bytes in second chunk, which is "abc\n"

Each chunk size is terminated with "\r\n"; each chunk is terminated with "\r\n"; end of response is indicated by a chunk of 0 bytes, hence the "0\r\n\r\n" at the end.

Details in RFC 2616: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1
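
To make the framing concrete, here is a minimal sketch of a chunked-body decoder. It is not httplib's actual code: decode_chunked is a hypothetical helper, and the sketch assumes Python 3, with the body available as a binary file-like object.

```python
import io

def decode_chunked(stream):
    """Decode a chunked body from a binary file-like object (hypothetical helper)."""
    pieces = []
    while True:
        size_line = stream.readline()               # e.g. b"0005\r\n"
        # Drop any chunk extension after ";" and parse the hex size.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:                               # zero-size chunk: end of body
            break
        pieces.append(stream.read(size))            # the chunk data itself
        stream.read(2)                              # discard the CRLF after the chunk
    # Skip optional trailer lines up to and including the final blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return b"".join(pieces)

raw = b"0005\r\nabcd\n\r\n0004\r\nabc\n\r\n0\r\n\r\n"
decoded = decode_chunked(io.BytesIO(raw))
```

With the well-formed response above, decoded is the two chunks joined together. The sketch does no error handling at all, which is exactly the weakness discussed next.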

Anyways, what I discovered in the wild the other day was a response like this:

0005\r\nabcd\n\r\n0004\r\nabc\n\r\n\r\n

i.e. the chunk-size for the terminating empty chunk was missing. This caused httplib.py to blow up with ValueError because it tried to call

int(line, 16)

assuming that 'line' contained a hex number, when in fact it was the empty string. Oops.
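
The failure is easy to reproduce outside httplib. The following is only a sketch, assuming Python 3 and a simplified read loop rather than httplib's real one: it consumes the two good chunks of the malformed response, then tries to parse whatever sits where the final chunk-size should be.

```python
import io

# The malformed body: the "0" chunk-size of the terminating chunk is missing.
raw = b"0005\r\nabcd\n\r\n0004\r\nabc\n\r\n\r\n"
stream = io.BytesIO(raw)

for _ in range(2):                            # consume the two well-formed chunks
    size = int(stream.readline().strip(), 16)
    stream.read(size + 2)                     # chunk data plus its trailing CRLF

line = stream.readline().strip()              # where the last chunk-size should be
try:
    int(line, 16)                             # nothing to parse here
    failed = False
except ValueError:
    failed = True
```

Here line ends up empty and failed is True: int() has nothing to parse, so the ValueError escapes unless the caller catches it.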

IMHO the minimal fix is to turn ValueError into HTTPException (or a subclass thereof); httplib should not raise ValueError just because some server sends a bad response. (The server in question was Apache 2.0.52 running PHP 4.3.9 sending a big hairy error page because the database was down.)
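
A rough sketch of that minimal fix, assuming Python 3 (where httplib lives on as http.client). BadChunkSize and read_chunk_size are hypothetical names made up for illustration; they are not part of the library or of my patch.

```python
import io
from http.client import HTTPException   # httplib.HTTPException in Python 2

class BadChunkSize(HTTPException):
    """Hypothetical exception for an unparsable chunk-size line."""

def read_chunk_size(stream):
    """Read one chunk-size line and return the size as an int (hypothetical helper)."""
    line = stream.readline()
    field = line.split(b";", 1)[0].strip()   # ignore any chunk extension
    try:
        return int(field, 16)
    except ValueError:
        # Report a protocol-level error instead of leaking ValueError.
        raise BadChunkSize("invalid chunk size: %r" % (line,))
```

Callers that already catch HTTPException would then handle the mangled response like any other protocol error, e.g. read_chunk_size(io.BytesIO(b"\r\n")) raises BadChunkSize rather than ValueError.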

Where I'm getting hung up is how far to test this stuff. I have discovered other hypothetical cases of bad chunking that cause httplib to go into an infinite loop or block forever on socket.readline(). Should we worry about those cases as well, despite not having seen them happen in the wild? More annoying, I can reproduce the "block forever" case using a real socket, but not using the StringIO-based FakeSocket class in test_httplib.

Anyways, I've cobbled together a crude hack to test_httplib.py that exposes the problem:

http://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=186245&aid=1486335

Feedback welcome. (Fixing the inadvertent ValueError is trivial, so I'm concentrating on getting the tests right first.)

Oh yeah, my patch is relative to the 2.4 branch.

    Greg

-- 
Greg Ward <gward at python.net>                         http://www.gerg.ca/
I don't believe there really IS a GAS SHORTAGE.. I think it's all just
a BIG HOAX on the part of the plastic sign salesmen -- to sell more numbers!!


