[Python-Dev] how to debug httplib slowness

Chris Withers chris at simplistix.co.uk
Fri Sep 4 17:02:39 CEST 2009


Simon Cross wrote:

Well, since the source for _read_chunked includes the comment

    # XXX This accumulates chunks by repeated string concatenation,
    # which is not efficient as the number or size of chunks gets big.

you might gain some speed improvement with minimal effort by gathering the read data chunks into a list and then returning "".join(chunks) at the end.
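For illustration, the list-and-join idea suggested above could look roughly like the sketch below. This is not httplib's actual code: `read_chunked` is a hypothetical stand-alone helper, written for Python 3 bytes, and it assumes a well-formed chunked body arriving on a buffered binary file-like object.

```python
import io


def read_chunked(fp):
    """Decode a chunked HTTP body from a binary file-like object.

    Hypothetical helper for illustration only; it does no error
    handling for truncated or malformed input.
    """
    chunks = []  # collect pieces here instead of doing value += chunk
    while True:
        line = fp.readline()
        # The chunk size is hexadecimal, optionally followed by ";extensions".
        size = int(line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # a zero-sized chunk marks the end of the body
        chunks.append(fp.read(size))
        fp.read(2)  # discard the CRLF that follows the chunk data
    # Skip any trailer headers, up to and including the blank line.
    while True:
        line = fp.readline()
        if line in (b"\r\n", b"\n", b""):
            break
    # One join at the end instead of one concatenation per chunk.
    return b"".join(chunks)


# Made-up sample body: two chunks, then the terminating zero-sized chunk.
sample = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
result = read_chunked(sample)
```

With a body of the size seen here (over 100MB in many chunks), the point is that each piece is appended once and copied once at the end, rather than the growing string being re-copied on every concatenation.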

True, I'll be trying that and reporting back, but, more interestingly, I did some analysis with wireshark (only 200MB-odd of .pcap logs, that was fun ;-) to see the differences in the HTTP conversation and noticed more interestingness...

So, httplib does this:

GET / HTTP/1.1
Host:
Accept-Encoding: identity
Authorization: Basic

HTTP/1.1 200 OK
Date: Fri, 04 Sep 2009 11:44:22 GMT
Server: Apache-Coyote/1.1
ContentLength: 116245504
Content-Type: application/vnd.excel
Transfer-Encoding: chunked

While wget does this:

<snip 401 conversation>
GET / HTTP/1.0
User-Agent: Wget/1.11.4
Accept: */*
Host:
Connection: Keep-Alive
Authorization: Basic

HTTP/1.1 200 OK
Date: Fri, 04 Sep 2009 11:35:19 GMT
Server: Apache-Coyote/1.1
ContentLength: 116245504
Content-Type: application/vnd.excel
Connection: close

Interesting points:

cheers,

Chris

-- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk


