msg79406 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-08 10:43 |
This is related to issue 4336. While that issue finally revolved around fixing the Nagle problem for xmlrpc and other http clients, here I propose to fix another performance bottleneck with xmlrpc, reading the HTTP Headers. HTTPResponse creates a socket.fileobject() with zero buffering which means that the readline() operations used to read the headers become very inefficient since individual socket.recv() calls are used for each character. The reason for this is to support users who subsequently do socket.recv() on the underlying socket, and so we don't want to read too much of the headers. Note that there is no example of this in the standard library. In the provided patch, I have removed some vestigial code from xmrlpclib.py which attempted to use recv() even though it never did (because the connection has been closed when HTTPConnection returns the response). Even so, Guido has expressed his concern that we need to keep the support for this recv() behaviour in place. Therefore, I have added a new optional argument, nobuffer=True, which users that promise not to call recv() can set to false. This will then cause the generated fileobject from HTTPResponse to have default buffering, greatly increasing the performance of the reading of the headers. |
|
|
msg79552 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-01-10 16:27 |
Why not name the parameter buffering=False rather than nobuffer=True? Sounds more "natural" to me. |
|
|
msg79601 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-11 16:19 |
Yes, if you think so. But my intention was to indicate that the "nobuffering" is a special feature the user can turn off to resort to what could be considered normal, buffered, behaviour. But either way is fine, and I'll be happy to change it. |
|
|
msg79602 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-11 16:24 |
Checked in as revision: 68532 |
|
|
msg80064 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2009-01-18 00:18 |
Kristján, could you merge this change into the Py3k branch, please? |
|
|
msg80254 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-20 14:17 |
Adding crossref to http://bugs.python.org/issue4448 regarding the porting of this feature to 3.1 |
|
|
msg80946 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-02-02 16:04 |
Checked in r69209 to py3k |
|
|
msg81020 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2009-02-03 00:34 |
This causes failures in test_urllib2net. The fix is easy: a handful of - self.assertTrue(u.fp._sock.gettimeout() is None) + self.assertTrue(u.fp.raw._sock.gettimeout() is None) But doesn't this show a backward-incompatible change? or is the _sock member an implementation detail anyway? |
|
|
msg81035 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-02-03 09:08 |
The socket.socket.makefile() now returns a quite different kind of object, namely a SocketIO thing. This comes as a result of the IO refactoring in 3.0. The good side to this is that old and naughty apps have been forbidden to access the _sock member directly. This was the reason that we had to disable read buffering in 2.x: Some clients would read directly from the socket, even though it was "hidden". As for the test failure, I admit that I didn't enable the "network" resource for the testsuite. Will fix. Btw, test_xmlrpc_net.py fails, because time.xmlrpc.com doesn't resolve. Are there any other good and stable rpc servers out there? |
|
|
msg81036 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-02-03 09:14 |
I'm afraid I misunderstood the problem. Yes, since u.fp is now a io.BufferedReader() rather than a socket.Socket (), the _sock member has moved. But looking at the other assertions in test_urllib2net.py, you can see the different ways the code uses to get at the socket: There is u.fp.raw._sock and there is u.fp.fp.raw._sock. IMHO I think we have to assume the _sock member to be an implementation detail that is not part of the interface. |
|
|
msg91597 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2009-08-15 06:46 |
I believe there will be a problem with the patch committed in r68532. If getresponse(buffering=True) is called, extra data on the socket may be consumed by the socket.makefile() buffer which will cause problems if the connection is not closed immediately after this response. I mentioned this in comments on http://bugs.python.org/issue2576 which happens to have a way of fixing this already proposed as they are trying to solve an almost identical issue to this without yet having seen this issue. |
|
|
msg91621 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2009-08-15 22:39 |
trunk r74463 now forces the HTTPResponse to close afterwards when buffering=True to avoid the issue. |
|
|
msg91683 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2009-08-18 12:42 |
And that update causes failures in test_xmlrpc. |
|
|
msg91684 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2009-08-18 12:47 |
I think I'll open a new ticket instead. |
|
|
msg91914 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-08-24 11:34 |
Gregory, please revert your change (74463). There is no "extra data" after the response, since such data can only be generated as a result of a new "request". Your change has disabled the HTTP/1.1 keepalive capability, causing test failurres in xmlrpc. Also showing perhaps that we need test cases for keepalive in regular httplib.py testsuite. If you think there are problems, you a new defect should be created. |
|
|
msg91939 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2009-08-24 20:58 |
Already reverted in r74522 | gregory.p.smith |
2009-08-18 22:33:48 -0700 (Tue, 18 Aug 2009) for that reason. |
|
msg235254 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2015-02-02 13:03 |
Opened Issue 23377 for the problem of dropping extra buffered data at the end of a response. |
|
|