msg257756 - (view) |
Author: Sergi Almacellas Abellana (pokoli) * |
Date: 2016-01-08 13:36 |
By default, python xmlrpclib parser reads data by chunks of 1024 bytes [1], which leads to a lot of data concatenations when reading large data, which is very slow in python. Increasing the chuck size from 1024 bytes to a higher value makes improve in performance. On the same machine, we have tested with 20MB of data and we've got the following results: Chucks of 1024: 1min 48.933491sec Chucks of 10 * 1024 * 1024: 0.245282sec We have chosen 10 * 1024 * 1024, as it is the same value used in We have done our tests on python2.7, but same code exists for default branch [2] (and 3.x branches also [3][4][5][6]), so I belive all the versions are affected. I can work on a patch if you think this change is acceptable. IMHO it's logical to allow the developer to customize the chunk size instead of using a hard coded one. [1] https://hg.python.org/cpython/file/2.7/Lib/xmlrpclib.py#l1479 [2] https://hg.python.org/cpython/file/default/Lib/xmlrpc/client.py#l1310 [3] https://hg.python.org/cpython/file/3.5/Lib/xmlrpc/client.py#l1310 [4] https://hg.python.org/cpython/file/3.4/Lib/xmlrpc/client.py#l1324 [5] https://hg.python.org/cpython/file/3.3/Lib/xmlrpc/client.py#l1316 [6] https://hg.python.org/cpython/file/3.2/Lib/xmlrpc/client.py#l1301 |
|
|
msg257757 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-01-08 13:52 |
I don't think it is necessary to allow to customize the chunk size. Indeed Python should provide a good value by default that works for all platforms. |
|
|
msg257759 - (view) |
Author: Sergi Almacellas Abellana (pokoli) * |
Date: 2016-01-08 14:32 |
I attach a patch to fix on default series |
|
|
msg257760 - (view) |
Author: Sergi Almacellas Abellana (pokoli) * |
Date: 2016-01-08 14:32 |
And Also another patch for 2.7 branches |
|
|
msg257764 - (view) |
Author: Sergi Almacellas Abellana (pokoli) * |
Date: 2016-01-08 16:05 |
I added a new patch which fixed comments |
|
|
msg257774 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2016-01-08 19:14 |
I believe performance enhancements are, by default, limited to 'default', but this one might be a good candidate for backport. |
|
|
msg257811 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-09 09:38 |
Reading with 10 MB limit allocates 10 MB buffer. It may be better to start with 1024 bytes limit, and increase it by 2 times on every iteration. |
|
|
msg257815 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-01-09 10:00 |
Will it not be better indeed to just stream.read() without any argument? Because HTTPResponse will call _safe_read with just the length of the header. |
|
|
msg257816 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-01-09 10:26 |
Answering to myself, it is better to read by chunk to feed the parser also by chunk. So here is a version of the patch which increases by 2 on every loop. |
|
|
msg260444 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-02-18 09:33 |
ping |
|
|
msg260767 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-02-24 08:24 |
Could you please provide a microbenchmark that exposes the benefit of this optimization? |
|
|
msg260768 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-02-24 08:39 |
Is there an infrastructure already in place for such microbenchmark? |
|
|
msg260791 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-02-24 11:13 |
There is no special infrastructure. You can just provide two files. One script should implement simple server that produces large output. Other script should implement the client that makes a number of requests and measure the time. Of course you can implement this in one file using multiprocessing if you prefer. |
|
|
msg260970 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-02-28 09:46 |
Here is the client/server scripts. I don't measure a big performance improvement with it. I think the improvement measured in are linked to the way xmlrpclib is overriden in Tryton. |
|
|
msg260976 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-02-28 13:36 |
Thanks Cédric. There is no improvement for the first request. But if repeat requests, there is small (about 5%) improvement. This is too small. |
|
|
msg260978 - (view) |
Author: Cédric Krier (ced) * |
Date: 2016-02-28 13:52 |
One advantage, I see, is when xmlrpclib is overridden to use an other marshaller which is can not be feed chunk by chunk. So reducing the number of call to feed will have a bigger impact. But I don't know if this is enough for Python. |
|
|