Issue 26049: Poor performance when reading large xmlrpc data (original) (raw)

Created on 2016-01-08 13:36 by pokoli, last changed 2022-04-11 14:58 by admin.

Files
File name	Uploaded	Description	Edit
python27.patch	pokoli,2016-01-08 14:32	review
default.patch	pokoli,2016-01-08 16:05	review
default.patch	ced,2016-01-09 10:26	review
client.py	ced,2016-02-28 09:45
server.py	ced,2016-02-28 09:46

Messages (16)
msg257756 - (view)	Author: Sergi Almacellas Abellana (pokoli) *	Date: 2016-01-08 13:36
By default, python xmlrpclib parser reads data by chunks of 1024 bytes [1], which leads to a lot of data concatenations when reading large data, which is very slow in python. Increasing the chuck size from 1024 bytes to a higher value makes improve in performance. On the same machine, we have tested with 20MB of data and we've got the following results: Chucks of 1024: 1min 48.933491sec Chucks of 10 * 1024 * 1024: 0.245282sec We have chosen 10 * 1024 * 1024, as it is the same value used in We have done our tests on python2.7, but same code exists for default branch [2] (and 3.x branches also [3][4][5][6]), so I belive all the versions are affected. I can work on a patch if you think this change is acceptable. IMHO it's logical to allow the developer to customize the chunk size instead of using a hard coded one. [1] https://hg.python.org/cpython/file/2.7/Lib/xmlrpclib.py#l1479 [2] https://hg.python.org/cpython/file/default/Lib/xmlrpc/client.py#l1310 [3] https://hg.python.org/cpython/file/3.5/Lib/xmlrpc/client.py#l1310 [4] https://hg.python.org/cpython/file/3.4/Lib/xmlrpc/client.py#l1324 [5] https://hg.python.org/cpython/file/3.3/Lib/xmlrpc/client.py#l1316 [6] https://hg.python.org/cpython/file/3.2/Lib/xmlrpc/client.py#l1301
msg257757 - (view)	Author: Cédric Krier (ced) *	Date: 2016-01-08 13:52
I don't think it is necessary to allow to customize the chunk size. Indeed Python should provide a good value by default that works for all platforms.
msg257759 - (view)	Author: Sergi Almacellas Abellana (pokoli) *	Date: 2016-01-08 14:32
I attach a patch to fix on default series
msg257760 - (view)	Author: Sergi Almacellas Abellana (pokoli) *	Date: 2016-01-08 14:32
And Also another patch for 2.7 branches
msg257764 - (view)	Author: Sergi Almacellas Abellana (pokoli) *	Date: 2016-01-08 16:05
I added a new patch which fixed comments
msg257774 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2016-01-08 19:14
I believe performance enhancements are, by default, limited to 'default', but this one might be a good candidate for backport.
msg257811 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-01-09 09:38
Reading with 10 MB limit allocates 10 MB buffer. It may be better to start with 1024 bytes limit, and increase it by 2 times on every iteration.
msg257815 - (view)	Author: Cédric Krier (ced) *	Date: 2016-01-09 10:00
Will it not be better indeed to just stream.read() without any argument? Because HTTPResponse will call _safe_read with just the length of the header.
msg257816 - (view)	Author: Cédric Krier (ced) *	Date: 2016-01-09 10:26
Answering to myself, it is better to read by chunk to feed the parser also by chunk. So here is a version of the patch which increases by 2 on every loop.
msg260444 - (view)	Author: Cédric Krier (ced) *	Date: 2016-02-18 09:33
ping
msg260767 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-02-24 08:24
Could you please provide a microbenchmark that exposes the benefit of this optimization?
msg260768 - (view)	Author: Cédric Krier (ced) *	Date: 2016-02-24 08:39
Is there an infrastructure already in place for such microbenchmark?
msg260791 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-02-24 11:13
There is no special infrastructure. You can just provide two files. One script should implement simple server that produces large output. Other script should implement the client that makes a number of requests and measure the time. Of course you can implement this in one file using multiprocessing if you prefer.
msg260970 - (view)	Author: Cédric Krier (ced) *	Date: 2016-02-28 09:46
Here is the client/server scripts. I don't measure a big performance improvement with it. I think the improvement measured in are linked to the way xmlrpclib is overriden in Tryton.
msg260976 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-02-28 13:36
Thanks Cédric. There is no improvement for the first request. But if repeat requests, there is small (about 5%) improvement. This is too small.
msg260978 - (view)	Author: Cédric Krier (ced) *	Date: 2016-02-28 13:52
One advantage, I see, is when xmlrpclib is overridden to use an other marshaller which is can not be feed chunk by chunk. So reducing the number of call to feed will have a bigger impact. But I don't know if this is enough for Python.

History
Date	User	Action	Args
2022-04-11 14:58:25	admin	set	github: 70237
2016-05-18 14:01:38	Oscar Andres Alvarez Montero	set	nosy: + Oscar Andres Alvarez Montero
2016-02-28 13:52:38	ced	set	messages: +
2016-02-28 13:36:43	serhiy.storchaka	set	messages: +
2016-02-28 09:46:30	ced	set	files: + server.pymessages: +
2016-02-28 09:45:02	ced	set	files: + client.py
2016-02-24 11:13:19	serhiy.storchaka	set	messages: +
2016-02-24 08:39:21	ced	set	messages: +
2016-02-24 08:24:07	serhiy.storchaka	set	messages: +
2016-02-18 09:33:44	ced	set	messages: +
2016-01-22 09:33:07	resteve	set	nosy: + resteve
2016-01-16 18:46:38	serhiy.storchaka	set	stage: patch review
2016-01-09 10:26:45	ced	set	files: + default.patchmessages: +
2016-01-09 10:00:35	ced	set	messages: +
2016-01-09 09:38:04	serhiy.storchaka	set	nosy: + serhiy.storchakamessages: +
2016-01-08 19:14:55	terry.reedy	set	nosy: + loewis, terry.reedymessages: + versions: + Python 3.6
2016-01-08 16:05:49	pokoli	set	files: - default.patch
2016-01-08 16:05:39	pokoli	set	files: + default.patchmessages: +
2016-01-08 14:32:54	pokoli	set	files: + python27.patchmessages: +
2016-01-08 14:32:35	pokoli	set	files: + default.patchkeywords: + patchmessages: +
2016-01-08 13:52:28	ced	set	type: performancemessages: + nosy: + ced
2016-01-08 13:36:38	pokoli	create