msg175503
Author: Philipp Hagemeister (phihag)
Date: 2012-11-13 16:19

On my system, {} has become significantly slower in 3.3:

$ python3.2 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0314 usec per loop
$ python3.3 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0892 usec per loop
$ hg id -i
ee7b713fec71+
$ ./python -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0976 usec per loop

Is this because of the dict randomization?
|
|
msg175504
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-13 16:43

I confirm that.

$ ./python -m timeit -n 1000000 '{};{};{};{};{};{};{};{};{};{}'
2.6: 0.62 usec
2.7: 0.57 usec
3.1: 0.55 usec
3.2: 0.48 usec
3.3: 1.5 usec

Randomization does not affect it.
|
|
msg175507
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-13 17:03

Ah, this is an effect of PEP 412. The difference exists only when creating dicts of up to 5 items (inclusive).
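To see where that threshold lies on a given build, small dict literals of increasing size can be timed with the `timeit` module. This is a hypothetical measurement sketch, not from the thread; the particular literals and repeat counts are my own choices:

```python
# Hypothetical sketch for locating the small-dict threshold: time dict
# literals of increasing size. Literal choices and repeat counts are
# illustrative, not taken from the original report.
import timeit

literals = [
    "{}",                                                # 0 items
    "{'a': 1}",                                          # 1 item
    "{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}",          # 5 items
    "{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}",  # 6 items
]

def time_literal(src, number=100_000):
    # Take the best of 5 repeats, mirroring how `python -m timeit`
    # reduces scheduling noise.
    return min(timeit.repeat(src, number=number, repeat=5))

for src in literals:
    print(f"{len(eval(src)):2d} items: {time_literal(src):.4f}s")
```

Running this under two interpreters side by side should show the gap only for the smaller literals, if the PEP 412 explanation holds.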
|
|
msg175511
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-13 17:34

Maybe using a free list for keys will restore performance.
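For readers unfamiliar with the idea: CPython keeps free lists in C (e.g. in Objects/dictobject.c) so that freed objects are recycled instead of being reallocated. The following is a purely conceptual Python sketch of that pattern; the names and the size cap are illustrative assumptions, not CPython's actual implementation:

```python
# Conceptual free-list sketch (illustrative only; CPython's real free
# list is C code). Released dicts are cached on a small stack and the
# next allocation reuses one instead of calling the allocator.
_FREELIST = []
_MAXFREE = 80  # assumed cap; a real implementation picks a small bound

def acquire():
    """Hand out a cached dict if one is available, else a fresh one."""
    return _FREELIST.pop() if _FREELIST else {}

def release(d):
    """Return a dict to the cache unless the cache is already full."""
    if len(_FREELIST) < _MAXFREE:
        d.clear()
        _FREELIST.append(d)

d = acquire()
d["x"] = 1
release(d)
assert acquire() is d  # the same object is recycled, no new allocation
```

Serhiy's suggestion is to apply this same trick to the keys tables introduced by PEP 412, so that small-dict creation regains its pre-3.3 speed.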
|
|
msg175528
Author: Antoine Pitrou (pitrou)
Date: 2012-11-13 22:32

Does this regression impact any real-world program?
|
|
msg175531
Author: Raymond Hettinger (rhettinger)
Date: 2012-11-14 01:23

> Does this regression impact any real-world program?

That is a blow-off response. A huge swath of the language is affected by dictionary performance (keyword args, module lookups, attribute lookup, etc). Most programs will be affected to some degree -- computationally bound programs will notice more and I/O-bound programs won't notice at all.
|
|
msg175546
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 07:20

> > Does this regression impact any real-world program?
>
> That is a blow-off response. A huge swath of the language is affected
> by dictionary performance (keyword args, module lookups, attribute
> lookup, etc).

I was merely suggesting to report actual (non-micro) benchmark numbers. This issue is about dict creation, not dict lookups.
|
|
msg175551
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-14 08:46

As I understand it, a new dict is created on every call of a function with keyword arguments. This slows down every such call by about 0.1 µsec, which is about 10% of the cost of int('42', base=16). In sum, some programs can slow down by a few percent. A direct comparison of 3.2 and 3.3 is meaningless because of the influence of other factors (first of all, PEP 393).
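The per-call allocation Serhiy describes is easy to demonstrate: a function taking **kwargs receives a freshly built dict on each call. A minimal illustration (the function name is hypothetical):

```python
# A function that accepts **kwargs receives a freshly allocated dict on
# every call -- this per-call allocation is the cost being estimated.
def f(**kw):
    return kw

d1 = f(a=1)
d2 = f(a=1)
assert d1 == d2       # equal contents...
assert d1 is not d2   # ...but a distinct dict was built for each call
```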
|
|
msg175561
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 11:13

> As I understand it, a new dict is created on every call of a function
> with keyword arguments. This slows down every such call by about
> 0.1 µsec, which is about 10% of the cost of int('42', base=16).

Ok, but `int('42', base=16)` is about the fastest function call with keyword arguments one can think about :-) In other words, the overhead will probably not be noticeable for most function calls.
|
|
msg175563
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-14 11:20

> Ok, but `int('42', base=16)` is about the fastest function call with
> keyword arguments one can think about :-)

Not as fast as a call without keywords, `int('42', 16)`. :-( But this is a different issue.
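A hedged sketch of that comparison (timings are illustrative and vary by interpreter build; only the final equality is guaranteed):

```python
# Both spellings compute the same value, but the keyword form also
# builds a one-entry dict ({'base': 16}) per call, so it is typically
# somewhat slower. Exact ratios depend on the interpreter build.
import timeit

t_pos = timeit.timeit("int('42', 16)", number=100_000)
t_kw = timeit.timeit("int('42', base=16)", number=100_000)
print(f"positional: {t_pos:.4f}s  keyword: {t_kw:.4f}s")

assert int('42', 16) == int('42', base=16) == 0x42
```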
|
|
msg175567
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-14 13:13

Here is a really simple patch. It speeds up empty dict creation by up to 1.9x (still 1.6x slower than 3.2). Please make your measurements on real-world programs.
|
|
msg175581
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 18:56

Ok, here are some benchmark results with your patch:

### call_method ###
Min: 0.313816 -> 0.305150: 1.03x faster
Avg: 0.316292 -> 0.305678: 1.03x faster
Significant (t=21.07)
Stddev: 0.00188 -> 0.00050: 3.7398x smaller
Timeline: http://tinyurl.com/couzwbe

### call_method_unknown ###
Min: 0.323125 -> 0.312034: 1.04x faster
Avg: 0.325697 -> 0.313115: 1.04x faster
Significant (t=17.62)
Stddev: 0.00258 -> 0.00099: 2.6059x smaller
Timeline: http://tinyurl.com/clwacj3

### call_simple ###
Min: 0.215008 -> 0.220773: 1.03x slower
Avg: 0.215666 -> 0.221657: 1.03x slower
Significant (t=-23.28)
Stddev: 0.00062 -> 0.00078: 1.2721x larger
Timeline: http://tinyurl.com/cv6gxft

### fastpickle ###
Min: 0.554353 -> 0.569768: 1.03x slower
Avg: 0.558192 -> 0.574705: 1.03x slower
Significant (t=-6.90)
Stddev: 0.00363 -> 0.00393: 1.0829x larger
Timeline: http://tinyurl.com/c8s2dqk

### float ###
Min: 0.292259 -> 0.279563: 1.05x faster
Avg: 0.296011 -> 0.286154: 1.03x faster
Significant (t=2.99)
Stddev: 0.00417 -> 0.00609: 1.4585x larger
Timeline: http://tinyurl.com/c34ofe5

### iterative_count ###
Min: 0.115091 -> 0.118527: 1.03x slower
Avg: 0.116174 -> 0.119266: 1.03x slower
Significant (t=-6.82)
Stddev: 0.00068 -> 0.00075: 1.1148x larger
Timeline: http://tinyurl.com/bsmuuso

### json_dump_v2 ###
Min: 2.591838 -> 2.652666: 1.02x slower
Avg: 2.599335 -> 2.687225: 1.03x slower
Significant (t=-8.09)
Stddev: 0.00715 -> 0.02322: 3.2450x larger
Timeline: http://tinyurl.com/clxutkt

### json_load ###
Min: 0.378329 -> 0.370132: 1.02x faster
Avg: 0.379840 -> 0.371222: 1.02x faster
Significant (t=13.68)
Stddev: 0.00112 -> 0.00085: 1.3237x smaller
Timeline: http://tinyurl.com/c7ycte2

### nbody ###
Min: 0.259945 -> 0.251294: 1.03x faster
Avg: 0.261186 -> 0.251843: 1.04x faster
Significant (t=14.95)
Stddev: 0.00125 -> 0.00062: 2.0274x smaller
Timeline: http://tinyurl.com/cg8z6yg

### nqueens ###
Min: 0.295304 -> 0.275767: 1.07x faster
Avg: 0.298382 -> 0.278839: 1.07x faster
Significant (t=10.19)
Stddev: 0.00244 -> 0.00353: 1.4429x larger
Timeline: http://tinyurl.com/bo2dn4x

### pathlib ###
Min: 0.100494 -> 0.102371: 1.02x slower
Avg: 0.101787 -> 0.104876: 1.03x slower
Significant (t=-8.15)
Stddev: 0.00103 -> 0.00159: 1.5432x larger
Timeline: http://tinyurl.com/cb9w79t

### richards ###
Min: 0.166939 -> 0.172828: 1.04x slower
Avg: 0.167945 -> 0.174199: 1.04x slower
Significant (t=-9.65)
Stddev: 0.00092 -> 0.00112: 1.2241x larger
Timeline: http://tinyurl.com/cdros5x

### silent_logging ###
Min: 0.070485 -> 0.066857: 1.05x faster
Avg: 0.070538 -> 0.067041: 1.05x faster
Significant (t=44.64)
Stddev: 0.00003 -> 0.00017: 5.4036x larger
Timeline: http://tinyurl.com/buz5h9j

### 2to3 ###
0.490925 -> 0.478927: 1.03x faster

### chameleon ###
Min: 0.050694 -> 0.049364: 1.03x faster
Avg: 0.051030 -> 0.049759: 1.03x faster
Significant (t=8.45)
Stddev: 0.00041 -> 0.00041: 1.0036x larger
Timeline: http://tinyurl.com/d5czdjt

### mako ###
Min: 0.183295 -> 0.177191: 1.03x faster
Avg: 0.184944 -> 0.179300: 1.03x faster
Significant (t=15.54)
Stddev: 0.00119 -> 0.00137: 1.1555x larger
Timeline: http://tinyurl.com/dx6xrvx

In the end, it's mostly a wash.
|
|
msg175583
Author: Jesús Cea Avión (jcea)
Date: 2012-11-14 19:44

Antoine, I would consider this a performance regression to solve for 3.3.1. Small-dictionary creation is everywhere in CPython.
|
|
msg175586
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 20:13

> Antoine, I would consider this a performance regression to solve for
> 3.3.1. Small-dictionary creation is everywhere in CPython.

Again, feel free to provide real-world benchmark numbers proving the regression. I've already posted some figures.
|
|
msg175615
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-15 12:07

> In the end, it's mostly a wash.

Yes, that is predictable. The gain is too small and indistinguishable against the background of random noise. Maybe a longer-running benchmark would show more reliable results, but in any case the gain is small. My long-running, heavily optimized simulation is about 5% faster on 3.2 or a patched 3.4 compared to 3.3. But this is not a typical program.
|
|
msg183211
Author: Christian Heimes (christian.heimes)
Date: 2013-02-28 11:06

The patch gives a measurable speedup (./python is a patched 3.3.0+). IMO we should apply it. It's small, and I can see no harm either.

$ for PY in python2.7 python3.2 python3.3 ./python; do
>   cmd="$PY -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'"
>   echo $cmd; eval $cmd
> done
python2.7 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.162 usec per loop
python3.2 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.142 usec per loop
python3.3 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.669 usec per loop
./python -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.381 usec per loop

$ for PY in python2.7 python3.2 python3.3 ./python; do
>   cmd="$PY -R -m timeit -n 10000000 'int(\"1\", base=16)'"
>   echo $cmd; eval $cmd
> done
python2.7 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.268 usec per loop
python3.2 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.302 usec per loop
python3.3 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.477 usec per loop
./python -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.356 usec per loop
|
|
msg205362
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-12-06 11:44

So what should we do with this?
|
|
msg205363
Author: Antoine Pitrou (pitrou)
Date: 2013-12-06 11:48

You said it yourself: """The gain is too small and indistinguishable against the background of random noise.""" Closing would be reasonable, unless someone exhibits a reasonable benchmark where there is a significant difference. In any case, it's too late for perf improvements in 3.4.
|
|
msg205397
Author: Raymond Hettinger (rhettinger)
Date: 2013-12-06 19:35

This patch looks correct and like the right thing to do. I recommend applying it to 3.5.
|
|
msg224937
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2014-08-06 15:04

Antoine, are you still opposed to this patch?
|
|
msg224939
Author: Antoine Pitrou (pitrou)
Date: 2014-08-06 15:26

The patch adds complication to the already complicated memory management of dicts. It increases the maintenance burden in critical code. Have we found any case where it makes a tangible difference? (I'm not talking about timeit micro-benchmarks.)
|
|
msg261565
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2016-03-11 13:02

Closed in favor of .
|
|
msg261576
Author: STINNER Victor (vstinner)
Date: 2016-03-11 15:33

Serhiy Storchaka:
> Closed in favor of .

Cool!
|
|