Issue 16562: Optimize dict equality test (original) (raw)

Created on 2012-11-26 20:26 by rhettinger, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
dict_equal_hash_reuse.patch	serhiy.storchaka,2012-11-26 21:26	review

Messages (7)
msg176453 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2012-11-26 20:26
The code for dict_equal() in Objects/dictobject.c currently loops over the key/value pairs in self and uses PyDict_GetItem() to check for the corresponding key/value pair in the other dictionary. This causes an unnecessary call to PyObject_Hash(). Instead, the code should loop over the key/value/hash triplets in self and do a direct lookup in the other dictionary with " ep = (otherdict->ma_lookup)(otherdict, key, hash)". The reuses the known hash value for the key; thereby avoiding the potentially slow call to PyObject_Hash(). See _PyDict_Contains() for an example of how to do a lookup when the hash value is already known. Note, the optimized path should be used only when PyDict_CheckExact() is true.
msg176456 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2012-11-26 21:26
Here is a simple patch. > Note, the optimized path should be used only when PyDict_CheckExact() is true. Actually this is not needed. dict_equal() uses the same code for dict subclasses.
msg176485 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2012-11-27 19:26
The patch looks good to me.
msg176488 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2012-11-27 20:10
And here is a synthetic microbenchmark: $ ./python -m timeit -s "n=10*3; k=2; a={(i,)k:i for i in range(n)}; b={(i,)k:i for i in range(n)}" "a == b" Vanilla: 251 usec per loop Patched: 195 usec per loop $ ./python -m timeit -s "n=103; k=2; a={(i,)k:i for i in range(n)}; b=dict(a)" "a == b" Vanilla: 116 usec per loop Patched: 58.6 usec per loop The use of tuple keys is quite common.
msg176506 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2012-11-28 02:11
The patch looks correct. If the tests pass, go ahead and apply it.
msg176801 - (view)	Author: Roundup Robot (python-dev)	Date: 2012-12-02 18:11
New changeset d12785ecca72 by Antoine Pitrou in branch 'default': Issue #16562: Optimize dict equality testing. http://hg.python.org/cpython/rev/d12785ecca72
msg176802 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2012-12-02 18:12
> The patch looks correct. If the tests pass, go ahead and apply it. Done.

History
Date	User	Action	Args
2022-04-11 14:57:38	admin	set	github: 60766
2012-12-02 18:12:09	pitrou	set	status: open -> closednosy: + pitroumessages: + resolution: fixedstage: patch review -> resolved
2012-12-02 18:11:43	python-dev	set	nosy: + python-devmessages: +
2012-11-28 02:11:08	rhettinger	set	messages: +
2012-11-27 20:54:53	vstinner	set	nosy: + vstinner
2012-11-27 20:10:28	serhiy.storchaka	set	messages: +
2012-11-27 19:26:57	loewis	set	nosy: + loewismessages: +
2012-11-26 21:26:17	serhiy.storchaka	set	files: + dict_equal_hash_reuse.patchnosy: + serhiy.storchakamessages: + keywords: + patchstage: patch review
2012-11-26 20:40:41	jcea	set	nosy: + jcea
2012-11-26 20:26:14	rhettinger	create