Issue 16562: Optimize dict equality test (original) (raw)

Created on 2012-11-26 20:26 by rhettinger, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
dict_equal_hash_reuse.patch serhiy.storchaka,2012-11-26 21:26 review
Messages (7)
msg176453 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-11-26 20:26
The code for dict_equal() in Objects/dictobject.c currently loops over the key/value pairs in self and uses PyDict_GetItem() to check for the corresponding key/value pair in the other dictionary. This causes an unnecessary call to PyObject_Hash(). Instead, the code should loop over the key/value/hash triplets in self and do a direct lookup in the other dictionary with " ep = (otherdict->ma_lookup)(otherdict, key, hash)". The reuses the known hash value for the key; thereby avoiding the potentially slow call to PyObject_Hash(). See _PyDict_Contains() for an example of how to do a lookup when the hash value is already known. Note, the optimized path should be used only when PyDict_CheckExact() is true.
msg176456 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-26 21:26
Here is a simple patch. > Note, the optimized path should be used only when PyDict_CheckExact() is true. Actually this is not needed. dict_equal() uses the same code for dict subclasses.
msg176485 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-11-27 19:26
The patch looks good to me.
msg176488 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-27 20:10
And here is a synthetic microbenchmark: $ ./python -m timeit -s "n=10**3; k=2; a={(i,)*k:i for i in range(n)}; b={(i,)*k:i for i in range(n)}" "a == b" Vanilla: 251 usec per loop Patched: 195 usec per loop $ ./python -m timeit -s "n=10**3; k=2; a={(i,)*k:i for i in range(n)}; b=dict(a)" "a == b" Vanilla: 116 usec per loop Patched: 58.6 usec per loop The use of tuple keys is quite common.
msg176506 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-11-28 02:11
The patch looks correct. If the tests pass, go ahead and apply it.
msg176801 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-12-02 18:11
New changeset d12785ecca72 by Antoine Pitrou in branch 'default': Issue #16562: Optimize dict equality testing. http://hg.python.org/cpython/rev/d12785ecca72
msg176802 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-02 18:12
> The patch looks correct. If the tests pass, go ahead and apply it. Done.
History
Date User Action Args
2022-04-11 14:57:38 admin set github: 60766
2012-12-02 18:12:09 pitrou set status: open -> closednosy: + pitroumessages: + resolution: fixedstage: patch review -> resolved
2012-12-02 18:11:43 python-dev set nosy: + python-devmessages: +
2012-11-28 02:11:08 rhettinger set messages: +
2012-11-27 20:54:53 vstinner set nosy: + vstinner
2012-11-27 20:10:28 serhiy.storchaka set messages: +
2012-11-27 19:26:57 loewis set nosy: + loewismessages: +
2012-11-26 21:26:17 serhiy.storchaka set files: + dict_equal_hash_reuse.patchnosy: + serhiy.storchakamessages: + keywords: + patchstage: patch review
2012-11-26 20:40:41 jcea set nosy: + jcea
2012-11-26 20:26:14 rhettinger create