Issue 19332: Guard against changing dict during iteration

Created on 2013-10-21 14:02 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name                        Uploaded                            Description
dict_mutating_iteration.patch    serhiy.storchaka, 2013-10-21 14:02  review
dict_mutating_iteration_2.patch  serhiy.storchaka, 2013-10-23 19:46  review
Messages (9)
msg200784 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 14:02
Currently dict iteration is guarded against changes in the dict's size. However, when a dict is changed during iteration in a way that leaves its size unchanged, the modification goes unnoticed:

>>> d = dict.fromkeys('abcd')
>>> for i in d:
...     print(i)
...     d[i + 'x'] = None
...     del d[i]
... 
d
a
dx
dxx
ax
c
b

In general, iterating over a mutating dict is considered a logic error, and it is good to detect it as early as possible. The proposed patch introduces a counter that is changed every time a key is added or removed. If an iterator detects that this counter has changed, it raises RuntimeError.
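The patch itself works at the C level inside the dict implementation, but the idea can be sketched in pure Python. This is an illustrative model only; the `GuardedDict` name and the guard-in-`__iter__` placement are inventions of this sketch, not part of the actual patch:

```python
class GuardedDict(dict):
    """Toy model of the proposed guard: count every key insertion and
    removal, and make iteration fail as soon as the count changes."""

    def __init__(self, *args, **kwargs):
        self._version = 0              # bumped on every key add or remove
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key not in self:            # replacing a value is still allowed
            self._version += 1
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._version += 1

    def __iter__(self):
        start = self._version
        for key in super().__iter__():
            yield key
            # Re-check when the consumer asks for the next key: if the key
            # set changed while we were suspended, fail fast.
            if self._version != start:
                raise RuntimeError("dict changed during iteration")
```

Note that this toy version only guards plain `for key in d` iteration; the real patch checks the counter in every dict iterator (keys, values, items), which a Python-level sketch cannot easily reach.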
msg200995 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-23 04:20
The decision to not monitor adding or removing keys was intentional. It is just not worth the cost in either time or space.
msg201062 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-23 19:46
In the first patch the counter was placed in the _dictkeysobject structure. In the second patch it is placed in PyDictObject, so it now has no memory cost. Access time to the new counter for non-modifying operations is the same as in the current code. The only additional cost is the time cost for modifying operations. But modifying operations are usually much rarer than non-modifying ones, and incrementing one field takes only a small part of the time needed for the whole operation. I don't think this will affect the total performance of real programs.
msg201065 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-23 20:10
If there's no performance regression, then this sounds like a reasonable idea. The remaining question would be whether it can break existing code. Perhaps you should ask python-dev?
msg201156 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-24 16:34
I disagree with adding such unimportant code to the critical path.
msg201780 - Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2013-10-30 21:47
Raymond, please don't be so concise. Is the code unimportant because the scenario is so rare, or something else?
msg202262 - Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2013-11-06 12:32
Duplicate of this: http://bugs.python.org/issue6017
msg202287 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-11-06 20:56
A few thoughts:

* No existing, working code will benefit from this patch; however, almost all code will pay a price for it: a bigger size for an empty dict and a runtime cost (possibly very small) on the critical path (every time a value is stored in a dict).

* The sole benefit of the patch is to provide an earlier warning that someone is doing something weird. For most people, this will never come up (we have 23 years of Python history indicating that there isn't a real problem that needs to be solved).

* The normal rule (not just for Python) is that data structures have undefined behavior for mutating while iterating, unless there is a specific guarantee (for example, we guarantee that dicts are allowed to mutate values but not keys during iteration, and we guarantee the behavior of list iteration while iterating).

* It is not clear that other implementations such as IronPython and Jython would be able to implement this behavior (Jython wraps the Java ConcurrentHashMap).

* The current patch second-guesses a decision that was made long ago to only detect size changes (because it is cheap, doesn't take extra memory, isn't on the critical path, and handles the common case).

* The only case where we truly need stronger protection is when it is needed to defend against a segfault. That is why collections.deque() implements a change counter. It has a measurable cost that slows down deque operations (increasing the number of memory accesses per append, pop, or next) but it is needed to prevent the iterator from spilling into freed memory.
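The deque change counter mentioned above is observable from Python code: any mutation of the deque invalidates live iterators at their next step.

```python
from collections import deque

d = deque([1, 2, 3])
it = iter(d)
print(next(it))      # first element is yielded normally

d.append(4)          # any mutation bumps the deque's internal state counter

try:
    next(it)         # the stale iterator refuses to continue
except RuntimeError as exc:
    print("RuntimeError:", exc)
```

This is the stronger, segfault-driven protection Raymond describes, as opposed to dict's cheaper size-only check.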
msg257946 - Author: Roundup Robot (python-dev) (Python triager) Date: 2016-01-10 23:43
New changeset a576199a5350 by Victor Stinner in branch 'default': PEP 509 https://hg.python.org/peps/rev/a576199a5350
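A historical footnote, not part of the original thread: besides PEP 509's private version tag, later CPython releases (3.8 and newer, via a separate change) did add a check along the lines Serhiy proposed. Replacing keys during iteration now raises RuntimeError even when the dict's size never changes, so the example from msg200784 fails fast:

```python
# Behavior in modern CPython (3.8+): mutating the key set during
# iteration is detected even though len(d) stays constant throughout.
d = dict.fromkeys('abcd')
try:
    for i in d:
        d[i + 'x'] = None   # add one key...
        del d[i]            # ...and remove one, so the size is unchanged
except RuntimeError as exc:
    print(exc)              # e.g. "dictionary keys changed during iteration"
```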
History
Date User Action Args
2022-04-11 14:57:52 admin set github: 63531
2017-02-02 14:38:17 r.david.murray link issue29420 superseder
2016-01-10 23:43:55 python-dev set nosy: + python-dev; messages: +
2013-11-06 20:56:30 rhettinger set messages: +
2013-11-06 12:32:40 steven.daprano set nosy: - steven.daprano
2013-11-06 12:32:11 steven.daprano set nosy: + steven.daprano; messages: +
2013-10-30 21:47:23 ethan.furman set nosy: + ethan.furman; messages: +
2013-10-28 06:15:54 rhettinger set status: open -> closed; resolution: rejected
2013-10-24 16:34:30 rhettinger set messages: +
2013-10-23 20:10:35 pitrou set messages: +
2013-10-23 19:58:35 pitrou set nosy: + tim.peters
2013-10-23 19:46:02 serhiy.storchaka set files: + dict_mutating_iteration_2.patch; messages: +
2013-10-23 04:20:22 rhettinger set assignee: rhettinger; messages: +
2013-10-21 14:02:54 serhiy.storchaka create