[Python-Dev] OrderedDict.values() behavior for modified instance (original) (raw)
Nikolaus Rath Nikolaus at rath.org
Sat Oct 26 05:22:53 CEST 2013
- Previous message: [Python-Dev] Merge codeaccess.txt from wesnoth-contribcommunity project in Google Code Hosting
- Next message: [Python-Dev] OrderedDict.values() behavior for modified instance
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hello,
The documentation says the following about modifying a dict while iterating through its view:
| Iterating views while adding or deleting entries in the dictionary may | raise a RuntimeError or fail to iterate over all entries. (http://docs.python.org/3/library/stdtypes.html#dict-views)
The OrderedDict documentation doesn't have anything to say on the subject. In practice, its implementation actually mostly correponds to naive expectations:
from collections import OrderedDict d = OrderedDict() for i in range(5): ... d[i] = i ... i = iter(d.values()) next(i) 0 del d[0] next(i) 1 del d[2] next(i) 3 d.movetoend(1) next(i) 4 next(i) 1
etc.
However, there is one case that results in a rather confusing error:
a = OrderedDict() a[0] = 0 a[1] = 1 a[2] = 2 i = iter(a.values()) next(i) 0 del a[0] del a[1] next(i) Traceback (most recent call last): File "", line 1, in File "/usr/lib/python3.3/collections/abc.py", line 495, in iter yield self._mapping[key] KeyError: 1
What happens here is that a[0] is returned from the linked list, but it still contains links to a[1]. When a[1] is deleted, the links of its predecessor and successor are updated, but a[0] still points to a[1]. The OrderedList then hands a non-existing key to the values() implementation in collections.abc.
The result is not only mightily confusing, but it is also not covered by the documentation (KeyError isn't a RuntimeError).
I would very much like to see this fixed, but I'm not sure if there's a good way to do that.
I see the following options:
(a) When deleting an element from an OrderedList, update the pointers in the corresponding linked list element to the end of the list. Then removing the currently "active" element during the iteration would automatically end the iteration.
For that, we'd just have to add two lines to __delitem__:
def __delitem__(self, key, dict_delitem=dict.__delitem__):
dict_delitem(self, key)
link = self.__map.pop(key)
link_prev = link.prev
link_next = link.next
link_prev.next = link_next
link_next.prev = link_prev
link.prev = root # new
link.next = root # new
The new behavior is explicitly covered by the documentation
(changing the dict may result in incomplete iteration), but it's a
change to what happened before.
(b) When iterating through an OrderedDict, check that the next element is actually still in the hash, i.e. change
def __iter__(self):
root = self.__root
curr = root.next
while curr is not root:
yield curr.key
curr = curr.next
to
def __iter__(self):
root = self.__root
curr = root.next
while curr is not root and curr.key in self:
yield curr.key
curr = curr.next
that would terminate the iteration only in the special case, but
incur an extra dict lookup at every iteration.
Alternatively, one could try very hard to not stop the iteration:
while curr is not root:
yield curr.key
while curr is not root:
curr = curr.next
if curr.key in self:
break
(c) Catch the KeyError in values(), and re-raise the proper exception in class ValuesView:
def __iter__(self):
for key in self._mapping:
try:
yield self._mapping[key]
except KeyError:
raise RuntimeError("underlying Mapping instance modified during iteration")
I consider this a bit ugly, because it may mask other problems in a
custom Mapping implementation (that may raise a KeyError because of
a bug in the Mapping implementation itself).
(d) Extend the OrderedDict documentation to explicity document this case.
This has the drawback that it would probably be nicer to just have
the behavior be consistent and correspond to intuitive expectations.
Would any of those fixes be acceptable? Or is there an even better option?
Best, Nikolaus
-- Encrypted emails preferred. PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
»Time flies like an arrow, fruit flies like a Banana.«
- Previous message: [Python-Dev] Merge codeaccess.txt from wesnoth-contribcommunity project in Google Code Hosting
- Next message: [Python-Dev] OrderedDict.values() behavior for modified instance
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]