[Python-Dev] More data points (original) (raw)

Tim Peters tim.peters at gmail.com
Mon Sep 27 00:33:43 CEST 2004


[Tim]

That's because they never suffered from list's ill-advised documentation effectively blessing mutation while iterating <0.5 wink>.

[Nick]

Ah. Interesting to know. So catching this is recommended when it's feasible?

According to me, but perhaps not according to all. You can work very hard to provide predictable semantics for mutation while iterating, by defining cursor objects that somehow retain sensible guarantees even if the object they point into mutates. In effect, "the current index" is a cursor in this respect when iterating over a list, and the semantics are that "the current index", on each iteration, goes up by one, and is an offset from the start of whatever state the list happens to have at that time. So, e.g., this behavior is guaranteed:

x = range(10) for elt in x: ... x.remove(elt) x [1, 3, 5, 7, 9]

"Guaranteed" doesn't necessarily mean unsurprising, or even useful, though. I do have uses for this behavior, but I'd be happy to give them up.

The "natural" behavior of dicts when mutating while iterating is effectively unexplainable -- it "does whatever it does", based on internal details of the hashed distribution of keys into buckets, and even on the history of insertions (which affects hash collision resolution). I'm glad Python gripes about that now (it didn't always).

It would also be possible, but difficult, to implement "sane" iteration+mutation semantics for dicts. A dict cursor object would need to be aware of which objects had and hadn't already been passed out by the iteration, and would even need to be robust against the dict reorganizing itself completely when it changes size.

It's a lot easier all around to say "if you have to, iterate over a snapshot of the keys". In some cases, we're reduced to saying that with no way to catch violations. ZODB's BTrees are a good example here. People routinely get in trouble by mutating them while iterating over them, but the implementation is such that it would be very difficult to detect such a thing.



More information about the Python-Dev mailing list