[Python-Dev] Please reconsider PEP 479.
Mark Shannon mark at hotpy.org
Mon Nov 24 01:25:04 CET 2014
On 23/11/14 22:54, Chris Angelico wrote:
> On Mon, Nov 24, 2014 at 7:18 AM, Mark Shannon <mark at hotpy.org> wrote:
>> Hi,
>>
>> I have serious concerns about this PEP, and would ask you to reconsider it.
>
> Hoping I'm not out of line in responding here, as PEP author. Some of your concerns (eg "5 days is too short") are clearly for Guido, not me, but perhaps I can respond to the rest of it.
>
>> [ Very short summary: Generators are not the problem. It is the naive use of next() in an iterator that is the problem. (Note that all the examples involve calls to next()). Change next() rather than fiddling with generators. ]
>>
>> StopIteration is not a normal exception indicating a problem; rather, it exists to signal exhaustion of an iterator. However, next() raises StopIteration for an exhausted iterator, which really is an error. Any iterator code (generator or __next__ method) that calls next() treats the StopIteration as a normal exception and propagates it. The controlling loop then interprets StopIteration as a signal to stop and thus stops. The problem is the implicit shift from signal to error and back to signal.
>
> The situation is this: Both __next__ and next() need the capability to return literally any object at all. (I raised a hypothetical possibility of some sort of sentinel object, but for such a sentinel to be useful, it will need to have a name, which means that *by definition* that object would have to come up when iterating over the .values() of some namespace.) They both also need to be able to indicate a lack of return value. This means that either they return a (success, value) tuple, or they have some other means of signalling exhaustion.

You are grouping next() and it.__next__() together, but they are different. I think we agree that the __next__() method is part of the iterator protocol and should raise StopIteration. There is no fundamental reason why next(), the builtin function, should raise StopIteration, just because __next__(), the method, does. Many xxx() functions that wrap __xxx__() methods add additional functionality.
Consider max() or min(). Both of these functions take an iterable, and if that iterable is empty they raise a ValueError. If next() did likewise then the original example that motivates this PEP would not be a problem.
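A quick illustration of the contrast drawn above, runnable on any Python 3. The helper name `exception_name` is made up for this sketch and is not part of the original message.

```python
def exception_name(func, *args):
    """Call func(*args) and return the name of the exception raised, or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


# max() and min() report an empty iterable as an ordinary error...
print(exception_name(max, []))         # ValueError
print(exception_name(min, []))         # ValueError
# ...while the builtin next() passes the protocol signal straight through.
print(exception_name(next, iter([])))  # StopIteration
```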
> I'm not sure what you mean by your "However" above. In both __next__ and next(), this is a signal; it becomes an error as soon as you call next() and don't cope adequately with the signal, just as KeyError is an error.
>
>> 2. The proposed solution does not address this issue at all, but rather legislates against generators raising StopIteration.
>
> Because that's the place where a StopIteration will cause a silent behavioral change, instead of cheerily bubbling up to top-level and printing a traceback.

I must disagree. It is the FOR_ITER bytecode (implementing a loop or comprehension) that "silently" converts a StopIteration exception into a branch.
I think the generator's __next__() method handling of exceptions is correct; it propagates them, like most other code.
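A minimal sketch of the point about the loop machinery, using a hand-written iterator rather than a generator. The class `Pairs` is hypothetical; it only shows that a loop or comprehension treats any StopIteration escaping `__next__` as exhaustion, wherever it originated.

```python
class Pairs:
    """Hypothetical iterator yielding (a, b) pairs from a flat sequence."""

    def __init__(self, items):
        self._it = iter(items)

    def __iter__(self):
        return self

    def __next__(self):
        first = next(self._it)   # StopIteration here is the legitimate end
        second = next(self._it)  # unguarded: odd-length input leaks StopIteration
        return (first, second)


even = [pair for pair in Pairs([1, 2, 3, 4])]
odd = [pair for pair in Pairs([1, 2, 3])]
print(even)  # [(1, 2), (3, 4)]
print(odd)   # [(1, 2)]  (the trailing 3 is dropped without any error)
```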
>> 3. Generators and the iterator protocol were introduced in Python 2.2, 13 years ago. For all of that time the iterator protocol has been defined by the __iter__(), next()/__next__() methods and the use of StopIteration to terminate iteration.
>>
>> Generators are a way to write iterators without the clunkiness of explicit __iter__() and next()/__next__() methods, but have always obeyed the same protocol as all other iterators. This has allowed code to be rewritten from one form to the other whenever desired. Do not forget that despite the addition of the send() and throw() methods and their secondary role as coroutines, generators have primarily always been a clean and elegant way of writing iterators.
>
> This question has been raised several times; there is a distinct difference between __iter__() and __next__(), and it is only the

I just mentioned __iter__ as it is part of the protocol; I agree that __next__ is the relevant method.

> latter which is aware of StopIteration. Compare these three classes:
>
> class X:
>     def __init__(self): self.state=0
>     def __iter__(self): return self
>     def __next__(self):
>         if self.state == 3: raise StopIteration
>         self.state += 1
>         return self.state
>
> class Y:
>     def __iter__(self):
>         return iter([1,2,3])
>
> class Z:
>     def __iter__(self):
>         yield 1
>         yield 2
>         yield 3
>
> Note how just one of these classes uses StopIteration, and yet all three are iterable, yielding the same results. Neither Y nor Z is breaking iterator protocol - but neither of them is writing an iterator, either.

All three raise StopIteration, even if it is implicit. This is trivial to demonstrate:
def will_it_raise_stop_iteration(it):
    try:
        while True:
            it.__next__()
    except StopIteration:
        print("Raises StopIteration")
    except:
        print("Raises something else")
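Putting the three classes and the check together gives a runnable version of the demonstration. This is a sketch with the dunder names spelled out; `terminating_exception` is a hypothetical variant of the function above that calls iter() itself and returns a string instead of printing.

```python
class X:
    def __init__(self):
        self.state = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == 3:
            raise StopIteration
        self.state += 1
        return self.state


class Y:
    def __iter__(self):
        return iter([1, 2, 3])


class Z:
    def __iter__(self):
        yield 1
        yield 2
        yield 3


def terminating_exception(iterable):
    """Drain iter(iterable) by hand and report what finally stops it."""
    it = iter(iterable)
    try:
        while True:
            it.__next__()
    except StopIteration:
        return "StopIteration"
    except Exception:
        return "something else"


for cls in (X, Y, Z):
    print(cls.__name__, terminating_exception(cls()))
```

In each case the iterator obtained from iter() ends with StopIteration, whether the class raises it explicitly (X) or leaves it to a list iterator (Y) or a generator (Z).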
>> 4. Porting from Python 2 to Python 3 seems to be hard enough already.
>
> Most of the code broken by this change can be fixed by a mechanical replacement of "raise StopIteration" with "return"; the rest need to be checked to see if they're buggy or unclear. There is an edge case with "return somevalue" vs "raise StopIteration(somevalue)" (the former's not compatible with 2.7), but apart from that, the recommended form of code for 3.7 will work in all versions of Python since 2.2.

I think that when it comes to porting 2 to 3, the perception is more important than the technical difficulty. Sadly :(
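The mechanical replacement mentioned in the quoted reply can be sketched as follows. The generator name `take_until_negative` is invented for illustration; the commented-out line shows the older spelling that the change would break.

```python
def take_until_negative(numbers):
    """Yield numbers until the first negative one, then finish."""
    for n in numbers:
        if n < 0:
            # Older spelling: raise StopIteration
            return  # a plain return ends the generator
        yield n


print(list(take_until_negative([1, 2, -1, 3])))  # [1, 2]
```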
>> 5. I think I've already covered this in the other points, but to reiterate (excuse the pun): Calling next() on an exhausted iterator is, I would suggest, a logical error.
>
> How do you know that it's exhausted, other than by calling next() on it?

Either we add a new method, or you have to handle the exception explicitly. But that is what you are trying to force anyway.
I probably should have said "Calling next(), without guarding against the possibility that the iterator is exhausted, is a logical error."
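Two common ways of guarding a call to next(), as a minimal sketch. Both function names are hypothetical; the two-argument form of next() and try/except are standard Python.

```python
def first_or_none(iterable):
    """Guard with the two-argument form of next(): no exception on exhaustion.

    Note: this cannot tell an empty iterable from one whose first item is None.
    """
    return next(iter(iterable), None)


def first_or_fail(iterable):
    """Guard with try/except: turn exhaustion into an ordinary error."""
    try:
        return next(iter(iterable))
    except StopIteration:
        raise LookupError("iterable is empty") from None


print(first_or_none([]))       # None
print(first_or_fail([7, 8]))   # 7
```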
>> It is also worth noting that calling next() is the only place a StopIteration exception is likely to occur outside of the iterator protocol.
>
> This I agree with.
>
>> An example
>> ----------
>>
>> Consider a function to return the value from a set with a single member.
>>
>> def valuefromsingleton(s):
>>     if len(s) < 2: #Intentional error here (should be len(s) == 1)
>>         return next(iter(s))
>>     raise ValueError("Not a singleton")
>>
>> Now suppose we pass an empty set to valuefromsingleton(s), then we get a StopIteration exception, which is a bit weird, but not too bad.
>
> Only a little weird - and no different from the way you'd get a TypeError if you pass it an integer.

Except that TypeError is what it says, an error. StopIteration is a special not-really-an-error thing.

>> However it is when we use it in a generator (or in the __next__ method of an iterator) that we get a serious problem. Currently the iterator appears to be exhausted early, which is wrong. However, with the proposed change we get RuntimeError("generator raised StopIteration") raised, which is also wrong, just in a different way.
>
> What you have here is two distinct issues. The first is "what happens if an unexpected StopIteration occurs during __next__ processing?", and the second is "ditto ditto a generator's execution?". The first one is extremely hard to deal with, and extremely unlikely. The second is much easier to deal with, and can therefore be solved.

I don't think there are two distinct issues. It is only the combination of the two that causes a real problem.
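A runnable sketch of the generator case, assuming Python 3.7 or later, where the behaviour proposed in PEP 479 is the default. On the interpreters current when this message was written, the loop would instead end quietly after the first value. The generator name `unwrap_all` is hypothetical.

```python
def valuefromsingleton(s):
    if len(s) < 2:  # intentional error, as in the message (should be len(s) == 1)
        return next(iter(s))
    raise ValueError("Not a singleton")


def unwrap_all(sets):
    """Hypothetical generator that unwraps each singleton set in turn."""
    for s in sets:
        yield valuefromsingleton(s)


results = []
try:
    for value in unwrap_all([{1}, set(), {3}]):
        results.append(value)
except RuntimeError as exc:
    outcome = str(exc)
else:
    outcome = "exhausted early"

print(results, outcome)
```

Either way the empty set is never reported as "Not a singleton": the caller sees an early end of iteration or a RuntimeError, which is the complaint made above.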
There are two places that StopIteration could be converted into a "real" exception: in the next() function or in the generator.__next__() method. Doing so in next() is, IMO, simpler and easier to understand and explain.
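As a rough sketch of what converting in next() might look like, here is a wrapper that reports exhaustion as ValueError, by analogy with max() and min(). The names `strict_next` and `heads` are hypothetical; this is not an existing builtin or the PEP's mechanism.

```python
def strict_next(iterator):
    """Hypothetical next() variant: exhaustion is reported as ValueError."""
    try:
        return iterator.__next__()
    except StopIteration:
        raise ValueError("iterator is exhausted") from None


def heads(iterables):
    """Generator yielding the first item of each iterable."""
    for iterable in iterables:
        yield strict_next(iter(iterable))


print(list(heads([[1, 2], [3]])))  # [1, 3]

try:
    list(heads([[1, 2], [], [3]]))
except ValueError as exc:
    print("ValueError:", exc)  # the empty list surfaces as a real error
```

With such a function the generator neither ends early nor needs special handling of its own; the error simply propagates like any other exception.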
>> Solutions
>> ---------
>>
>> My preferred "solution" is to do nothing except improving the documentation of next(). Explain that it can raise StopIteration which, if allowed to propagate, can cause premature exhaustion of an iterator.
>
> Docs fixing doesn't solve everything.

True, but docs fixing is always backwards compatible :)

>> If something must be done then I would suggest changing the behaviour of next() for an exhausted iterator. Rather than raise StopIteration it should raise ValueError (or IndexError?).
>
> So, if I've understood you correctly, what you're saying is that __next__ should raise StopIteration, and then next() should absorb that and raise ValueError instead? I'm not sure how this would help anything, but I can see that it would poke the issue with a sharp pointy stick. Can you elaborate on how this would work in practice? How would it help?

It would prevent propagation of StopIteration from causing premature exhaustion of an iterator. That is what the PEP is about, isn't it?

>> Also, it might be worth considering making StopIteration inherit from BaseException, rather than Exception.
>
> Separate concern altogether, as the bases of StopIteration have nothing to do with a protocol meaning collision. I would probably support this change, on the basis that Exception should be for, well, exceptions, and BaseException can be used for everything that uses the exception-handling mechanism for other purposes. But it wouldn't help or affect this proposal.

Agreed.

Cheers, Mark.