Issue 24018: add a Generator ABC (original) (raw)

Created on 2015-04-20 19:36 by scoder, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (36)

msg241675 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-20 19:36

Currently, CPython tends to assume that generators are always implemented by a Python function that uses the "yield" keyword. However, it is absolutely possible to implement generators as a protocol by adding send(), throw() and close() methods to an iterator.

The problem is that these will usually be rejected by code that needs to distinguish generators from other input, e.g. in asyncio, as this code will commonly do a type check against types.GeneratorType. Instead, it should check for the expected protocol. The best way to achieve this is to extend the existing ABCs with a Generator ABC that external generator implementations can register on.

Related to issue 24004 (asyncio). Asyncio could provide a backport copy of this for older Python versions.

I assume this is considered a new feature that cannot be added to 3.4 anymore?

msg241677 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-04-20 19:40

It's missing tests. :-)

Otherwise looks quite sensible.

Also, shouldn't you override subclasshook so you don't inherit it from Iterator?

msg241681 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-20 19:58

Ok, sure. Here's a new patch that adds tests and docs.

msg241682 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-20 20:01

Sorry, here's another doc fix.

msg241683 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2015-04-20 20:11

Is it a problem that the check can't be done in a fast way from C code?

Other than that, sounds good to me.

msg241692 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2015-04-21 02:21

For the other ABCs, if you define the required abstract methods, you get working versions of all the mixin methods.

In the case of the Generator ABC, throw() and close() are useless empty stub methods. In the other ABCs, we leave optional methods out entirely. A user should expect that if isinstance(g, Generator) is true that all of the ABC methods will work.

Also, the return StopIteration in throw() doesn't seem correct. Shouldn't it raise the exception that was thrown?

>>> def f():
        yield x

        
>>> g = f()
>>> g.throw(KeyError)

Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    g.throw(KeyError)
  File "<pyshell#9>", line 1, in f
    def f():
KeyError

Ideally, there should be a practical example of where this ABC would be useful with anything other than a real generator.

msg241700 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 04:57

Here's a new patch that addresses the review comments. I kept throw() and close() non-abstract and added an example to the tests instead that implements a minimal generator by inheriting these methods from the Generator base class, using it as a mixin. It only needs to implement send().

I don't think it's unreasonable to assume that there are use cases where a generator that is implemented in a class instead of a yield-function does not require special cleanup actions in its close()/throw(). Or is it required that a generator stops working when close() is called?

There are also iterators that raise StopIteration at some point to signal that they are temporarily exhausted, but then eventually restart returning values from their next() method when they have received more to return. Avoids recreating the iterator object after each exhaustion cycle.

I extended the default implementation of close() to call throw(GeneratorExit), though, because that's how the protocol is defined. Unless users implement throw(), it won't make a difference for them. And if they do, they may even get away with not reimplementing close().

msg241701 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 05:09

Is it a problem that the check can't be done in a fast way from C code?

C code can still quickly special case the generator type in the positive case, and it will usually be interested in an exact type match anyway. Determining that an object is not (compatible with) a generator might lead to some degradation, but in most cases, at least in the interpreter core, this would be an error case anyway, so being slower here should rarely hurt.

msg241702 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 05:11

Next question: should inspect.isgenerator() be changed? Or should its usages be replaced by isinstance(obj, Generator) ?

msg241703 - (view)

Author: Martin Panter (martin.panter) * (Python committer)

Date: 2015-04-21 05:26

I agree that there is no big reason why we should force generators to stop working after close(). Your new default implementation of close() is probably the right thing too. I added a few new comments on Reitveld.

msg241704 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 06:10

Good catch with the RuntimeError. Patch updated.

msg241705 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 06:35

should inspect.isgenerator() be changed?

Replying to myself, one thing that speaks against this approach is that "inspect" also has the functions "getgeneratorlocals()" and "getgeneratorstate()", which depend on "gen.gi_frame". Cython could emulate that, but normal user code usually can't. It's definitely not part of the Generator protocol but an implementation detail of GeneratorType.

msg241716 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-04-21 15:19

Is it an option to leave inspect alone? Its definition and use of generators is very focused on the builtin implementation.

Although this means that code that wants to do the right thing but also be backwards compatible has to use something like

def isgen(g):
    if hasattr(collections.abc, 'Generator'):
        return isinstance(c, collections.abc.Generator)
    else:
        return inspect.isgenerator(g)

msg241717 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 15:44

Yes, code usually doesn't fall from the sky with a new CPython release. If something wants to make use of this ABC, it still has to find ways to also work with older CPythons. What would be a good fallback? A backport on PyPI?

I wouldn't even mind shipping this class with Cython and dropping it right into "collections.abc" if it's missing. Cython already contains lots of compatibility code, and we need to patch something in all current CPythons either way. Then your code snippet would be the right thing to do for user code (with the twist that Py2.x doesn't have "collections.abc"...).

msg241723 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-04-21 16:10

Realistically only Cython will care much about this, so I'm okay if Cython just monkeypatches the collections package.

msg241725 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2015-04-21 16:49

In some distant future, Numba might also care too :) (right now, we only support compiling basic generators, i.e. no send() and no throw())

msg241736 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-21 19:39

Yes, and there are certainly other compilers and tools. However, it's going to be a short list, so if Py3.5 takes the lead, they can all just agree on the one way to do it and let the first patcher win.

msg241846 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-23 06:32

Adding a patch for the inspect docs that refers to the Right Way To Do It.

msg241956 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-24 16:54

Searching for Python code that seems to implement the Generator protocol doesn't return much:

https://code.openhub.net/search?s=%22def%20throw%28%22%20%22def%20send%28%22%20%22def%20close%28%22&pp=0&fl=Python&mp=1&ml=1&me=1&md=1&ff=1&filterChecked=true

But at least it pointed me to this:

https://hg.python.org/cpython/file/5c0247a6f98a/Lib/multiprocessing/managers.py#l922

So, wrapping generators (apparently for remote calls in this case) seems to be another use case.

msg241993 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-25 06:03

FYI, here's the patch implementation for Cython:

https://github.com/cython/cython/blob/cf63ff71c06b16c3a30facdc7859743f4cd495f6/Cython/Utility/Generator.c#L849

The only difference is that it takes care of changing "next" to "next" in Py2.x.

msg242054 - (view)

Author: Ludovic Gasc (Ludovic.Gasc) *

Date: 2015-04-26 12:58

Sorry guys to be basic for you, but if I take my "AsyncIO end-user" hat, I'm not sure to understand the potential end-user source code impacts to use Cython with Python 3.5 and AsyncIO.

In concrete terms, it's only a low-level change, Cython will monkeypatch CPython if it's missing. I can continue to use asyncio.iscoroutine() function to detect coroutines. Or it should be better to change something in AsyncIO libs and/or end-user source code ?

With the potential async/await inclusion in Python 3.5, it should be good to know if something else is necessary to help for the Cython support.

msg242071 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2015-04-26 19:37

In the lastest patch, the close() method is now a valid mixin method.

However, the throw() method should be made abstract because it doesn't provide the required operation (it doesn't even use the "self" argument) or it should be left out entirely (i.e. Numba only supporting next()) if the presence of throw() is not required.

Going back to Stefan's original use case, the problem being solved is that isinstance(g, types.GeneratorType) rejects regular classes that implement send(), next(), throw(), and close(), presumably because it is going to call those methods and expect that they work.

If an object tests positive for isinstance(g, collections.abc.Generator), what can we then assume about the object. It would be weird to pass that test, see a throw() method, call it and have it do something other than raise an exception inside the generator.

g = lazy_readline_from_connection('171.0.0.1')
if isinstance(g, collections.abc.Generator):
    # We passed the isinstance check, now what does that
    # actually mean?  What is guaranteed to work?
    next(g)
    g.close()    # Expect this to close the connection

If a working throw() isn't actually required, then the code ought to be checking for isinstance(obj, collections.abc.Iterator) or somesuch; otherwise, was is the point of doing any checks for a generator-like object?

I don't think this patch should go in until it is capable of doing something meaningful. Otherwise, it looks like at attempt to bluff its way past generator checks that were presumably there for a reason.

msg242072 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2015-04-26 19:40

I think the throw() method should be required. If you don't need throw(), send() or close(), then you aren't really asking for a full-blown generator: you are asking for an iterator, so you can just check for collections.Iterator.

(PS: why is this bug assigned to Lukasz?)

msg242077 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-26 20:32

PEP 342 isn't really conclusive here as it intended to define the protocol based on the de-facto design of a yield-based generator function. Trying to abstract from that poses the question how a class based generator implementation should look like. Specifically, it is not clear in this case what "raise an exception inside the generator" even means.

As Antoine pointed out, there definitely has to be more than an Iterator. However, it's not clear that throw() and close() are always required for an implementation. Simple Generators might get away without special error handling and cleanup.

Similarly, a Generator may not allow (or expect) values to be passed in and only make use of close() for cleanup (and potentially also throw()).

So, there seem to be at least three use cases for the Generator protocol here:

  1. an Iterator that can accept input values via send()
  2. an Iterator that needs to safely do certain cleanup operations after use (throw/close)
  3. a complete Generator that accepts input values and cleans up after it

Given that there used to be only one implementation, I think we can assume that a lot of code depends on the complete protocol. Some recipients will certainly be aware of the exact subset of the protocol that the Generator they have in their hands makes use of, but if they don't, they'll just have to assume it supports everything and requires proper cleanup on errors and on termination.

Therefore, I think it's important to cover the complete protocol in the Generator ABC. I also think it's helpful to not require users to override throw() in a subclass, as they might not need it. "Throwing an exception inside of them" might simply not have a meaning for them. If they do need it, however, then the current implementation might help them to properly raise the exception, given that the 3-argument raise statement is no longer available in Py3.

Both reasons (whether you need throw() or not) make me think that it should not be abstract. The same applies to close(), which has a meaningful implementation now that subclasses can make use of in order to avoid having to implement both close() and throw().

And yes, this is about sneaking past generator type checks because most of them are actually not there for a reason. Asyncio is just one example.

msg242087 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2015-04-27 00:07

Therefore, I think it's important to cover the complete protocol in the Generator ABC. I also think it's helpful to not require users to override throw() in a subclass, as they might not need it.

Sorry, but I think you're fighting the fundament nature of what the ABCs are supposed to do and I object to the patch going in as-is.

Please either

  1. drop the throw() method entirely or
  2. make throw an abstractmethod()

If it is an abstractmethod, the user is free to implement a throw() method that raises a NotImplementedError, but at least they will do so consciously rather than having it come-up expectedly.

When I teach engineers how to use the collections ABCs, they rely on the rules:

Until now, those rules were followed by all the ABCs. I really don't want to break the basic contract of what the ABC is supposed to mean.

msg242094 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-04-27 03:43

I'm with Raymond.

msg242102 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-04-27 06:26

Please either

  1. drop the throw() method entirely or
  2. make throw an abstractmethod()

Ok, as I already said, I think it's important to provide the complete protocol as code will usually expect that. Also, close() has a helpful implementation, but it depends on throw(). Thus, I'll go with 2) and make throw() abstract. It's easy enough to call super(), but then it's explicit. I agree that that's better here.

Patch updated.

msg242104 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2015-04-27 07:33

The latest patch looks good overall.

Łukasz, assigning back to you.

msg242445 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-03 01:51

Yeah, looks good -- Łuke, can you commit this?

msg242612 - (view)

Author: Łukasz Langa (lukasz.langa) * (Python committer)

Date: 2015-05-05 19:12

Yup, will do.

msg242794 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-05-09 03:34

This is blocking issue 24017 (async/await syntax).

msg242796 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-09 03:47

Ask Yury if he'll commit it for you. It's ready.

msg242797 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2015-05-09 05:07

New changeset ba5d7041e2f5 by Raymond Hettinger in branch 'default': Issue #24018: Add a collections.Generator abstract base class. https://hg.python.org/cpython/rev/ba5d7041e2f5

msg242798 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-05-09 05:28

Thanks! Minor grouch: it should say "collections.abc.Generator" in the NEWS entry.

msg242826 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2015-05-09 18:37

New changeset f7cc54086cd2 by Guido van Rossum in branch 'default': Fix news entry for issue 24018. https://hg.python.org/cpython/rev/f7cc54086cd2

msg242827 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-09 18:38

Fixed.

On Fri, May 8, 2015 at 10:28 PM, Stefan Behnel <report@bugs.python.org> wrote:

Stefan Behnel added the comment:

Thanks! Minor grouch: it should say "collections.abc.Generator" in the NEWS entry.



Python tracker <report@bugs.python.org> <http://bugs.python.org/issue24018>


History

Date

User

Action

Args

2022-04-11 14:58:15

admin

set

github: 68206

2015-05-10 21:25:37

yselivanov

link

issue24017 dependencies

2015-05-09 18:38:11

gvanrossum

set

messages: +

2015-05-09 18:37:45

python-dev

set

messages: +

2015-05-09 05:28:47

scoder

set

messages: +

2015-05-09 05:08:15

rhettinger

set

status: open -> closed
resolution: fixed
stage: resolved

2015-05-09 05:07:32

python-dev

set

nosy: + python-dev
messages: +

2015-05-09 04:52:56

rhettinger

set

assignee: lukasz.langa -> rhettinger

2015-05-09 03:47:38

gvanrossum

set

messages: +

2015-05-09 03:34:18

scoder

set

messages: +

2015-05-05 19:12:23

lukasz.langa

set

messages: +

2015-05-03 01:51:38

gvanrossum

set

messages: +

2015-04-27 07:33:57

rhettinger

set

assignee: rhettinger -> lukasz.langa
messages: +

2015-04-27 06:26:11

scoder

set

files: + generator_abc.patch

messages: +

2015-04-27 03:43:59

gvanrossum

set

messages: +

2015-04-27 00:07:06

rhettinger

set

assignee: lukasz.langa -> rhettinger
messages: +

2015-04-26 20:32:06

scoder

set

messages: +

2015-04-26 19:40:43

pitrou

set

messages: +

2015-04-26 19:37:49

rhettinger

set

messages: +

2015-04-26 12:58:41

Ludovic.Gasc

set

nosy: + Ludovic.Gasc
messages: +

2015-04-25 06:03:07

scoder

set

messages: +

2015-04-24 16:54:35

scoder

set

messages: +

2015-04-23 06:37:11

scoder

set

files: + inspect_docs.patch

2015-04-23 06:36:51

scoder

set

files: - inspect_docs.patch

2015-04-23 06:32:33

scoder

set

files: + inspect_docs.patch

messages: +

2015-04-21 19:39:16

scoder

set

messages: +

2015-04-21 16:49:09

pitrou

set

messages: +

2015-04-21 16:10:33

gvanrossum

set

messages: +

2015-04-21 15:44:02

scoder

set

messages: +

2015-04-21 15:19:31

gvanrossum

set

messages: +

2015-04-21 06:35:10

scoder

set

messages: +

2015-04-21 06:10:31

scoder

set

files: + generator_abc.patch

messages: +

2015-04-21 05:26:02

martin.panter

set

messages: +

2015-04-21 05:11:02

scoder

set

messages: +

2015-04-21 05:09:06

scoder

set

messages: +

2015-04-21 04:57:55

scoder

set

files: + generator_abc.patch

messages: +

2015-04-21 02:21:57

rhettinger

set

nosy: + rhettinger
messages: +

2015-04-20 23:14:44

martin.panter

set

nosy: + martin.panter

2015-04-20 20:45:29

lukasz.langa

set

assignee: lukasz.langa

nosy: + lukasz.langa

2015-04-20 20:11:22

pitrou

set

nosy: + pitrou
messages: +

2015-04-20 20:01:44

scoder

set

files: + generator_abc.patch

messages: +

2015-04-20 20:01:15

scoder

set

files: - generator_abc.patch

2015-04-20 19:58:05

scoder

set

files: + generator_abc.patch

messages: +

2015-04-20 19:57:12

scoder

set

files: - generator_abc.patch

2015-04-20 19:40:48

gvanrossum

set

messages: +

2015-04-20 19:36:25

scoder

create