[Python-Dev] Return from generators in Python 3.2

Yury Selivanov yselivanov at gmail.com
Fri Aug 27 04:40:33 CEST 2010


On 2010-08-26, at 8:25 PM, Guido van Rossum wrote:

On Thu, Aug 26, 2010 at 5:05 PM, Yury Selivanov <yselivanov at gmail.com> wrote:

On 2010-08-26, at 8:04 PM, Greg Ewing wrote:

Even with your proposal, you'd still have to use a 'creepy abstraction' every time one of your coroutines calls another. That's why PEP 380 deals with 'more than just return'.

Nope. In almost any coroutine framework you have a scheduler or trampoline object that basically does all the work of calling, passing values and propagating exceptions, plus many other things that 'yield from' won't help you with (cooperation, deferring to process/thread pools, pausing, etc.). As a developer of one such framework, I can tell you that I can easily live without 'yield from', but dealing with the weird return syntax is a pain.

That's not my experience. I wrote a trampoline myself (not released yet), and found that I had to write a lot more code to deal with the absence of yield-from than to deal with returns. In my framework, users write 'raise Return(value)' where Return is a subclass of StopIteration. The trampoline code that must be written to deal with StopIteration can be extended trivially to deal with this. The only reason I chose to use a subclass is so that I can diagnose when the return value is not used, but I could have chosen to ignore this or just diagnose whenever the argument to StopIteration is not None.
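For illustration, a trampoline of the kind discussed above might look roughly like the sketch below. It is not the code from either framework mentioned in this thread; the names `Return`, `run`, `add` and `main` are hypothetical. One deliberate deviation: the message describes Return as a subclass of StopIteration, but since Python 3.7 (PEP 479) a StopIteration raised inside a generator body is converted to RuntimeError, so this sketch assumes a plain Exception subclass instead.

```python
class Return(Exception):
    """Hypothetical stand-in for the 'Return' class described above.

    Assumption: derived from Exception rather than StopIteration, because
    PEP 479 (Python 3.7+) turns a StopIteration raised inside a generator
    body into RuntimeError.
    """

    def __init__(self, value=None):
        super().__init__(value)
        self.value = value


def run(gen):
    """Minimal trampoline: drive nested generators and deliver results.

    A generator "calls" another by yielding it; the callee's result is
    sent back into the caller. Other exceptions are simply allowed to
    propagate out of run() in this sketch.
    """
    stack = []       # suspended callers, innermost last
    to_send = None   # value delivered at the next resumption
    while True:
        try:
            yielded = gen.send(to_send)
        except Return as ret:
            result = ret.value
        except StopIteration:
            result = None        # generator fell off the end
        else:
            if hasattr(yielded, "send"):
                # A sub-coroutine: suspend the caller and start the callee.
                stack.append(gen)
                gen, to_send = yielded, None
            else:
                # A plain value: echo it straight back.
                to_send = yielded
            continue
        # The current generator has finished.
        if not stack:
            return result
        gen, to_send = stack.pop(), result


def add(a, b):
    yield                # a suspension point
    raise Return(a + b)


def main():
    x = yield add(1, 2)
    y = yield add(x, 10)
    raise Return(y)


result = run(main())
```

The handler for `Return` sits right next to the one for `StopIteration`, which is the sense in which the existing StopIteration handling extends to a custom return exception.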

In the framework I'm talking about (also not yet released, but I do plan to open source it one day), everything that can be yielded is an instance of a special class, Command. Generators are wrapped with a subclass of Command, Task; socket methods return Recv and Send commands, etc. Regular Python functions, by the way, can also be wrapped in a Task command; the framework is smart enough to manage them automatically. Of course, the wrapping of Python functions is abstracted in decorators, so be it a simple @coroutine or some asynchronous @bus.method, they are all native objects to the scheduler. And all the work of exception propagation, IO waiting, managing timeouts and much more is the framework's business (without 'yield from').

Hence the yield statement is nothing more than a point in the code flow where some command is pushed to the scheduler for execution. It can be inserted almost anywhere.
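A rough sketch of the Command/Task idea described above, under heavy assumptions: the framework itself is unreleased, so every name below other than Command, Task and @coroutine is hypothetical (`Echo` stands in for something like Recv or Send), and the "scheduler" is reduced to a synchronous, recursive `execute()` call with no IO waiting, timeouts or exception propagation into the generator. It also relies on the generator `return value` form that Python 3.3+ provides.

```python
import functools
import inspect


class Command:
    """Base class: anything a coroutine may yield for execution."""

    def execute(self):
        raise NotImplementedError


class Echo(Command):
    """Hypothetical trivial command; a real one might wait on a socket."""

    def __init__(self, value):
        self.value = value

    def execute(self):
        return self.value


class Task(Command):
    """Wraps a callable, whether a generator function or a regular one."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def execute(self):
        result = self.func(*self.args, **self.kwargs)
        if not inspect.isgenerator(result):
            return result                 # regular function: done
        to_send = None
        while True:
            try:
                command = result.send(to_send)
            except StopIteration as stop:
                return stop.value         # generator's return value
            if not isinstance(command, Command):
                raise TypeError("only Command instances may be yielded")
            # Each yield is a point where a command is handed over.
            to_send = command.execute()


def coroutine(func):
    """Decorator: calling the function builds a Task instead of running it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return Task(func, *args, **kwargs)

    return wrapper


@coroutine
def double(x):
    return 2 * x              # regular function, no yield


@coroutine
def compute(x):
    y = yield double(x)       # yields a Task
    z = yield Echo(1)         # yields another kind of Command
    return y + z


total = compute(5).execute()
```

Note that `double` and `compute` are called and awaited the same way even though only one of them is a generator, which is the uniformity between subroutines and coroutines the message argues for.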

This approach differs from the one you showed in PEP 342; it's much more complicated, but it has its own strong advantages. It is not new, though: for instance, almost the same idea is used in the 'cogen' framework and a few others (I can't remember all the names, but I did quite a lot of research before writing a single line of code). All those frameworks suffer from the inability to use a native return statement in generators.

Now, imagine a big project. I mean a really big, complicated system, with tens of thousands of lines of code. The code is broken down into small methods: some of them implement asynchronous methods on a message bus, some of them are mapped to serve responses on specific URLs, and so on. In the way of writing code I'm talking about, there is no distinction between coroutines and subroutines. There are some methods which just return some value, and some that call potentially blocking code with the 'yield' keyword and then return the result; it doesn't matter. The abstraction is very good and simple: the 'yield' statement just marks suspension points, and that's all. BUT there is a 'return problem': if the code gets a new yield statement, you have to go and fix all the returns, and vice versa. It just breaks the beauty of the language. I've invested tons of time into it, and I suffer from the weird syntax that differs from one line to another.

Of course I can live with that, and the people who developed other frameworks will too. But consider that (1) the 'return' syntax is almost approved; (2) it will almost certainly be merged into 3.3; (3) the change is small and backwards compatible; (4) it is one or two hours of work to port to other interpreters, so it does not contradict the moratorium ideas 100%.

Asynchronous programming is booming now. It gets more and more attention day by day. And Python has a unique combination of features that may make it one of the leaders in the field (nodejs is amateur; erlang is hard; java, ruby and family lack a 'yield' statement, so you have to use callbacks, and that's ugly). Should we wait several years for this simple feature in a world that is changing that fast? I'm not sure.

Probably the last point - this would be one more good advantage of py3k for python 2.x users.

Sorry for such a long text, I just wanted to make my points clear and provide some examples.

Especially when you use decorators like @bus.method, or @protocol.handler, that transparently wrap your callable be it generator or regular function. And after that you have to use different return syntax for them.

Until PEP 380 is implemented, you have to use different return syntax in generators. You have some choices: raise StopIteration(value), raise SomethingElse(value), or callSomeFunction(value) -- where callSomeFunction raises the exception. I like the raise variants because they signal to tools that the flow control stops here -- e.g. in Emacs, python-mode.el automatically dedents after a 'raise' or 'return' but not after a call (of course).
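To make the choices listed above concrete, here is a small sketch comparing them with the native `return value` form from PEP 380, which is available in Python 3.3 and later. The names `ReturnValue`, `return_` and `result_of` are hypothetical. The `raise StopIteration(value)` variant is left out on purpose: since Python 3.7 (PEP 479) it is converted to RuntimeError inside a generator body.

```python
class ReturnValue(Exception):
    """Hypothetical 'SomethingElse' exception carrying a result."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def return_(value):
    """Hypothetical 'callSomeFunction': raises on the caller's behalf."""
    raise ReturnValue(value)


def with_raise():
    yield
    raise ReturnValue(1)      # flow control visibly stops here


def with_call():
    yield
    return_(2)                # looks like an ordinary call to tools


def with_return():
    yield
    return 3                  # PEP 380 form, Python 3.3+


def result_of(gen):
    """Run a generator to completion and extract its result."""
    try:
        while True:
            next(gen)
    except ReturnValue as ret:
        return ret.value
    except StopIteration as stop:
        return stop.value     # value given to 'return', or None


results = [result_of(g()) for g in (with_raise, with_call, with_return)]
```

With the native form, adding or removing a `yield` in a function no longer requires touching its return lines; only the driver needs to read `StopIteration.value`.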

I'm not asking for the whole of PEP 380, but for a small subset of it. So if it doesn't contradict the moratorium that much, let's discuss the feature. If it does, then OK, I'll stop spamming ;-)


