[Python-Dev] PEP 492: async/await in Python; v3
Yury Selivanov yselivanov.ml at gmail.com
Wed Apr 29 01:26:57 CEST 2015
Hi Guido,
Thank you for a very detailed review. Comments below:
On 2015-04-28 5:49 PM, Guido van Rossum wrote:
Inline comments below...
On Mon, Apr 27, 2015 at 8:07 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
Hi python-dev,
Another round of updates. Reference implementation has been updated: https://github.com/1st1/cpython/tree/await (includes all things from the below summary of updates + tests).
Summary:

1. "PyTypeObject.tp_await" slot. Replaces "tp_reserved". This is to enable implementation of Futures with the C API. Must return an iterator if implemented.

That's fine (though I didn't follow this closely).

My main question here is: is it OK to reuse 'tp_reserved' (former tp_compare)?
I had to remove this check: https://github.com/1st1/cpython/commit/4be6d0a77688b63b917ad88f09d446ac3b7e2ce9#diff-c3cf251f16d5a03a9e7d4639f2d6f998L4906
On the other hand I think that it's a slightly better solution than adding a new slot.
2. New grammar for "await" expressions, see 'Syntax of "await" expression' section
I like it.
Great!
The current grammar requires parentheses for consequent await expressions:
await (await coro())
I can change this (in theory), but I kind of like the parens in this case -- better readability. And it'll be a very rare case.
3. inspect.iscoroutine() and inspect.iscoroutinefunction() functions.
What's the use case for these? I wonder if it makes more sense to have a check for a generalized awaitable rather than specifically a coroutine.
It's important to at least have 'iscoroutine' -- to check that an object is a coroutine function. A typical use-case would be a web framework that lets you bind coroutines to specific HTTP methods/paths:
@http.get('/spam')
async def handle_spam(request):
...
The 'http.get' decorator will need a way to raise an error if it's applied to a regular function (while the code is being imported, not at runtime).

The idea here is to cover all kinds of Python objects in the inspect module; it's Python's reflection API.

The other thing is that it's easy to implement this function for CPython: just check for the CO_COROUTINE flag. For other Python implementations it might be a different story.
(More arguments for isawaitable() below)
4. Full separation of coroutines and generators. This is a big one; let's discuss.
a) Coroutine objects raise TypeError (is NotImplementedError better?) in their __iter__ and __next__. Therefore it's not possible to pass them to iter(), tuple(), next() and other similar functions that work with iterables.

I think it should be TypeError -- what you really want is not to define these methods at all, but given the implementation tactic for coroutines that may not be possible, so the nearest approximation is TypeError. (Also, NotImplementedError is typically used to indicate that a subclass should implement it.)
Agree.
b) Because of (a), for..in iteration also does not work on coroutines anymore.
Sounds good.

c) 'yield from' only accepts coroutine objects from generators decorated with 'types.coroutine'. That means that existing asyncio generator-based coroutines will happily yield from both coroutines and generators. But every generator-based coroutine must be decorated with asyncio.coroutine(). This is potentially a backwards incompatible change. See below.

I worry about backward compatibility. A lot. Are you saying that asyncio-based code that doesn't use @coroutine will break in 3.5?
I'll experiment with replacing (c) with a warning.
We can disable __iter__ and __next__ for coroutines, but still allow 'yield from' on them. Would that be a better approach?
d) inspect.isgenerator() and inspect.isgeneratorfunction() return False for coroutine objects & coroutine functions.

Makes sense.
(d) can also break something (hypothetically). I'm not sure why someone would use isgenerator() and isgeneratorfunction() on generator-based coroutines in code based on asyncio, but there is a chance that someone did (it should be trivial to fix the code).
Same for iter() and next(). The chance is slim, but we may break some obscure code.
Are you OK with this?
e) Should we add a coroutine ABC (for cython etc)?
I, personally, think this is highly necessary. First, separation of coroutines from generators is extremely important. One day there won't be generator-based coroutines, and we want to avoid any kind of confusion. Second, we can only do this in 3.5; this kind of semantics change won't ever be possible later.

Sounds like Stefan agrees. Are you aware of http://bugs.python.org/issue24018 (Generator ABC)?
Yes, I saw the issue. I'll review it in more detail before thinking about Coroutine ABC for the next PEP update.
asyncio recommends using @coroutine decorator, and most projects that I've seen do use it. Also there is no reason for people to use iter() and next() functions on coroutines when writing asyncio code. I doubt that this will cause serious backwards compatibility problems (asyncio also has provisional status).

I wouldn't count too much on asyncio's provisional status. What are the consequences for code that is written to work with asyncio but doesn't use @coroutine? Such code will work with 3.4 and (despite the provisional status and the recommendation to use @coroutine) I don't want that code to break in 3.5 (though maybe a warning would be fine). I also hope that if someone has their own (renamed) copy of asyncio that works with 3.4, it will all still work with 3.5. Even if asyncio itself is provisional, none of the primitives (e.g. yield from) that it is built upon are provisional, so there should be no reason for it to break in 3.5.
I agree. I'll try warnings for yield-fromming coroutines from regular generators (so that we can disable it in 3.7/3.6).
If that doesn't work, I think we need a compromise (not ideal, but breaking things is worse):
- yield from would always accept coroutine-objects
- iter(), next(), tuple(), etc won't work on coroutine-objects
- for..in won't work on coroutine-objects
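For reference, the iterator-protocol restrictions in this compromise are what eventually shipped; a quick sketch of the resulting behavior on a current CPython:

```python
import asyncio

async def coro():
    return 'spam'

c = coro()
# Coroutine objects are not iterable: iter(), next(), tuple()
# and for..in all raise TypeError on them.
try:
    iter(c)
except TypeError:
    print('iter() rejects coroutine objects')
c.close()  # silence the "coroutine was never awaited" warning

# Awaiting (here via an event loop) still works as usual.
print(asyncio.run(coro()))  # spam
```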
Thank you,
Yury

Some more inline comments directly on the PEP below.

PEP: 492
Title: Coroutines with async and await syntax
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yselivanov at sprymix.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 09-Apr-2015
Python-Version: 3.5
Post-History: 17-Apr-2015, 21-Apr-2015, 27-Apr-2015
Abstract
========

This PEP introduces new syntax for coroutines, asynchronous ``with`` statements and ``for`` loops. The main motivation behind this proposal is to streamline writing and maintaining asynchronous code, as well as to simplify previously hard to implement code patterns.

Rationale and Goals
===================

Current Python supports implementing coroutines via generators (PEP 342), further enhanced by the ``yield from`` syntax introduced in PEP 380. This approach has a number of shortcomings:

* it is easy to confuse coroutines with regular generators, since they share the same syntax; async libraries often attempt to alleviate this by using decorators (e.g. ``@asyncio.coroutine`` [1]);

* it is not possible to natively define a coroutine which has no ``yield`` or ``yield from`` statements, again requiring the use of decorators to fix potential refactoring issues;

(I have to agree with Mark that this point is pretty weak. :-)

* support for asynchronous calls is limited to expressions where ``yield`` is allowed syntactically, limiting the usefulness of syntactic features, such as ``with`` and ``for`` statements.

This proposal makes coroutines a native Python language feature, and clearly separates them from generators. This removes generator/coroutine ambiguity, and makes it possible to reliably define coroutines without reliance on a specific library. This also enables linters and IDEs to improve static code analysis and refactoring.

Native coroutines and the associated new syntax features make it possible to define context manager and iteration protocols in asynchronous terms. As shown later in this proposal, the new ``async with`` statement lets Python programs perform asynchronous calls when entering and exiting a runtime context, and the new ``async for`` statement makes it possible to perform asynchronous calls in iterators.

I wonder if you could add some adaptation of the explanation I have posted (a few times now, I feel) for the reason why I prefer to suspend only at syntactically recognizable points (yield [from] in the past, await and async for/with in this PEP). Unless you already have it in the rationale (though it seems Mark didn't think it was enough :-).
I'll see what I can do.
Specification
=============

This proposal introduces new syntax and semantics to enhance coroutine support in Python; it does not change the internal implementation of coroutines, which are still based on generators.

It's actually a separate issue whether any implementation changes. Implementation changes don't need to go through the PEP process, unless they're really also interface changes.

It is strongly suggested that the reader understands how coroutines are implemented in Python (PEP 342 and PEP 380). It is also recommended to read PEP 3156 (asyncio framework) and PEP 3152 (Cofunctions).

From this point in this document we use the word *coroutine* to refer to functions declared using the new syntax. *Generator-based coroutine* is used where necessary to refer to coroutines that are based on generator syntax.

Despite reading this I still get confused when reading the PEP (probably because asyncio uses "coroutine" in the latter sense). Maybe it would make sense to write "native coroutine" for the new concept, to distinguish the two concepts more clearly? (You could even change "awaitable" to "coroutine". Though I like "awaitable" too.)
"awaitable" is a more generic term... It can be a future, or it can be a coroutine. Mixing them in one may create more confusion. Also, "awaitable" is more of an interface, or a trait, which means that the object won't be rejected by the 'await' expression.
I like your 'native coroutine' suggestion. I'll update the PEP.
New Coroutine Declaration Syntax
--------------------------------

The following new syntax is used to declare a coroutine::

    async def read_data(db):
        pass

Key properties of coroutines:

* ``async def`` functions are always coroutines, even if they do not contain ``await`` expressions.

* It is a ``SyntaxError`` to have ``yield`` or ``yield from`` expressions in an ``async`` function.

For Mark's benefit, might add that this is similar to how ``return`` and ``yield`` are disallowed syntactically outside functions (as are the syntactic constraints on ``await`` and ``async for|with``).
OK
* Internally, a new code object flag - ``CO_COROUTINE`` - is introduced to enable runtime detection of coroutines (and migrating existing code). All coroutines have both ``CO_COROUTINE`` and ``CO_GENERATOR`` flags set.

* Regular generators, when called, return a generator object; similarly, coroutines return a coroutine object.

* ``StopIteration`` exceptions are not propagated out of coroutines, and are replaced with a ``RuntimeError``. For regular generators such behavior requires a future import (see PEP 479).

types.coroutine()
-----------------

A new function ``coroutine(gen)`` is added to the ``types`` module. It applies the ``CO_COROUTINE`` flag to the passed generator-function's code object, making it return a coroutine object when called.

Clarify that this is a decorator that modifies the function object in place.
Good catch.
This feature enables an easy upgrade path for existing libraries.
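As a rough sketch of that upgrade path, a decorated generator becomes awaitable from a native coroutine; here the coroutine is driven by hand in place of an event loop:

```python
import types

@types.coroutine
def legacy_op():
    # A generator-based coroutine: the yield is its suspension point.
    result = yield 'suspended'
    return result

async def main():
    # A native coroutine may await a @types.coroutine generator.
    return await legacy_op()

# Drive the coroutine manually, playing the role of an event loop.
coro = main()
assert coro.send(None) == 'suspended'   # run to the inner yield
try:
    coro.send('done')                    # resume with a result
except StopIteration as exc:
    assert exc.value == 'done'
    print('result:', exc.value)
```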
Await Expression
----------------

The following new ``await`` expression is used to obtain a result of coroutine execution::

    async def read_data(db):
        data = await db.fetch('SELECT ...')
        ...

``await``, similarly to ``yield from``, suspends execution of the ``read_data`` coroutine until the ``db.fetch`` awaitable completes and returns the result data.

It uses the ``yield from`` implementation with an extra step of validating its argument. ``await`` only accepts an *awaitable*, which can be one of:

* A coroutine object returned from a coroutine or a generator decorated with ``types.coroutine()``.

* An object with an ``__await__`` method returning an iterator.

  Any ``yield from`` chain of calls ends with a ``yield``. This is a fundamental mechanism of how Futures are implemented. Since, internally, coroutines are a special kind of generators, every ``await`` is suspended by a ``yield`` somewhere down the chain of ``await`` calls (please refer to PEP 3156 for a detailed explanation.)

  To enable this behavior for coroutines, a new magic method called ``__await__`` is added. In asyncio, for instance, to enable Future objects in ``await`` statements, the only change is to add ``__await__ = __iter__`` line to ``asyncio.Future`` class.

  Objects with ``__await__`` method are called *Future-like* objects in the rest of this PEP.

  Also, please note that ``__aiter__`` method (see its definition below) cannot be used for this purpose. It is a different protocol, and would be like using ``__iter__`` instead of ``__call__`` for regular callables.

  It is a ``TypeError`` if ``__await__`` returns anything but an iterator.

* Objects defined with CPython C API with a ``tp_await`` function, returning an iterator (similar to ``__await__`` method).

It is a ``SyntaxError`` to use ``await`` outside of a coroutine.

It is a ``TypeError`` to pass anything other than an awaitable object to an ``await`` expression.

Syntax of "await" expression
''''''''''''''''''''''''''''

``await`` keyword is defined differently from ``yield``
and ``yield from``. The main difference is that ``await`` expressions do not require parentheses around them most of the time.

Examples::

    ================================== ==================================
    Expression                         Will be parsed as
    ================================== ==================================
    ``if await fut: pass``             ``if (await fut): pass``
    ``if await fut + 1: pass``         ``if (await fut) + 1: pass``
    ``pair = await fut, 'spam'``       ``pair = (await fut), 'spam'``
    ``with await fut, open(): pass``   ``with (await fut), open(): pass``
    ``await foo()['spam'].baz()()``    ``await ( foo()['spam'].baz()() )``
    ``return await coro()``            ``return ( await coro() )``
    ``res = await coro() ** 2``        ``res = (await coro()) ** 2``
    ``func(a1=await coro(), a2=0)``    ``func(a1=(await coro()), a2=0)``
    ================================== ==================================

See ``Grammar Updates`` section for details.

Asynchronous Context Managers and "async with"
----------------------------------------------

An *asynchronous context manager* is a context manager that is able to suspend execution in its *enter* and *exit* methods.

To make this possible, a new protocol for asynchronous context managers is proposed. Two new magic methods are added: ``__aenter__`` and ``__aexit__``. Both must return an awaitable.

An example of an asynchronous context manager::

    class AsyncContextManager:
        async def __aenter__(self):
            await log('entering context')

        async def __aexit__(self, exc_type, exc, tb):
            await log('exiting context')

New Syntax
''''''''''

A new statement for asynchronous context managers is proposed::

    async with EXPR as VAR:
        BLOCK

which is semantically equivalent to::

    mgr = (EXPR)
    aexit = type(mgr).__aexit__
    aenter = type(mgr).__aenter__(mgr)
    exc = True

    try:
        try:
            VAR = await aenter
            BLOCK
        except:
            exc = False
            exit_res = await aexit(mgr, *sys.exc_info())
            if not exit_res:
                raise
    finally:
        if exc:
            await aexit(mgr, None, None, None)

I realize you copied this from PEP 343, but I wonder, do we really need two nested try statements and the 'exc' flag? Why can't the finally step be placed in an 'else' clause on the inner try? (There may well be a reason but I can't figure out what it is, and PEP 343 doesn't seem to explain it.) Also, it's a shame we're perpetuating the sys.exc_info() triple in the API here, but I agree making __exit__ and __aexit__ different also isn't a great idea. :-(

PS. With the new tighter syntax for ``await`` you don't need the ``exit_res`` variable any more.
Yes, this can be simplified. It was indeed copied from PEP 343.
As with regular ``with`` statements, it is possible to specify multiple context managers in a single ``async with`` statement.

It is an error to pass a regular context manager without ``__aenter__`` and ``__aexit__`` methods to ``async with``. It is a ``SyntaxError`` to use ``async with`` outside of a coroutine.

Example
'''''''

With asynchronous context managers it is easy to implement proper database transaction managers for coroutines::

    async def commit(session, data):
        ...
        async with session.transaction():
            ...
            await session.update(data)
            ...

Code that needs locking also looks lighter::

    async with lock:
        ...

instead of::

    with (yield from lock):
        ...

(Also, the implementation of the latter is problematic -- check asyncio/locks.py and notice that __enter__ is empty...)

Asynchronous Iterators and "async for"
--------------------------------------

An *asynchronous iterable* is able to call asynchronous code in its *iter* implementation, and an *asynchronous iterator* can call asynchronous code in its *next* method. To support asynchronous iteration:

1. An object must implement an ``__aiter__`` method returning an awaitable resulting in an *asynchronous iterator object*.

Have you considered making __aiter__ not an awaitable? It's not strictly necessary I think, one could do all the awaiting in __anext__. Though perhaps there are use cases that are more naturally expressed by awaiting in __aiter__? (Your examples all use ``async def __aiter__(self): return self`` suggesting this would be no great loss.)
There is a section in Design Considerations about this. I should add a reference to it.
2. An *asynchronous iterator object* must implement an ``__anext__`` method returning an awaitable.

3. To stop iteration ``__anext__`` must raise a ``StopAsyncIteration`` exception.

An example of asynchronous iterable::

    class AsyncIterable:
        async def __aiter__(self):
            return self

        async def __anext__(self):
            data = await self.fetch_data()
            if data:
                return data
            else:
                raise StopAsyncIteration

        async def fetch_data(self):
            ...

New Syntax
''''''''''

A new statement for iterating through asynchronous iterators is proposed::

    async for TARGET in ITER:
        BLOCK
    else:
        BLOCK2

which is semantically equivalent to::

    iter = (ITER)
    iter = await type(iter).__aiter__(iter)
    running = True
    while running:
        try:
            TARGET = await type(iter).__anext__(iter)
        except StopAsyncIteration:
            running = False
        else:
            BLOCK
    else:
        BLOCK2

It is a ``TypeError`` to pass a regular iterable without ``__aiter__`` method to ``async for``. It is a ``SyntaxError`` to use ``async for`` outside of a coroutine.

As with the regular ``for`` statement, ``async for`` has an optional ``else`` clause.

(Not because we're particularly fond of it, but because its absence would just introduce more special cases. :-)

Example 1
'''''''''

With asynchronous iteration protocol it is possible to asynchronously buffer data during iteration::

    async for data in cursor:
        ...

Where ``cursor`` is an asynchronous iterator that prefetches ``N`` rows of data from a database after every ``N`` iterations.

The following code illustrates new asynchronous iteration protocol::

    class Cursor:
        def __init__(self):
            self.buffer = collections.deque()

        def _prefetch(self):
            ...

        async def __aiter__(self):
            return self

        async def __anext__(self):
            if not self.buffer:
                self.buffer = await self._prefetch()
            if not self.buffer:
                raise StopAsyncIteration
            return self.buffer.popleft()

then the ``Cursor`` class can be used as follows::

    async for row in Cursor():
        print(row)

which would be equivalent to the following code::

    i = await Cursor().__aiter__()
    while True:
        try:
            row = await i.__anext__()
        except StopAsyncIteration:
            break
        else:
            print(row)

Example 2
'''''''''

The following is a utility class that transforms a regular iterable to an asynchronous one. While this is not a very useful thing to do, the code illustrates the relationship between regular and asynchronous iterators. ::

    class AsyncIteratorWrapper:
        def __init__(self, obj):
            self._it = iter(obj)

        async def __aiter__(self):
            return self

        async def __anext__(self):
            try:
                value = next(self._it)
            except StopIteration:
                raise StopAsyncIteration
            return value

    async for letter in AsyncIteratorWrapper("abc"):
        print(letter)

Why StopAsyncIteration?
'''''''''''''''''''''''

I keep wanting to propose to rename this to AsyncStopIteration. I know it's about stopping an async iteration, but in my head I keep referring to it as AsyncStopIteration, probably because in other places we use async (or 'a') as a prefix.
I'd be totally OK with that. Should I rename it?
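Incidentally, the Example 2 wrapper runs essentially unchanged on today's Python, modulo the later refinement that ``__aiter__`` became a plain method returning the iterator directly rather than an awaitable; a small sketch:

```python
import asyncio

class AsyncIteratorWrapper:
    """Adapt a regular iterable to the asynchronous-iteration protocol."""
    def __init__(self, obj):
        self._it = iter(obj)

    def __aiter__(self):
        # As eventually shipped, __aiter__ returns the iterator
        # directly instead of returning an awaitable.
        return self

    async def __anext__(self):
        try:
            value = next(self._it)
        except StopIteration:
            raise StopAsyncIteration
        return value

async def collect(iterable):
    result = []
    async for item in AsyncIteratorWrapper(iterable):
        result.append(item)
    return result

print(asyncio.run(collect("abc")))  # ['a', 'b', 'c']
```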
Coroutines are still based on generators internally. So, before PEP 479, there was no fundamental difference between ::

    def g1():
        yield from fut
        return 'spam'

and ::

    def g2():
        yield from fut
        raise StopIteration('spam')

And since PEP 479 is accepted and enabled by default for coroutines, the following example will have its ``StopIteration`` wrapped into a ``RuntimeError`` ::

    async def a1():
        await fut
        raise StopIteration('spam')

The only way to tell the outside code that the iteration has ended is to raise something other than ``StopIteration``. Therefore, a new built-in exception class ``StopAsyncIteration`` was added.

Moreover, with semantics from PEP 479, all ``StopIteration`` exceptions raised in coroutines are wrapped in ``RuntimeError``.

Debugging Features
------------------

One of the most frequent mistakes that people make when using generators as coroutines is forgetting to use ``yield from``::

    @asyncio.coroutine
    def useful():
        asyncio.sleep(1) # this will do nothing without 'yield from'

Might be useful to point out that this was the one major advantage of PEP 3152 -- although it wasn't enough to save that PEP, and in your response you pointed out that this mistake is not all that common. Although you seem to disagree with that here ("One of the most frequent mistakes ...").
I think it's a mistake that a lot of beginners may make at some point (and in this sense it's frequent). I really doubt that once you were hit by it more than two times you would make it again.
This is a small wart, but we have to have a solution for it.
For debugging this kind of mistakes there is a special debug mode in asyncio, in which the ``@coroutine`` decorator wraps all functions with a special object with a destructor logging a warning. Whenever a wrapped generator gets garbage collected, a detailed logging message is generated with information about where exactly the decorator function was defined, stack trace of where it was collected, etc. The wrapper object also provides a convenient ``__repr__`` function with detailed information about the generator.

The only problem is how to enable these debug capabilities. Since debug facilities should be a no-op in production mode, the ``@coroutine`` decorator makes the decision of whether to wrap or not to wrap based on an OS environment variable ``PYTHONASYNCIODEBUG``. This way it is possible to run asyncio programs with asyncio's own functions instrumented. ``EventLoop.set_debug``, a different debug facility, has no impact on the ``@coroutine`` decorator's behavior.

With this proposal, coroutines are a native concept, distinct from generators. New methods ``set_coroutine_wrapper`` and ``get_coroutine_wrapper`` are added to the ``sys`` module, with which frameworks can provide advanced debugging facilities.

These two appear to be unspecified except by example.
Will add a subsection specifically for them.
It is also important to make coroutines as fast and efficient as possible, therefore there are no debug features enabled by default.

Example::

    async def debug_me():
        await asyncio.sleep(1)

    def async_debug_wrap(generator):
        return asyncio.CoroWrapper(generator)

    sys.set_coroutine_wrapper(async_debug_wrap)

    debug_me()  # <- this line will likely GC the coroutine object and
                # trigger asyncio.CoroWrapper's code.

    assert isinstance(debug_me(), asyncio.CoroWrapper)

    sys.set_coroutine_wrapper(None)  # <- this unsets any
                                     # previously set wrapper

    assert not isinstance(debug_me(), asyncio.CoroWrapper)

If ``sys.set_coroutine_wrapper()`` is called twice, the new wrapper replaces the previous wrapper. ``sys.set_coroutine_wrapper(None)`` unsets the wrapper.

inspect.iscoroutine() and inspect.iscoroutinefunction()
-------------------------------------------------------

Two new functions are added to the ``inspect`` module:

* ``inspect.iscoroutine(obj)`` returns ``True`` if ``obj`` is a coroutine object.

* ``inspect.iscoroutinefunction(obj)`` returns ``True`` if ``obj`` is a coroutine function.

Maybe isawaitable() and isawaitablefunction() are also useful? (Or only isawaitable()?)
I think that isawaitable would be really useful. Especially, to check if an object implemented with C API has a tp_await function.
isawaitablefunction() looks a bit confusing to me:
def foo(): return fut
is awaitable, but there is no way to detect that.
def foo(arg):
    if arg == 'spam':
        return fut

is awaitable sometimes.
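A sketch of a Future-like object built with the ``__await__ = __iter__`` trick described earlier, checked with the isawaitable() that did end up in the inspect module. MiniFuture and its method names are made up for illustration, not asyncio's real Future:

```python
import inspect

class MiniFuture:
    """Hypothetical Future-like object (not asyncio's real Future)."""
    def __init__(self):
        self._result = None
        self._done = False

    def set_result(self, value):
        self._result = value
        self._done = True

    def __iter__(self):
        if not self._done:
            yield self  # suspend until a driver resolves us
        return self._result

    __await__ = __iter__  # the one-line change that makes it awaitable

async def consumer(fut):
    return await fut

fut = MiniFuture()
assert inspect.isawaitable(fut)  # has __await__, so 'await' accepts it

# Drive by hand: the future pops out at the suspension point.
coro = consumer(fut)
assert coro.send(None) is fut
fut.set_result(123)
try:
    coro.send(None)
except StopIteration as exc:
    print('result:', exc.value)  # result: 123
```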
Differences between coroutines and generators
---------------------------------------------

A great effort has been made to make sure that coroutines and generators are separate concepts:

1. Coroutine objects do not implement ``__iter__`` and ``__next__`` methods. Therefore they cannot be iterated over or passed to ``iter()``, ``list()``, ``tuple()`` and other built-ins. They also cannot be used in a ``for..in`` loop.

2. ``yield from`` does not accept coroutine objects (unless it is used in a generator-based coroutine decorated with ``types.coroutine``.)

How does ``yield from`` know that it is occurring in a generator-based coroutine?
I check that in 'ceval.c' in the implementation of YIELD_FROM opcode.
If the current code object doesn't have a CO_COROUTINE flag and the opcode arg is a generator-object with CO_COROUTINE -- we raise an error.
3. ``yield from`` does not accept coroutine objects from plain Python generators (not generator-based coroutines.)

I am worried about this. PEP 380 gives clear semantics to "yield from" and I don't think you can revert that here. Or maybe I am misunderstanding what you meant here? (What exactly are "coroutine objects from plain Python generators"?)

Not decorated with @coroutine:

def some_algorithm_impl():
    yield 1
    yield from native_coroutine()  # <- this is a bug

"some_algorithm_impl" is a regular generator. By mistake someone could try to use "yield from" on a native coroutine (which is a bug 99.9% of the time).
So we can rephrase it to:
``yield from`` does not accept *native coroutine objects*
from regular Python generators
I also agree that raising an exception in this case in 3.5 might break too much existing code. I'll try warnings, and if it doesn't work we might want to just let this restriction slip.
4. ``inspect.isgenerator()`` and ``inspect.isgeneratorfunction()`` return ``False`` for coroutine objects and coroutine functions.

Coroutine objects
-----------------

Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, coroutine objects have ``throw``, ``send`` and ``close`` methods. ``StopIteration`` and ``GeneratorExit`` play the same role for coroutine objects (although PEP 479 is enabled by default for coroutines).

Does send() make sense for a native coroutine? Check PEP 380. I think the only way to access the send() argument is by using ``yield`` but that's disallowed. Or is this about send() being passed to the ``yield`` that ultimately suspends the chain of coroutines? (You may just have to rewrite the section about that -- it seems a bit hidden now.)
Yes, 'send()' is needed to push values to the 'yield' statement somewhere (future) down the chain of coroutines (suspension point).
This has to be articulated in a clear way, I'll think how to rewrite this section without replicating PEP 380 and python documentation on generators.
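That mechanism can be sketched with a hand-driven chain of awaits, using types.coroutine for the innermost suspension point:

```python
import types

@types.coroutine
def suspend():
    # The innermost yield is the actual suspension point; the value
    # passed to send() reappears as the result of this yield.
    value = yield 'suspended'
    return value

async def inner():
    return await suspend()

async def outer():
    return await inner()

c = outer()
assert c.send(None) == 'suspended'  # run down the await chain to the yield
try:
    c.send('hello')                 # pushed all the way into suspend()
except StopIteration as exc:
    assert exc.value == 'hello'
    print('returned:', exc.value)
```

An event loop does exactly this: it calls send() on the outermost coroutine, and the value travels down to (and back up from) the single suspended yield.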
Glossary
========

:Coroutine: A coroutine function, or just "coroutine", is declared with ``async def``. It uses ``await`` and ``return value``; see `New Coroutine Declaration Syntax`_ for details.

:Coroutine object: Returned from a coroutine function. See `Await Expression`_ for details.

:Future-like object: An object with an ``__await__`` method, or a C object with ``tp_await`` function, returning an iterator. Can be consumed by an ``await`` expression in a coroutine. A coroutine waiting for a Future-like object is suspended until the Future-like object's ``__await__`` completes, and returns the result. See `Await Expression`_ for details.

:Awaitable: A Future-like object or a coroutine object. See `Await Expression`_ for details.

:Generator-based coroutine: Coroutines based on generator syntax. Most common example is ``@asyncio.coroutine``.

:Asynchronous context manager: An asynchronous context manager has ``__aenter__`` and ``__aexit__`` methods and can be used with ``async with``. See `Asynchronous Context Managers and "async with"`_ for details.

:Asynchronous iterable: An object with an ``__aiter__`` method, which must return an *asynchronous iterator* object. Can be used with ``async for``. See `Asynchronous Iterators and "async for"`_ for details.

:Asynchronous iterator: An asynchronous iterator has an ``__anext__`` method. See `Asynchronous Iterators and "async for"`_ for details.

List of functions and methods
=============================

(I'm not sure of the utility of this section.)
It's a little bit hard to understand that "awaitable" is a general term that includes native coroutine objects, so it's OK to write both:
def __aenter__(): return fut
async def __aenter__(): ...
We (Victor and I) decided that it might be useful to have an additional section that explains it.
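A quick sketch of both spellings side by side (class names invented for illustration; runs on today's Python):

```python
import asyncio

class WithAsyncDef:
    # Declared with async def: calling __aenter__/__aexit__ yields
    # coroutine objects, which are awaitables.
    async def __aenter__(self):
        return 'async-def form'

    async def __aexit__(self, exc_type, exc, tb):
        return False

class WithPlainDef:
    # Plain def works too, as long as each method returns an awaitable
    # (here, a coroutine object produced by asyncio.sleep).
    def __aenter__(self):
        return asyncio.sleep(0, result='plain-def form')

    def __aexit__(self, exc_type, exc, tb):
        return asyncio.sleep(0, result=False)

async def demo():
    async with WithAsyncDef() as a:
        async with WithPlainDef() as b:
            return a, b

print(asyncio.run(demo()))  # ('async-def form', 'plain-def form')
```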
================= =================================== =================
Method            Can contain                         Can't contain
================= =================================== =================
async def func    await, return value                 yield, yield from
async def __a*__  await, return value                 yield, yield from
def __a*__        return awaitable                    await
def __await__     yield, yield from, return iterable  await
generator         yield, yield from, return value     await
================= =================================== =================

(The second "async def" line seems redundant.)

Where:

* "async def func": coroutine;

* "async def __a*__": ``__aiter__``, ``__anext__``, ``__aenter__``, ``__aexit__`` defined with the ``async`` keyword;

* "def __a*__": ``__aiter__``, ``__anext__``, ``__aenter__``, ``__aexit__`` defined without the ``async`` keyword, must return an awaitable;

* "def __await__": ``__await__`` method to implement *Future-like* objects;

* generator: a "regular" generator, function defined with ``def`` and which contains at least one ``yield`` or ``yield from`` expression.

Transition Plan
===============

This may need to be pulled forward or at least mentioned earlier (in the Abstract or near the top of the Specification).

To avoid backwards compatibility issues with ``async`` and ``await`` keywords, it was decided to modify ``tokenizer.c`` in such a way, that it:

* recognizes ``async def`` name tokens combination (start of a coroutine);

* keeps track of regular functions and coroutines;

* replaces ``'async'`` token with ``ASYNC`` and ``'await'`` token with ``AWAIT`` when in the process of yielding tokens for coroutines.

This approach allows for seamless combination of new syntax features (all of them available only in ``async`` functions) with any existing code.

An example of having "async def" and "async" attribute in one piece of code::

    class Spam:
        async = 42

    async def ham():
        print(getattr(Spam, 'async'))

    # The coroutine can be executed and will print '42'

Backwards Compatibility
-----------------------

This proposal preserves 100% backwards compatibility.

Is this still true with the proposed restrictions on what ``yield from`` accepts? (Hopefully I'm the one who is confused. :-)
True for the code that uses @coroutine decorators properly. I'll see what I can do with warnings, but I'll update the section anyways.
Grammar Updates
---------------

Grammar changes are also fairly minimal::

    decorated: decorators (classdef | funcdef | async_funcdef)
    async_funcdef: ASYNC funcdef

    compound_stmt: (if_stmt | while_stmt | for_stmt | try_stmt | with_stmt
                    | funcdef | classdef | decorated | async_stmt)
    async_stmt: ASYNC (funcdef | with_stmt | for_stmt)

    power: atom_expr ['**' factor]
    atom_expr: [AWAIT] atom trailer*
Transition Period Shortcomings ------------------------------ There is just one.
Are you OK with this thing?
Until ``async`` and ``await`` are proper keywords, it is not possible (or at least very hard) to fix ``tokenizer.c`` to recognize them on the same line with the ``def`` keyword::

    # async and await will always be parsed as variables

    async def outer():                          # 1
        def nested(a=(await fut)):
            pass

    async def foo(): return (await fut)         # 2

Since ``await`` and ``async`` in such cases are parsed as ``NAME`` tokens, a ``SyntaxError`` will be raised.

To work around these issues, the above examples can be easily rewritten to a more readable form::

    async def outer():                          # 1
        a_default = await fut
        def nested(a=a_default):
            pass

    async def foo():                            # 2
        return (await fut)

This limitation will go away as soon as ``async`` and ``await`` are proper keywords. Or if it's decided to use a future import for this PEP.

Deprecation Plans
-----------------
``async`` and ``await`` names will be softly deprecated in CPython 3.5
and 3.6. In 3.7 we will transform them to proper keywords. Making
``async`` and ``await`` proper keywords before 3.7 might make it
harder for people to port their code to Python 3.

asyncio
-------

The ``asyncio`` module was adapted and tested to work with coroutines
and new statements. Backwards compatibility is 100% preserved.

The required changes are mainly:

1. Modify the ``@asyncio.coroutine`` decorator to use the new
   ``types.coroutine()`` function.

2. Add the ``__await__ = __iter__`` line to the ``asyncio.Future``
   class.

3. Add ``ensure_task()`` as an alias for the ``async()`` function.
   Deprecate the ``async()`` function.

Design Considerations
=====================

PEP 3152
--------

PEP 3152 by Gregory Ewing proposes a different mechanism for
coroutines (called "cofunctions"). Some key points:

1. A new keyword ``codef``
to declare a cofunction. A cofunction is always a generator, even if
   there is no ``cocall`` expression inside it. Maps to ``async def``
   in this proposal.

2. A new keyword ``cocall`` to call a cofunction. Can only be used
   inside a cofunction. Maps to ``await`` in this proposal (with some
   differences, see below.)

3. It is not possible to call a cofunction without a ``cocall``
   keyword.

4. ``cocall`` grammatically requires parentheses after it::

       atom: cocall | <existing alternatives for atom>
       cocall: 'cocall' atom cotrailer* '(' [arglist] ')'
       cotrailer: '[' subscriptlist ']' | '.' NAME

5. ``cocall f(*args, **kwds)`` is semantically equivalent to
   ``yield from f.__cocall__(*args, **kwds)``.

Differences from this proposal:

1. There is no equivalent of ``__cocall__`` in this PEP, which is
   called and its result is passed to ``yield from`` in the ``cocall``
   expression. The ``await`` keyword expects an awaitable object,
   validates the type, and executes ``yield from`` on it. The
   ``__await__`` method is similar to ``__cocall__``, but is only used
   to define Future-like objects.

2. ``await`` is defined in almost the same way as ``yield from`` in
   the grammar (it is later enforced that ``await`` can only be inside
   ``async def``). It is possible to simply write ``await future``,
   whereas ``cocall`` always requires parentheses.

3. To make asyncio work with PEP 3152 it would be required to modify
   the ``@asyncio.coroutine`` decorator to wrap all functions in an
   object with a ``__cocall__`` method, or to implement ``__cocall__``
   on generators. To call cofunctions from existing generator-based
   coroutines it would be required to use a
   ``costart(cofunc, *args, **kwargs)`` built-in.

4. Since it is impossible to call a cofunction without a ``cocall``
   keyword, it automatically prevents the common mistake of forgetting
   to use ``yield from`` on generator-based coroutines. This proposal
   addresses this problem with a different approach, see `Debugging
   Features`_.

5. A shortcoming of requiring a ``cocall`` keyword to call a coroutine
   is that if it is decided to implement coroutine-generators --
   coroutines with ``yield`` or ``async yield`` expressions -- we
   wouldn't need a ``cocall`` keyword to call them. So we'll end up
   having ``__cocall__`` and no ``__call__`` for regular coroutines,
   and having ``__call__`` and no ``__cocall__`` for
   coroutine-generators.

6. Requiring parentheses grammatically also introduces a whole lot of
   new problems. The following code::

       await fut
       await function_returning_future()
       await asyncio.gather(coro1(arg1, arg2), coro2(arg1, arg2))

   would look like::

       cocall fut()  # or cocall costart(fut)
       cocall (function_returning_future())()
       cocall asyncio.gather(costart(coro1, arg1, arg2),
                             costart(coro2, arg1, arg2))

7. There are no equivalents of ``async for`` and ``async with`` in
   PEP 3152.

Coroutine-generators
--------------------

With the ``async for``
keyword, it is desirable to have a concept of a coroutine-generator --
a coroutine with ``yield`` and ``yield from`` expressions. To avoid
any ambiguity with regular generators, we would likely require an
``async`` keyword before ``yield``, and ``async yield from`` would
raise a ``StopAsyncIteration`` exception.

While it is possible to implement coroutine-generators, we believe
that they are out of scope of this proposal. It is an advanced concept
that should be carefully considered and balanced, with non-trivial
changes in the implementation of current generator objects. This is a
matter for a separate PEP.

No implicit wrapping in Futures
-------------------------------

There is a proposal to add a similar mechanism to ECMAScript 7 [2]. A
key difference is that JavaScript "async functions" always return a
Promise. While this approach has some advantages, it also implies that
a new Promise object is created on each "async function" invocation.

We could implement similar functionality in Python, by wrapping all
coroutines in a Future object, but this has the following
disadvantages:

1. Performance. A new Future object would be instantiated on each
   coroutine call. Moreover, this makes the implementation of
   ``await`` expressions slower (disabling optimizations of
   ``yield from``).

2. A new built-in ``Future`` object would need to be added.

3. Coming up with a generic ``Future`` interface that is usable for
   any use case in any framework is a very hard problem to solve.

4. It is not a feature that is used frequently, when most of the code
   is coroutines.

Why "async" and "await" keywords
--------------------------------

async/await is not a new concept in programming languages:

* C# has had it for a long time [5];

* there is a proposal to add async/await to ECMAScript 7 [2]; see
  also the Traceur project [9];

* Facebook's Hack/HHVM [6];

* Google's Dart language [7];

* Scala [8];

* there is a proposal to add async/await to C++ [10];

* and many other less popular languages.

This is a huge benefit, as some users already have experience with
async/await, and because it makes working with many languages in one
project easier (Python with ECMAScript 7 for instance).

Why "__aiter__" is a coroutine
------------------------------

In principle, ``__aiter__``
could be a regular function. There are several good reasons to make it
a coroutine:

* as most of the ``__anext__``, ``__aenter__``, and ``__aexit__``
  methods are coroutines, users would often make the mistake of
  defining it as ``async`` anyway;

* there might be a need to run some asynchronous operations in
  ``__aiter__``, for instance to prepare DB queries or do some file
  operation.

Importance of "async" keyword
-----------------------------

While it is possible to just implement the ``await`` expression and
treat all functions with at least one ``await`` as coroutines, this
approach makes API design, code refactoring, and long-term support
harder.

Let's pretend that Python only has the ``await`` keyword::

    def useful():
        ...
        await log(...)
        ...

    def important():
        await useful()

If the ``useful()`` function is refactored and someone removes all
``await`` expressions from it, it would become a regular Python
function, and all code that depends on it, including ``important()``,
would be broken. To mitigate this issue a decorator similar to
``@asyncio.coroutine`` has to be introduced.

Why "async def"
---------------

For some people the bare ``async name(): pass``
syntax might look more appealing than ``async def name(): pass``. It
is certainly easier to type. But on the other hand, it breaks the
symmetry between ``async def``, ``async with`` and ``async for``,
where ``async`` is a modifier, stating that the statement is
asynchronous. It is also more consistent with the existing grammar.

Why "async for/with" instead of "await for/with"
------------------------------------------------

``async`` is an adjective, and hence it is a better choice for a
statement qualifier keyword. ``await for/with`` would imply that
something is awaiting the completion of a ``for`` or ``with``
statement.

Why "async def" and not "def async"
-----------------------------------

The ``async`` keyword is a statement qualifier. A good analogy to it
are the "static", "public", and "unsafe" keywords from other
languages. "async for" is an asynchronous "for" statement, "async
with" is an asynchronous "with" statement, "async def" is an
asynchronous function.

Having "async" after the main statement keyword might introduce some
confusion, like "for async item in iterator" can be read as "for each
asynchronous item in iterator".

Having the ``async`` keyword before ``def``, ``with`` and ``for`` also
makes the language grammar simpler. And "async def" better separates
coroutines from regular functions visually.

Why not a __future__ import
---------------------------

``__future__`` imports are inconvenient and easy to forget to add.
Also, they are enabled for the whole source file. Consider that there
is a big project with a popular module named "async.py". With future
imports it is required to either import it using ``__import__()`` or
``importlib.import_module()`` calls, or to rename the module. The
proposed approach makes it possible to continue using old code and
modules without a hassle, while coming up with a migration plan for
future Python versions.

Why magic methods start with "a"
--------------------------------

New asynchronous magic methods ``__aiter__``
, ``__anext__``, ``__aenter__``, and ``__aexit__`` all start with the
same prefix "a". An alternative proposal is to use the "async" prefix,
so that ``__aiter__`` becomes ``__async_iter__``. However, to align
new magic methods with the existing ones, such as ``__radd__`` and
``__iadd__``, it was decided to use a shorter version.

Why not reuse existing magic names
----------------------------------

An alternative idea about new asynchronous iterators and context
managers was to reuse existing magic methods, by adding an ``async``
keyword to their declarations::

    class CM:
        async def __enter__(self):  # instead of __aenter__
            ...

This approach has the following downsides:

* it would not be possible to create an object that works in both
  ``with`` and ``async with`` statements;

* it would break backwards compatibility, as nothing prohibits
  returning a Future-like object from ``__enter__`` and/or
  ``__exit__`` in Python <= 3.4;

* one of the main points of this proposal is to make coroutines as
  simple and foolproof as possible, hence the clear separation of the
  protocols.

Why not reuse existing "for" and "with" statements
--------------------------------------------------

The vision behind existing generator-based coroutines and this
proposal is to make it easy for users to see where the code might be
suspended. Making existing "for" and "with" statements recognize
asynchronous iterators and context managers will inevitably create
implicit suspend points, making it harder to reason about the code.

Comprehensions
--------------

For the sake of restricting the broadness of this PEP there is no new
syntax for asynchronous comprehensions. This should be considered in
a separate PEP, if there is a strong demand for this feature.

Async lambdas
-------------

Lambda coroutines are not part of this proposal. In this proposal they
would look like ``async lambda(parameters): expression``
. Unless there is a strong demand to have them as part of this
proposal, it is recommended to consider them later in a separate PEP.

Performance
===========

Overall Impact
--------------

This proposal introduces no observable performance impact. Here is an
output of python's official set of benchmarks [4]:

::

    python perf.py -r -b default ../cpython/python.exe
    ../cpython-aw/python.exe

    [skipped]

    Report on Darwin ysmac 14.3.0 Darwin Kernel Version 14.3.0:
    Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
    x86_64 i386

    Total CPU cores: 8

    ### etree_iterparse ###

    Min: 0.365359 -> 0.349168: 1.05x faster
    Avg: 0.396924 -> 0.379735: 1.05x faster
    Significant (t=9.71)
    Stddev: 0.01225 -> 0.01277: 1.0423x larger

    The following not significant results are hidden, use -v to show
    them: django_v2, 2to3, etree_generate, etree_parse, etree_process,
    fastpickle, fastunpickle, json_dump_v2, json_load, nbody, regex_v8,
    tornado_http.

Tokenizer modifications
-----------------------

There is no observable slowdown of parsing python files with the
modified tokenizer: parsing of one 12Mb file
(``Lib/test/test_binop.py`` repeated 1000 times) takes the same amount
of time.

async/await
-----------

The following micro-benchmark was used to determine the performance
difference between "async" functions and generators::

    import sys
    import time

    def binary(n):
        if n <= 0:
            return 1
        l = yield from binary(n - 1)
        r = yield from binary(n - 1)
        return l + 1 + r

    async def abinary(n):
        if n <= 0:
            return 1
        l = await abinary(n - 1)
        r = await abinary(n - 1)
        return l + 1 + r

    def timeit(gen, depth, repeat):
        t0 = time.time()
        for _ in range(repeat):
            list(gen(depth))
        t1 = time.time()
        print('{}({}) * {}: total {:.3f}s'.format(
            gen.__name__, depth, repeat, t1 - t0))

The result is that there is no observable performance difference.
Minimum timing of 3 runs::

    abinary(19) * 30: total 12.985s
    binary(19) * 30: total 12.953s

Note that a depth of 19 means 1,048,575 calls.

Reference Implementation
========================

The reference implementation can be found here: [3].

List of high-level changes and new protocols
--------------------------------------------

1. New syntax for defining coroutines: ``async def``
and the new ``await`` keyword.

2. New ``__await__`` method for Future-like objects, and a new
   ``tp_await`` slot in ``PyTypeObject``.

3. New syntax for asynchronous context managers: ``async with``. And
   the associated protocol with ``__aenter__`` and ``__aexit__``
   methods.

4. New syntax for asynchronous iteration: ``async for``. And the
   associated protocol with ``__aiter__``, ``__anext__`` and the new
   built-in exception ``StopAsyncIteration``.

5. New AST nodes: ``AsyncFunctionDef``, ``AsyncFor``, ``AsyncWith``,
   ``Await``.

6. New functions: ``sys.set_coroutine_wrapper(callback)``,
   ``sys.get_coroutine_wrapper()``, ``types.coroutine(gen)``,
   ``inspect.iscoroutinefunction()``, and ``inspect.iscoroutine()``.

7. New ``CO_COROUTINE`` bit flag for code objects.

While the list of changes and new things is not short, it is important
to understand that most users will not use these features directly.
They are intended to be used in frameworks and libraries to provide
users with convenient and unambiguous APIs with ``async def``,
``await``, ``async for`` and ``async with`` syntax.

Working example
---------------

All concepts proposed in this PEP are implemented [3] and can be
tested.

::

    import asyncio

    async def echo_server():
        print('Serving on localhost:8000')
        await asyncio.start_server(handle_connection,
                                   'localhost', 8000)

    async def handle_connection(reader, writer):
        print('New connection...')

        while True:
            data = await reader.read(8192)

            if not data:
                break

            print('Sending {:.10}... back'.format(repr(data)))
            writer.write(data)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(echo_server())
    try:
        loop.run_forever()
    finally:
        loop.close()

References
==========

.. [1] https://docs.python.org/3/library/asyncio-task.html#asyncio.coroutine

.. [2] http://wiki.ecmascript.org/doku.php?id=strawman:async_functions

.. [3] https://github.com/1st1/cpython/tree/await

.. [4] https://hg.python.org/benchmarks

.. [5] https://msdn.microsoft.com/en-us/library/hh191443.aspx

.. [6] http://docs.hhvm.com/manual/en/hack.async.php

.. [7] https://www.dartlang.org/articles/await-async/

.. [8] http://docs.scala-lang.org/sips/pending/async.html

.. [9] https://github.com/google/traceur-compiler/wiki/LanguageFeatures#async-functions-experimental

.. [10] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3722.pdf (PDF)

Acknowledgments
===============

I thank Guido van Rossum, Victor Stinner, Elvis Pranskevichus, Andrew
Svetlov, and Łukasz Langa for their initial feedback.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: