[Python-Dev] PEP 492: What is the real goal?

Yury Selivanov yselivanov.ml at gmail.com
Wed Apr 29 20:06:23 CEST 2015


Hi Jim,

On 2015-04-29 1:43 PM, Jim J. Jewett wrote:

> On Tue Apr 28 23:49:56 CEST 2015, Guido van Rossum quoted PEP 492:
>
>> Rationale and Goals
>> ===================
>>
>> Current Python supports implementing coroutines via generators
>> (PEP 342), further enhanced by the yield from syntax introduced in
>> PEP 380. This approach has a number of shortcomings:
>>
>> * it is easy to confuse coroutines with regular generators, since
>>   they share the same syntax; async libraries often attempt to
>>   alleviate this by using decorators (e.g. @asyncio.coroutine [1]);
>
> So? PEP 492 never says what coroutines are in a way that explains why
> it matters that they are different from generators.
>
> Do you really mean "coroutines that can be suspended while they wait
> for something slow"?
>
> As best I can guess, the difference seems to be that a "normal"
> generator is using yield primarily to say: "I'm not done; I have more
> values when you want them", but an asynchronous (PEP 492) coroutine is
> primarily saying: "This might take a while, go ahead and do something
> else meanwhile."

Correct.
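
For concreteness, here's a rough sketch of that distinction using the
pre-PEP 492 spelling (the function names are mine, not from the PEP, and
it assumes the asyncio of that era, where @asyncio.coroutine still exists):

    import asyncio

    # A regular generator: 'yield' means "here is the next value".
    def countdown(n):
        while n:
            yield n
            n -= 1

    # A generator-based coroutine: suspending at 'yield from' means
    # "this may take a while, run something else meanwhile".
    @asyncio.coroutine
    def fetch_data():
        yield from asyncio.sleep(0.1)
        return 'data'

    print(list(countdown(3)))                      # [3, 2, 1]
    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(fetch_data()))   # 'data'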

>> As shown later in this proposal, the new ``async with`` statement
>> lets Python programs perform asynchronous calls when entering and
>> exiting a runtime context, and the new ``async for`` statement makes
>> it possible to perform asynchronous calls in iterators.
>
> Does it really permit making them, or does it just signal that you
> will be waiting for them to finish processing anyhow, and it doesn't
> need to be a busy-wait?

It does.

> As nearly as I can tell, "async with" doesn't start processing the
> managed block until the "asynchronous" call finishes its work -- the
> only point of the async is to signal a scheduler that the task is
> blocked.

Right.
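
A minimal sketch of that sequencing (the class and names are illustrative,
not from the PEP):

    import asyncio

    class SlowSetup:
        async def __aenter__(self):
            await asyncio.sleep(0.1)   # pretend the setup takes a while
            print('setup finished')
            return self

        async def __aexit__(self, exc_type, exc, tb):
            print('teardown finished')

    async def task():
        async with SlowSetup():
            # The body does not start until __aenter__ has completed;
            # the awaits above are simply points where the event loop
            # may run other tasks.
            print('body runs now')

    asyncio.get_event_loop().run_until_complete(task())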

> Similarly, "async for" is still linearized, with each step waiting
> until the previous "asynchronous" step was not merely launched, but
> fully processed. If anything, it prevents within-task parallelism.

It enables cooperative parallelism.
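
That is: within one task the steps stay sequential, but at every await the
event loop may switch to another task. A small sketch of two tasks
interleaving (illustrative names, not from the PEP):

    import asyncio

    async def ticker(name, n):
        for i in range(n):
            await asyncio.sleep(0.1)   # suspension point: the other task can run here
            print(name, i)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.gather(ticker('a', 3), ticker('b', 3)))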

>> It uses the yield from implementation with an extra step of
>> validating its argument. await only accepts an awaitable, which can
>> be one of:
>
> What justifies this limitation?

We want to avoid people passing regular generators and random objects to 'await', because it is a bug.
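
This is what that looks like in practice; a sketch of the error you get
when a bare generator reaches 'await' (names are illustrative):

    import asyncio

    def plain_gen():
        yield 'not awaitable'

    async def demo():
        await asyncio.sleep(0)      # fine: coroutines and Futures are awaitable
        try:
            await plain_gen()       # a bare generator is rejected by 'await'
        except TypeError as e:
            print('rejected:', e)

    asyncio.get_event_loop().run_until_complete(demo())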

> Is there anything wrong awaiting something that eventually uses
> "return" instead of "yield", if the "this might take a while" signal
> is still true?

If it's an 'async def' then sure, you can use it in await.
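
For example, this is perfectly fine, even though the awaited coroutine
never suspends (illustrative names):

    import asyncio

    async def compute():
        return 42          # no suspension at all; still awaitable because it's 'async def'

    async def main():
        print(await compute())    # prints 42

    asyncio.get_event_loop().run_until_complete(main())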

> Is the problem just that the current implementation might not take
> proper advantage of task-switching?
>
>> Objects with __await__ method are called Future-like objects in the
>> rest of this PEP.
>>
>> Also, please note that __aiter__ method (see its definition below)
>> cannot be used for this purpose. It is a different protocol, and
>> would be like using __iter__ instead of __call__ for regular
>> callables.
>>
>> It is a TypeError if __await__ returns anything but an iterator.
>
> What would be wrong if a class just did __await__ = __anext__ ?
>
> If the problem is that the result of __await__ should be iterable,
> then why isn't __await__ = __aiter__ OK?

For coroutines in PEP 492:

__await__ = __anext__ is the same as __call__ = __next__
__await__ = __aiter__ is the same as __call__ = __iter__
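
To show what a Future-like object looks like, here's a minimal sketch of
__await__ returning an iterator (the class is mine, not from the PEP):

    import asyncio

    class Ready:
        """A minimal Future-like object: __await__ returns an iterator."""
        def __init__(self, value):
            self.value = value

        def __await__(self):
            # Make this a generator function that completes immediately;
            # the returned value becomes the result of the 'await' expression.
            if False:
                yield
            return self.value

    async def main():
        print(await Ready(42))    # prints 42

    asyncio.get_event_loop().run_until_complete(main())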

>> await keyword is defined differently from yield and ``yield from``.
>> The main difference is that await expressions do not require
>> parentheses around them most of the times.
>
> Does that mean "The await keyword has slightly higher precedence than
> yield, so that fewer expressions require parentheses"?
>
>>     class AsyncContextManager:
>>         async def __aenter__(self):
>>             await log('entering context')
>
> Other than the arbitrary "keyword must be there" limitations imposed
> by this PEP, how is that different from:
>
>     class AsyncContextManager:
>         async def __aenter__(self):
>             log('entering context')

This is OK. The point is that you can use 'await log' in __aenter__.
If you don't need awaits in __aenter__ you can use them in __aexit__.
If you don't need them there too, then just define a regular context
manager.

> or even:
>
>     class AsyncContextManager:
>         def __aenter__(self):
>             log('entering context')
>
> Will anything different happen when calling __aenter__ or log? Is it
> that log itself now has more freedom to let other tasks run in the
> middle?

__aenter__ must return an awaitable.
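
A sketch of the difference (names are mine): with 'async def', calling
__aenter__ produces a coroutine object, i.e. an awaitable, whether or not
the body awaits anything; the plain 'def' variant returns None and fails.

    import asyncio

    async def log(msg):
        print(msg)

    class Good:
        async def __aenter__(self):          # returns a coroutine object: awaitable
            await log('entering context')
            return self

        async def __aexit__(self, exc_type, exc, tb):
            await log('exiting context')

    class Bad:
        def __aenter__(self):                # returns None, not an awaitable;
            log('entering context')          # calling log() without await just
                                             # creates a coroutine and drops it

        def __aexit__(self, exc_type, exc, tb):
            pass

    async def main():
        async with Good():
            print('ok')
        try:
            async with Bad():
                pass
        except TypeError as e:
            print('Bad fails:', e)

    asyncio.get_event_loop().run_until_complete(main())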

>> It is an error to pass a regular context manager without __aenter__
>> and __aexit__ methods to async with. It is a SyntaxError to use
>> async with outside of a coroutine.
>
> Why? Does that just mean they won't take advantage of the freedom you
> offered them?

Not sure I understand the question.

It doesn't make any sense to use 'async with' outside of a coroutine.
The interpreter won't know what to do with it: you need an event loop
for that.

> Or are you concerned that they are more likely to cooperate badly
> with the scheduler in practice?

>> It is a TypeError to pass a regular iterable without __aiter__
>> method to async for. It is a SyntaxError to use async for outside of
>> a coroutine.
>
> The same questions about why -- what is the harm?
>
>> The following code illustrates new asynchronous iteration protocol::

>>     class Cursor:
>>         def __init__(self):
>>             self.buffer = collections.deque()
>>
>>         async def _prefetch(self):
>>             ...
>>
>>         async def __aiter__(self):
>>             return self
>>
>>         async def __anext__(self):
>>             if not self.buffer:
>>                 self.buffer = await self._prefetch()
>>             if not self.buffer:
>>                 raise StopAsyncIteration
>>             return self.buffer.popleft()
>>
>> then the Cursor class can be used as follows::
>>
>>     async for row in Cursor():
>>         print(row)
>
> Again, I don't see what this buys you except that a scheduler has been
> signaled that it is OK to pre-empt between rows. That is worth
> signaling, but I don't see why a regular iterator should be forbidden.

It's not about signaling. It's about allowing cooperative scheduling of long-running processes.
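
Here's a self-contained sketch of what that buys you: while one task is
inside the 'async for', another task keeps running during every awaited
step. The names are mine, and note that released Pythons ended up
expecting __aiter__ to be a plain method returning the iterator directly:

    import asyncio

    class Numbers:
        """Self-contained stand-in for the Cursor above (illustrative only)."""
        def __init__(self, n):
            self.i, self.n = 0, n

        def __aiter__(self):
            return self

        async def __anext__(self):
            if self.i >= self.n:
                raise StopAsyncIteration
            await asyncio.sleep(0.1)   # a suspension point, like _prefetch would be
            self.i += 1
            return self.i

    async def consume():
        async for x in Numbers(3):
            print('row', x)

    async def heartbeat():
        for _ in range(4):
            await asyncio.sleep(0.05)
            print('another task still running')

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.gather(consume(), heartbeat()))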

>> For debugging this kind of mistakes there is a special debug mode in
>> asyncio, in which @coroutine decorator wraps all functions with a
>> special object with a destructor logging a warning. ...
>>
>> The only problem is how to enable these debug capabilities. Since
>> debug facilities should be a no-op in production mode, @coroutine
>> decorator makes the decision of whether to wrap or not to wrap based
>> on an OS environment variable PYTHONASYNCIODEBUG.
>
> So the decision is made at compile-time, and can't be turned on
> later?
>
> Then what is wrong with just offering an alternative @coroutine that
> can be used to override the builtin? Or why not just rely on
> set_coroutine_wrapper entirely, and simply set it to None (so no
> wasted wrappings) by default?

It is set to None by default. Will clarify that in the PEP.
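
For reference, a sketch of how the existing asyncio debug mode is used
(the script and function names are illustrative; the decision happens when
@asyncio.coroutine runs, based on the environment variable):

    # Run as:  PYTHONASYNCIODEBUG=1 python example.py
    import asyncio

    @asyncio.coroutine              # in debug mode this wraps the generator in an
    def forgot_to_await():          # object whose destructor logs a warning with
        yield from asyncio.sleep(0) # the traceback of where it was created

    forgot_to_await()               # created but never awaited / yielded from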

Thanks,
Yury


