[Python-Dev] PEP 525, third round, better finalization
Yury Selivanov yselivanov.ml at gmail.com
Thu Sep 1 18:34:06 EDT 2016
Hi,
I've spent quite a while thinking about and experimenting with PEP 525, trying to figure out how to make finalization of asynchronous generators (AGs) reliable. I tried replacing the callback for GCed AGs with a callback that intercepts the first iteration of an AG. It turns out that it's very hard to work with weak-refs and make the asyncio event loop reliably track and shut down all open AGs.
My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)" function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".
This design allows us to:
1. Intercept the first iteration of an AG. That makes it possible for event loops to keep a weak set of all "open" AGs, and to implement a "shutdown" method to close the loop and close all AGs reliably.
2. Intercept the GC of AGs. That makes it possible to call "aclose" on GCed AGs to guarantee that 'finally' and 'async with' statements are properly closed.
3. Add more hooks in later Python versions, although I can't think of anything else we need to add right now.
I'm posting only the updated PEP section below. The latest PEP revision should also be available on python.org shortly.
All new proposed changes are available to play with in my fork of CPython here: https://github.com/1st1/cpython/tree/async_gen
Finalization
PEP 492 requires an event loop or a scheduler to run coroutines. Because asynchronous generators are meant to be used from coroutines, they also require an event loop to run and finalize them.
Asynchronous generators can have ``try..finally`` blocks, as well as
``async with``. It is important to provide a guarantee that, even
when partially iterated, and then garbage collected, generators can
be safely finalized. For example::

    async def square_series(con, to):
        async with con.transaction():
            cursor = con.cursor(
                'SELECT generate_series(0, $1) AS i', to)
            async for row in cursor:
                yield row['i'] ** 2

    async for i in square_series(con, 1000):
        if i == 100:
            break

The above code defines an asynchronous generator that uses
``async with`` to iterate over a database cursor in a transaction.
The generator is then iterated over with ``async for``, which interrupts
the iteration at some point.
The ``square_series()`` generator will then be garbage collected,
and without a mechanism to asynchronously close the generator, the Python
interpreter would not be able to run the pending ``async with`` cleanup.
To solve this problem we propose to do the following:

1. Implement an ``aclose`` method on asynchronous generators
   returning a special awaitable. When awaited it throws a
   ``GeneratorExit`` into the suspended generator and iterates over
   it until either a ``GeneratorExit`` or a ``StopAsyncIteration``
   occurs.

   This is very similar to what the ``close()`` method does to regular
   Python generators, except that an event loop is required to execute
   ``aclose()``.

2. Raise a ``RuntimeError`` when an asynchronous generator executes
   a ``yield`` expression in its ``finally`` block (using ``await``
   is fine, though)::

        async def gen():
            try:
                yield
            finally:
                await asyncio.sleep(1)   # Can use 'await'.

                yield                    # Cannot use 'yield',
                                         # this line will trigger a
                                         # RuntimeError.

3. Add two new methods to the ``sys`` module:
   ``set_asyncgen_hooks()`` and ``get_asyncgen_hooks()``.

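
To make the first step concrete, here is a minimal sketch of closing a partially iterated generator explicitly. It assumes an interpreter that already ships async generators with an ``aclose()`` method (CPython 3.7 or later, for ``asyncio.run``); the names ``ticker`` and ``events`` are made up for illustration.

```python
import asyncio

events = []  # records what happened, in order


async def ticker():
    try:
        yield 1
        yield 2
    finally:
        # 'await' is allowed while the generator is being finalized.
        await asyncio.sleep(0)
        events.append("cleanup")


async def main():
    gen = ticker()
    # Iterate only partially: the generator is now suspended at 'yield 1'.
    events.append(await gen.__anext__())
    # aclose() throws GeneratorExit into the generator, so 'finally' runs.
    await gen.aclose()
    events.append("closed")


asyncio.run(main())
```

The cleanup entry is recorded before ``aclose()`` returns, which is the guarantee the hooks described in this section are meant to provide automatically for generators that are never closed by hand.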
The idea behind ``sys.set_asyncgen_hooks()`` is to allow event
loops to intercept asynchronous generator iteration and finalization,
so that the end user does not need to care about the finalization
problem, and everything just works.
``sys.set_asyncgen_hooks()`` accepts two arguments:

* ``firstiter``: a callable which will be called when an asynchronous
  generator is iterated for the first time.

* ``finalizer``: a callable which will be called when an asynchronous
  generator is about to be GCed.

When an asynchronous generator is iterated for the first time,
it stores a reference to the current finalizer. If there is none,
a ``RuntimeError`` is raised. This provides a strong guarantee that
every asynchronous generator object will always have a finalizer
installed by the correct event loop.
When an asynchronous generator is about to be garbage collected,
it calls its cached finalizer. The assumption is that the finalizer
will schedule an ``aclose()`` call with the loop that was active
when the iteration started.
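
The sequence described above (first iteration, garbage collection, scheduled ``aclose()``) can be observed with a small sketch. It assumes CPython 3.7 or later, where ``sys.set_asyncgen_hooks()`` exists; ``numbers``, ``log`` and the two hook functions are illustrative names, and the ``del`` plus ``gc.collect()`` pair is only there to force collection at a known point.

```python
import asyncio
import gc
import sys

log = []  # order in which the hooks and the cleanup code ran


async def numbers():
    try:
        yield 1
        yield 2
    finally:
        log.append("finally ran")


async def main():
    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def firstiter(gen):
        # Called once, when the generator is iterated for the first time.
        log.append("firstiter")

    def finalizer(gen):
        # Called when the unfinished generator is about to be collected.
        log.append("finalizer")

        async def close():
            await gen.aclose()
            done.set_result(None)

        loop.create_task(close())

    old_hooks = sys.get_asyncgen_hooks()
    sys.set_asyncgen_hooks(firstiter=firstiter, finalizer=finalizer)
    try:
        gen = numbers()
        await gen.__anext__()   # partial iteration
        del gen                 # drop the last reference
        gc.collect()            # for interpreters without refcounting
        await asyncio.wait_for(done, timeout=5)
    finally:
        sys.set_asyncgen_hooks(*old_hooks)


asyncio.run(main())
```

Note that the finalizer only schedules the ``aclose()`` call; the ``finally`` block runs later, when the loop executes the scheduled task.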
For instance, here is how asyncio is modified to allow safe finalization of asynchronous generators::

    # asyncio/base_events.py

    class BaseEventLoop:

        def run_forever(self):
            ...
            old_hooks = sys.get_asyncgen_hooks()
            sys.set_asyncgen_hooks(finalizer=self._finalize_asyncgen)
            try:
                ...
            finally:
                sys.set_asyncgen_hooks(*old_hooks)
                ...

        def _finalize_asyncgen(self, gen):
            self.create_task(gen.aclose())

The other hook, ``firstiter``, allows event loops to maintain
a weak set of asynchronous generators instantiated under their control.
This makes it possible to implement "shutdown" mechanisms to safely
finalize all open generators and close the event loop.
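
One possible shape of such a "shutdown" mechanism is sketched below. This is not the asyncio implementation: ``AsyncGenTracker`` and its ``shutdown()`` method are hypothetical names used only to show how a ``firstiter`` hook and a ``weakref.WeakSet`` fit together, assuming CPython 3.7 or later.

```python
import asyncio
import sys
import weakref


class AsyncGenTracker:
    """Hypothetical helper: tracks generators first iterated while active."""

    def __init__(self):
        self._open = weakref.WeakSet()

    def _firstiter(self, gen):
        # Weak reference only: tracking must not keep generators alive.
        self._open.add(gen)

    def __enter__(self):
        self._old = sys.get_asyncgen_hooks()
        # Keep whatever finalizer was installed, replace only firstiter.
        sys.set_asyncgen_hooks(firstiter=self._firstiter,
                               finalizer=self._old.finalizer)
        return self

    def __exit__(self, *exc_info):
        sys.set_asyncgen_hooks(*self._old)

    async def shutdown(self):
        # Close every generator that is still alive; return how many.
        closing = [gen.aclose() for gen in list(self._open)]
        self._open.clear()
        await asyncio.gather(*closing)
        return len(closing)


cleaned = []


async def resource(name):
    try:
        while True:
            yield name
    finally:
        cleaned.append(name)


async def main():
    with AsyncGenTracker() as tracker:
        a, b = resource("a"), resource("b")
        await a.__anext__()
        await b.__anext__()
        return await tracker.shutdown()


closed_count = asyncio.run(main())
```

Both generators are still suspended inside their ``while`` loops when ``shutdown()`` runs, so each ``finally`` block executes only because the tracker closed it.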
``sys.set_asyncgen_hooks()`` is thread-specific, so several event
loops running in parallel threads can use it safely.
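
The per-thread behaviour can be checked with a short sketch, assuming CPython 3.6 or later; ``worker_hook`` and the result variables are illustrative names.

```python
import sys
import threading


def worker_hook(gen):
    """Placeholder firstiter hook installed only in the worker thread."""


seen_in_worker = []


def worker():
    sys.set_asyncgen_hooks(firstiter=worker_hook)
    seen_in_worker.append(sys.get_asyncgen_hooks().firstiter)


main_before = sys.get_asyncgen_hooks().firstiter
thread = threading.Thread(target=worker)
thread.start()
thread.join()
# The hook set inside the worker thread is not visible here.
main_after = sys.get_asyncgen_hooks().firstiter
```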
``sys.get_asyncgen_hooks()`` returns a namedtuple-like structure
with ``firstiter`` and ``finalizer`` fields.
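
Because the returned structure behaves like a tuple, it can be read by field name or unpacked straight back into ``sys.set_asyncgen_hooks()``. The sketch below assumes CPython 3.6 or later; ``on_first_iter`` and ``on_finalize`` are placeholder hooks.

```python
import sys


def on_first_iter(gen):
    """Placeholder firstiter hook."""


def on_finalize(gen):
    """Placeholder finalizer hook."""


saved = sys.get_asyncgen_hooks()
sys.set_asyncgen_hooks(firstiter=on_first_iter, finalizer=on_finalize)
try:
    hooks = sys.get_asyncgen_hooks()
    by_name = (hooks.firstiter, hooks.finalizer)   # attribute access
    by_position = tuple(hooks)                     # tuple-style access
finally:
    # Unpacking restores both hooks in their original order.
    sys.set_asyncgen_hooks(*saved)

restored = sys.get_asyncgen_hooks()
```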