[Python-Dev] PEP 525, third round, better finalization
Yury Selivanov yselivanov.ml at gmail.com
Sat Sep 3 15:13:14 EDT 2016
- Previous message (by thread): [Python-Dev] PEP 525, third round, better finalization
- Next message (by thread): [Python-Dev] Need help in debugging the python core
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Nathaniel,
On 2016-09-02 2:13 AM, Nathaniel Smith wrote:
On Thu, Sep 1, 2016 at 3:34 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
Hi,
>> I've spent quite a while thinking and experimenting with PEP 525, trying to figure out how to make asynchronous generator (AG) finalization reliable. I've tried to replace the callback for GCed AGs with a callback that intercepts the first iteration of AGs. It turns out it's very hard to work with weak refs and to make the asyncio event loop reliably track and shut down all open AGs. My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)" function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".

> 1) Can/should these hooks be used by other types besides async generators? (e.g., async iterators that are not async generators?) What would that look like?
Asynchronous iterators (classes implementing __aiter__, __anext__) should use __del__ for any cleanup purposes.
sys.set_asyncgen_hooks only supports asynchronous generators.
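As a concrete illustration (not part of the original mail), here is a minimal sketch of the hook API being described: the firstiter hook fires the first time an async generator is advanced, which is what lets an event loop start tracking it. Passing only firstiter leaves the current finalizer hook untouched.

    import asyncio
    import sys

    events = []

    def firstiter(agen):
        # Called the first time an async generator is advanced; an event
        # loop would use this to start tracking the generator.
        events.append("firstiter:" + agen.ag_code.co_name)

    async def squares():
        for n in range(3):
            yield n * n

    async def main():
        # Install only the firstiter hook; the loop's finalizer stays in place.
        sys.set_asyncgen_hooks(firstiter=firstiter)
        return [value async for value in squares()]

    print(asyncio.run(main()))  # [0, 1, 4]
    print(events)               # ['firstiter:squares']

Note that the hook fires exactly once per generator, on the first __anext__ call, and never for plain async iterator classes.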
> 2) In the asyncio design it's legal for an event loop to be stopped and then started again. Currently (I guess for this reason?) asyncio event loops do not forcefully clean up resources associated with them on shutdown. For example, if I open a StreamReader, loop.stop() and loop.close() will not automatically close it for me. When, concretely, are you imagining that asyncio will run these finalizers?
I think we will add another API method to the asyncio event loop, which users will call before closing the loop. In my reference implementation I added a synchronous loop.shutdown() method.
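For reference, the API that eventually shipped in Python 3.6 is loop.shutdown_asyncgens(), a coroutine rather than a synchronous method. A small sketch of that shutdown step (illustrative, not from the original mail):

    import asyncio

    events = []

    async def ticker():
        try:
            n = 0
            while True:
                yield n
                n += 1
        finally:
            # Asynchronous cleanup can run here, because the loop awaits
            # aclose() during shutdown.
            events.append("ticker closed")

    agen = ticker()

    async def start():
        # Advance the generator once, then abandon it mid-iteration.
        return await agen.__anext__()

    loop = asyncio.new_event_loop()
    try:
        first = loop.run_until_complete(start())
        # Close every async generator the loop is still tracking.
        loop.run_until_complete(loop.shutdown_asyncgens())
    finally:
        loop.close()

    print(first, events)  # 0 ['ticker closed']

Without the shutdown_asyncgens() call, the generator's finally block would only run at GC time, after the loop is already closed.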
> 3) Should the cleanup code in the generator be able to distinguish between "this iterator has left scope" versus "the event loop is being violently shut down"?
This is already handled in the reference implementation. When an AG is iterated for the first time, the loop starts tracking it by adding it to a weak set. When the AG is about to be GCed, the loop removes it from the weak set, and schedules its 'aclose()'.
If 'loop.shutdown()' is called, it means that the loop is being "violently shut down", so we schedule 'aclose()' for all AGs in the weak set.
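The tracking scheme described here can be sketched in pure Python (hypothetical names; the real bookkeeping lives inside the event loop):

    import asyncio
    import sys
    import weakref

    # Stand-in for the loop's internal weak set of live async generators,
    # populated by the firstiter hook.
    tracked = weakref.WeakSet()
    closed = []

    def firstiter(agen):
        tracked.add(agen)

    async def violent_shutdown():
        # The 'loop.shutdown()' step from the mail: aclose() everything
        # that is still alive in the weak set.
        for agen in list(tracked):
            await agen.aclose()
            closed.append(agen.ag_code.co_name)

    async def stream():
        while True:
            yield "data"

    async def main():
        sys.set_asyncgen_hooks(firstiter=firstiter)
        agen = stream()
        await agen.__anext__()    # first iteration -> firstiter -> tracked
        await violent_shutdown()  # close all open AGs before the loop dies

    asyncio.run(main())
    print(closed)  # ['stream']

Because the set holds only weak references, generators that are GCed normally simply drop out of it and are handled by the finalizer hook instead.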
> 4) More fundamentally -- this revision is definitely an improvement, but it doesn't really address the main concern I have. Let me see if I can restate it more clearly. Let's define 3 levels of cleanup handling:
>
> Level 0: resources (e.g. file descriptors) cannot be reliably cleaned up.
> Level 1: resources are cleaned up reliably, but at an unpredictable time.
> Level 2: resources are cleaned up both reliably and promptly.
>
> In Python 3.5, unless you're very anal about writing cumbersome 'async with' blocks around every single 'async for', resources owned by async iterators land at level 0. (Because the only cleanup method available is __del__, and __del__ cannot make async calls, so if you need async calls to do cleanup then you're just doomed.) I think that the revised draft does a good job of moving async generators from level 0 to level 1 -- the finalizer hook gives a way to effectively call back into the event loop from __del__, and the shutdown hook gives us a way to guarantee that the cleanup happens while the event loop is still running.

Right. It's good to hear that you agree that the latest revision of the PEP makes AG cleanup reliable (albeit unpredictable as to exactly when it will happen; more on that below).
My goal was exactly this - to make the mechanism reliable, with the same predictability as what we have for __del__.
> But... IIUC, it's now generally agreed that for Python code, level 1 is simply not good enough. (Or to be a little more precise, it's good enough for the case where the resource being cleaned up is memory, because the garbage collector knows when memory is short, but it's not good enough for resources like file descriptors.) The classic example of this is code like:
I think this is where I don't agree with you 100%. There are no strict guarantees that an object will be GCed in a timely manner in CPython or PyPy. If it's part of a reference cycle, it might not be cleaned up at all.
All in all, in your examples I don't see exactly where AGs differ from, say, synchronous generators.
For instance:
    async def read_json_lines_from_server(host, port):
        reader, _writer = await asyncio.open_connection(host, port)
        async for line in reader:
            yield json.loads(line)
You would expect to use this like:

    async for data in read_json_lines_from_server(host, port):
        ...
If you rewrite the above code without the 'async' keyword, you'd have a synchronous generator with exactly the same problems.
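The synchronous analogue of the problem is easy to demonstrate (illustrative sketch, not from the original mail): a generator abandoned mid-iteration runs its finally block only when it is closed or garbage-collected, and it is CPython's reference counting that makes that happen promptly.

    import gc

    events = []

    def read_lines():
        try:
            yield "line 1"
            yield "line 2"
        finally:
            # Runs only when the generator is exhausted, closed, or GCed.
            events.append("cleaned up")

    g = read_lines()
    next(g)              # start iterating, then abandon mid-iteration
    assert events == []  # cleanup has not run yet

    del g                # CPython's refcounting closes it promptly here;
    gc.collect()         # other implementations may need a GC cycle first
    print(events)  # ['cleaned up']

Synchronous generators thus also sit at "level 1": cleanup is reliable, but its timing depends on the implementation's GC behavior.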
> tl;dr: AFAICT this revision of PEP 525 is enough to make it work reliably on CPython, but I have serious concerns that it bakes a CPython-specific design into the language. I would prefer a design that actually aims for "level 2" cleanup semantics (for example, [1])
I honestly don't see why PEP 525 can't be implemented in PyPy. The finalization mechanism is built on top of the existing finalization of synchronous generators, which is already implemented in PyPy.
The design of PEP 525 doesn't exploit any CPython-specific features (like ref counting). If an alternative implementation of the Python interpreter implements __del__ semantics properly, it shouldn't have any problems implementing PEP 525.
Thank you, Yury