[Python-Dev] A more flexible task creation

Michel Desmoulin desmoulinmichel at gmail.com
Wed Jul 11 09:13:02 EDT 2018


To be honest, I see "async with" being abused everywhere in asyncio lately. I like to have objects with start() and stop() methods, but everywhere I see async context managers. Fine, add nursery or whatever, but please also have a simple start()/stop() public API.

"async with" is only good for functional programming.  If you want to go more of an object-oriented style, you tend to have start() and stop() methods in your classes, which will call start() & stop() (or close()) methods recursively on nested resources.  So of the libraries (aiopg, I'm looking at you) don't support start/stop or open/close well.

Wouldn't calling __aenter__ and __aexit__ manually work for you? I started coding begin() and stop(), but I removed them, as I couldn't find a use case for them.
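Something along these lines (Runner here is just a made-up example class, not part of asyncio or ayo):

import asyncio

class Runner:
    # A made-up resource that only exposes an async context manager.
    async def __aenter__(self):
        print('started')
        return self

    async def __aexit__(self, exc_type, exc, tb):
        print('stopped')

async def main():
    runner = Runner()
    await runner.__aenter__()        # plays the role of await runner.start()
    try:
        await asyncio.sleep(0.1)     # ... do some work ...
    finally:
        await runner.__aexit__(None, None, None)   # plays the role of await runner.stop()

asyncio.run(main())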

And what exactly is the use case that doesn't work with async with? The whole point is to make the boundaries of the tasks' execution easy to spot. If you start()/stop() at arbitrary points, it kind of defeats the purpose.

It's a genuine question though. I can totally accept I overlooked a valid use case.

I tend to slightly agree, but OTOH if asyncio had been designed to not schedule tasks automatically on init I bet there would have been other users complaining that "why didn't task XX run?", or "why do tasks need a start() method, that is clunky!".  You can't please everyone...

Well, ensure_future([schedule_immediately=True]) and asyncio.create_task([schedule_immediately=True]) would take care of that. They are the entry points for task creation and scheduling.
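That keyword doesn't exist today, of course; but as a rough sketch of the same idea with the current API, you can hold on to the coroutine and only call create_task() when you actually want it scheduled:

import asyncio

async def job():
    await asyncio.sleep(0.01)
    return 'done'

async def main():
    coro = job()                       # nothing is scheduled yet
    await asyncio.sleep(0.1)           # ... decide later whether/when to run it ...
    task = asyncio.create_task(coro)   # scheduling only happens here
    print(await task)

asyncio.run(main())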

Also, in

    tasklist = run.all(foo(), foo(), foo())

as soon as you call foo(), you are instantiating a coroutine, which consumes memory, while the task may not even be scheduled for a long time (if you have 5000 potential tasks but only execute 10 at a time, for example).

Yes, but this has the benefit of accepting any awaitable, not just coroutines. You don't have to wonder what to pass, or in which form: it's always the same. Too many APIs are hard to understand because you never know whether they accept a callback, a coroutine function, a coroutine, a task, a future...

For the same reason, requests.get() creates and destroys a session every time. It's inefficient, but way easier to understand, and it fits the majority of use cases.

But if you do as Yuri suggested, you'll instead accept a function reference, foo, which is a singleton; you can have many references to the function, but they will only create coroutine objects when the task is actually about to be scheduled, so it's more efficient in terms of memory.

I ran some tests, and the memory consumption is indeed radically smaller if you just store references, compared to storing the equivalent raw coroutines.
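You can get a rough idea of the difference with tracemalloc (this is just an illustrative measurement, not the exact benchmark I used):

import asyncio
import tracemalloc

async def zzz(seconds):
    await asyncio.sleep(seconds)

tracemalloc.start()

before = tracemalloc.take_snapshot()
coros = [zzz(0.005) for _ in range(10000)]      # 10000 coroutine objects
middle = tracemalloc.take_snapshot()
refs = [(zzz, 0.005) for _ in range(10000)]     # 10000 (callable, args) pairs
after = tracemalloc.take_snapshot()

coro_cost = sum(s.size_diff for s in middle.compare_to(before, 'lineno'))
ref_cost = sum(s.size_diff for s in after.compare_to(middle, 'lineno'))
print(f'coroutine objects: {coro_cost} bytes, plain references: {ref_cost} bytes')

for coro in coros:
    coro.close()    # avoid "coroutine was never awaited" warnings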

However, this is a rare case. It assumes that you create a very large number of potential tasks while only a few of them actually run concurrently.

It's a very specific, narrow case. Also, everything you store on the scope is wrapped into a Future object whether it is scheduled or not, so that you can cancel it later; so the difference in memory consumption is not as large as it seems.

I didn't want to compromise the quality of the current API for the general case for the sake of an edge-case optimization.

On the other hand, this is low-hanging fruit, and on platforms such as the Raspberry Pi, where asyncio has a lot to offer, it can make a big difference to shave up to 20% of the memory consumption of a specific workload.

So I listened and implemented an escape hatch:

import random
import asyncio

import ayo


async def zzz(seconds):
    await asyncio.sleep(seconds)
    print(f'Slept for {seconds} seconds')


@ayo.run_as_main()
async def main(run_in_top):
    async with ayo.scope(max_concurrency=10) as run:
        for _ in range(10000):
            run.from_callable(zzz, 0.005)  # or run.asap(zzz(0.005))

This only lazily creates the awaitable (here, the coroutine) on scheduling. I see a 15% memory saving for the WHOLE program when using from_callable().
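The underlying idea, outside of ayo, boils down to something like this (a simplified illustration of the technique, not ayo's actual implementation): keep cheap (callable, args) pairs around and only create the coroutine objects inside a bounded pool of workers.

import asyncio

async def zzz(seconds):
    await asyncio.sleep(seconds)

async def run_lazily(entries, max_concurrency):
    # entries is a list of (callable, args) pairs; at most max_concurrency
    # coroutine objects for the jobs exist at any given time.
    queue = asyncio.Queue()
    for entry in entries:
        queue.put_nowait(entry)

    async def worker():
        while not queue.empty():
            func, args = queue.get_nowait()
            await func(*args)    # the coroutine object is created only here

    await asyncio.gather(*(worker() for _ in range(max_concurrency)))

asyncio.run(run_lazily([(zzz, (0.005,)) for _ in range(10000)], max_concurrency=10))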

So definitely a good feature to have, thank you.

But again (and I hope Yuri is reading this, because he will implement that for uvloop, and it will trickle down to asyncio), I think we should not compromise the main API for this.

asyncio is hard enough to grok, and too many concepts fly around. The average Python programmer is used to much easier things from their past encounters with Python.

If we want asyncio to one day be considered the clean AND easy way to do async, we need to work on the API.

asyncio.run() is a step in the right direction (although, again, I wish we had implemented it two years ago, when I talked about it, instead of being told no).
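Just to illustrate the kind of simplification it brings over the old boilerplate:

import asyncio

async def main():
    await asyncio.sleep(0.1)

# Before Python 3.7:
loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(main())
finally:
    loop.close()

# Since Python 3.7:
asyncio.run(main())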

Now if we add nurseries, it should hide the rest of the complexity. Not add to it.


