[Python-Dev] A more flexible task creation

Michel Desmoulin desmoulinmichel at gmail.com
Wed Jun 13 16:45:22 EDT 2018


I was working on concurrency-limiting code for asyncio, so that the user may submit as many tasks as they want, but only a maximum number of tasks are submitted to the event loop at the same time.

However, I wanted passing an awaitable to always return a task, whether or not the task was currently scheduled. The goal is that you could add done callbacks to it, decide to force-schedule it, etc.

I dug in the asyncio.Task code, and encountered:

def __init__(self, coro, *, loop=None):
    ...
    self._loop.call_soon(self._step)
    self.__class__._all_tasks.add(self)

I was surprised to see that instantiating a Task has any side effects at all, let alone 2, one of them being that the task is immediately scheduled for execution.
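To make the scheduling side effect concrete, here is a minimal sketch (the names main and work are only illustrative, and it assumes Python 3.7+ for asyncio.run). The coroutine runs even though nothing ever awaits the task:

```python
import asyncio

async def main():
    ran = []

    async def work():
        ran.append("started")

    # Creating the task is enough to schedule it; nothing awaits `task`.
    task = asyncio.ensure_future(work())
    before = list(ran)          # snapshot taken before yielding to the loop
    await asyncio.sleep(0)      # give the event loop one iteration
    return before, list(ran)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The first snapshot is empty and the second one is not, so the only way to keep work() from running is to not create the task at all.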

I couldn't find a clean way to do what I wanted: either you call loop.create_task() and you get a task but it runs, or you don't run anything, but you don't get a nice task object to hold on to.

I tried several alternatives, like returning a future, and binding the future awaiting to the submission of a task, but that was complicated code that duplicated a lot of things.
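For illustration, the future-based alternative might look roughly like the sketch below. This is not my actual code: Limiter, submit, _fill and _finish are hypothetical names, and it assumes Python 3.7+ (asyncio.get_running_loop). The caller gets a plain Future right away, and the real Task is only created once a slot is free:

```python
import asyncio
from collections import deque

class Limiter:
    """Hypothetical helper: at most `limit` coroutines are scheduled at once."""

    def __init__(self, limit):
        self._limit = limit
        self._running = 0
        self._pending = deque()  # (coroutine, future) pairs not yet scheduled

    def submit(self, coro):
        # Must be called while the event loop is running.
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((coro, fut))
        self._fill()
        return fut

    def _fill(self):
        # Turn pending coroutines into real tasks while slots are free.
        while self._running < self._limit and self._pending:
            coro, fut = self._pending.popleft()
            self._running += 1
            task = asyncio.ensure_future(coro)
            task.add_done_callback(lambda t, f=fut: self._finish(t, f))

    def _finish(self, task, fut):
        # Copy the task outcome onto the future handed to the caller.
        self._running -= 1
        if not fut.done():
            if task.cancelled():
                fut.cancel()
            elif task.exception() is not None:
                fut.set_exception(task.exception())
            else:
                fut.set_result(task.result())
        self._fill()
```

The duplication is visible in _finish(): it re-implements result, exception and cancellation propagation that Task already has, and the caller still holds a Future rather than the Task itself.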

I tried creating a custom task, but that was even harder: it meant setting a custom event loop policy to provide a custom event loop with my own create_task() accepting parameters. That's a lot to do just to provide a parameter to Task, especially if you already use a custom event loop (e.g. uvloop). I was expecting to have to create only a task factory, but task factories can't get any additional parameters from create_task().

Additionally, I can't use ensure_future(), as it doesn't allow passing any parameter to the underlying Task, so if I want to accept any awaitable in my signature, I need to provide my own custom ensure_future().
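As a rough sketch of the task factory limitation: loop.set_task_factory() takes a callable invoked as factory(loop, coro), so any extra option has to be fixed when the factory is built rather than chosen per create_task() call. TaggedTask, make_factory and the tag attribute below are made up for the example; the **kwargs catch-all is only there because newer Python versions may forward keyword arguments such as name to the factory.

```python
import asyncio

class TaggedTask(asyncio.Task):
    """Hypothetical Task subclass carrying one extra option."""
    tag = None

def make_factory(tag):
    # The option is baked into the factory; create_task() cannot change it.
    def factory(loop, coro, **kwargs):
        task = TaggedTask(coro, loop=loop, **kwargs)
        task.tag = tag
        return task
    return factory

async def main():
    loop = asyncio.get_running_loop()
    loop.set_task_factory(make_factory("batch-1"))

    async def noop():
        return 1

    task = loop.create_task(noop())  # no way to pass a different tag here
    result = await task
    return type(task).__name__, task.tag, result

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Every task created on that loop gets the same tag, which is why a factory alone does not cover the per-call parameter I am after.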

All those implementations access a lot of _private_api, and do other shady things that linters hate; plus they are fragile at best. What's more, Task being rewritten in C prevents things like setting self._coro, so we can only inherit from the pure Python slow version.

In the end, I can't even await the lazy task, because it blocks the entire program.

Hence I have 2 distinct and independent, albeit related, proposals:

I want to stress that the 2 proposals are independent, so please don't reject both if you don't like one or the other. Passing a parameter to the underlying custom Task is still of value even without the unscheduled instantiation, and vice versa.

Also, if somebody has any idea on how to make a LazyTask that we can await on without blocking everything, I'll take it.


