Make asyncio done callbacks consistent between Futures and Tasks
February 16, 2024, 8:14pm 1
With asyncio tasks, the done callbacks are called by the time they are awaited. However, a future’s callbacks are instead scheduled separately on the event loop with call_soon. So by the time a future has been awaited, its done callback has not been called.
Having used tasks more frequently, I found this really unintuitive. The problem I ran into is an event loop that was starved by many already-complete futures. Even though I’m calling await future, it’s not actually awaiting anything and going back to the event loop, so all those done callbacks kept piling up and eating up memory.
My proposal would be to make futures resolve their callbacks whenever the future is first awaited. That gives the same behavior as tasks, it’s intuitively how we think of await (i.e. all processing, including callbacks, has completed), and it doesn’t have the problem with callbacks piling up.
Simple example of the inconsistency:
import asyncio

async def main():
    # using future
    future = asyncio.get_running_loop().create_future()
    future.add_done_callback(lambda r: print("future done callback"))
    future.set_result("done")
    print("future processed")
    await future
    print("future awaited (1)")
    await future
    print("future awaited (2)")

    # using task
    async def coroutine():
        print("task processed")
    task = asyncio.create_task(coroutine())
    task.add_done_callback(lambda r: print("task done callback"))
    await task
    print("task awaited (1)")
    await task
    print("task awaited (2)")

asyncio.run(main())
Output:
future processed
future awaited (1)
future awaited (2)
future done callback
task processed
task done callback
task awaited (1)
task awaited (2)
elis.byberi (Elis Byberi) February 16, 2024, 10:57pm 2
…the callback is called as soon as the task is done, regardless of whether you have awaited the task or not.
elis.byberi (Elis Byberi) February 17, 2024, 9:39pm 3
So, the task need not be awaited; that’s why it is executed soon after it is created. On the other hand, the future needs to be awaited, and as for the scheduling of the callbacks, I believe it’s the right choice to schedule them to be executed in the event loop’s next iteration. That will help make the program non-blocking.
alicederyn (Alice) February 17, 2024, 10:43pm 4
I’m not sure “we” think of await this way necessarily. I think of callbacks as being cleanup work that should happen soon after the task/future is done but with no guarantee as to absolute ordering. If I wanted to guarantee ordering, I would expect to wrap the task/future in another task/future that guaranteed the “callback” work was done.
Would wrapping solve your issues? Without knowing what you are placing in your callbacks it’s hard to know in the abstract why you are expecting them to work this way.
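As a rough illustration of the wrapping idea, here is a minimal sketch. The helper name with_cleanup and the cleanup lambda are hypothetical, not from this thread; the point is only that work placed after the inner await is guaranteed to have run before the outer await returns, without relying on when the loop gets around to done callbacks.

```python
import asyncio

async def with_cleanup(fut, cleanup):
    """Await `fut`, then run `cleanup` before handing the result back.

    Hypothetical helper: the ordering is guaranteed by ordinary control
    flow rather than by the event loop's callback scheduling.
    """
    try:
        return await fut
    finally:
        cleanup()

async def main():
    events = []
    fut = asyncio.get_running_loop().create_future()
    fut.set_result("done")  # already complete, so awaiting it does not yield

    result = await with_cleanup(fut, lambda: events.append("cleanup"))
    events.append(f"awaited: {result}")
    return events

events = asyncio.run(main())
print(events)
```

Because with_cleanup is awaited directly as a coroutine rather than wrapped in a task, the cleanup runs inline even when the future is already complete.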
Azmisov (Isaac Nygaard) February 18, 2024, 2:41am 5
Alright, it looks like I was mistaken… futures and tasks are consistent after all.
I had assumed a task’s done_callbacks were called in a blocking manner immediately when the task completes. I had observed behavior consistent with that assumption and made a generalization that that was always the case. Then when using futures (which are optimized to not yield to the event loop on await), I observed done_callback being called in the non-blocking call_soon manner. But I did some more precise tests now and I’m seeing tasks are calling their callbacks using call_soon as well (probably using futures internally I guess).
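One way to observe this (a sketch, not necessarily the test that was actually run) is to yield to the loop with asyncio.sleep(0) instead of awaiting the task. After one yield the task body has finished and task.done() is true, yet the done callback is still sitting in the loop's queue; it only runs on the next iteration.

```python
import asyncio

async def main():
    events = []

    async def work():
        events.append("task body finished")

    task = asyncio.create_task(work())
    task.add_done_callback(lambda t: events.append("done callback"))

    # First yield: the loop runs the task body to completion, which
    # schedules the done callback rather than calling it immediately.
    await asyncio.sleep(0)
    events.append(f"task.done()={task.done()}")

    # Second yield: the queued callback gets its turn before we resume.
    await asyncio.sleep(0)
    events.append("after second yield")
    return events

events = asyncio.run(main())
print(events)
```

With the default asyncio event loop, "task.done()=True" is recorded before "done callback", which is the same deferred scheduling seen with a plain future.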
@alicederyn Here was the issue I ran into, essentially something like this:
# data listener
def next_data():
    data = loop.create_future()
    # the callback receives the completed future as its only argument
    data.add_done_callback(free_queued_resources)
    # immediately available? (branch often taken)
    if data_already_queued:
        data.set_result(queued_data.copy())
    else:
        provide_when_available(data)
    return data

# data consumer
while True:
    await next_data()
The await next_data() doesn’t yield to the event loop, since data is already available. So the done callback (freeing resources) is starved, never run, and memory usage keeps increasing. The code was written on the assumption that done_callbacks get called by the time the future is awaited, which is why we get the problem. The code’s already been altered to fix the issue, so no need to troubleshoot. I was mainly posting to mention this behavior was unintuitive and seemed (falsely) to be inconsistent with Task behavior…
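For anyone hitting the same thing, the starvation can be reproduced in a few lines. This is a sketch under assumptions: consume, free_resources, and the yield_each_time flag are made up for illustration, and the explicit asyncio.sleep(0) is just one possible workaround, not necessarily the change that was made to the real code.

```python
import asyncio

async def consume(n, yield_each_time):
    loop = asyncio.get_running_loop()
    freed = 0

    def free_resources(fut):  # hypothetical stand-in for the cleanup callback
        nonlocal freed
        freed += 1

    for _ in range(n):
        fut = loop.create_future()
        fut.add_done_callback(free_resources)
        fut.set_result(None)  # data is "already queued"
        await fut             # already done, so this never yields to the loop
        if yield_each_time:
            await asyncio.sleep(0)  # let the loop run the queued callbacks
    return freed  # number of callbacks that had run by the end of the loop

starved = asyncio.run(consume(1000, yield_each_time=False))
drained = asyncio.run(consume(1000, yield_each_time=True))
print(starved, drained)
```

Without the explicit yield, none of the callbacks have run by the time the loop body finishes; with it, each callback runs before the next future is created. Calling the cleanup directly after the await, instead of registering it as a done callback, would avoid the dependency on loop scheduling altogether.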
alicederyn (Alice) February 18, 2024, 9:48am 6
Thanks for the update!