bpo-46771: Implement asyncio context managers for handling timeouts by asvetlov · Pull Request #31394 · python/cpython

@1st1 Here's a test that fails without nonces:

```python
    async def test_nested_timeouts_concurrent(self):
        with self.assertRaises(TimeoutError):
            with asyncio.timeout(0.002):
                try:
                    with asyncio.timeout(0.003):
                        # Pretend we crunch some numbers.
                        time.sleep(0.005)
                        await asyncio.sleep(1)
                except asyncio.TimeoutError:
                    pass
```

Both timeouts are marked as cancelled, but the inner timeout swallows the cancellation for the outer timeout. So the outer timeout never triggers.

Thinking about what happens here: When we reach the await asyncio.sleep(1), both callbacks run, both try to cancel the task, the inner __exit__() receives CancelledError, uncancels the task, and raises TimeoutError, which is caught by the user code's except clause. Then the outer cancel scope's __exit__() is entered without an exception, and overall the task completes successfully.

The expectation is that the outer cancel scope should also raise TimeoutError, because its deadline is also exceeded. So perhaps the outer __exit__() should check whether its deadline is exceeded, and raise TimeoutError even though it did not see a CancelledError?
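To make that idea concrete, here is a minimal sketch of a scope whose exit checks whether its own deadline callback fired, and raises TimeoutError even when no CancelledError reaches it. `DeadlineScope` and `nested_concurrent` are hypothetical names invented for illustration; this is not the `asyncio.timeout()` implementation from the PR, and it leaves out the uncancel bookkeeping discussed in this thread.

```python
import asyncio
import time


class DeadlineScope:
    """Hypothetical timeout scope; not the asyncio.timeout() from this PR."""

    def __init__(self, delay):
        self._delay = delay
        self._fired = False

    def _on_deadline(self):
        # Remember that our own deadline passed, then request cancellation.
        self._fired = True
        self._task.cancel()

    async def __aenter__(self):
        loop = asyncio.get_running_loop()
        self._task = asyncio.current_task()
        self._handle = loop.call_later(self._delay, self._on_deadline)
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._handle.cancel()
        if not self._fired:
            return False
        if exc_type is None or issubclass(exc_type, asyncio.CancelledError):
            # Raise even if an inner scope already consumed the CancelledError.
            raise TimeoutError from exc
        return False


async def nested_concurrent():
    try:
        async with DeadlineScope(0.01):
            try:
                async with DeadlineScope(0.02):
                    time.sleep(0.05)        # block past both deadlines
                    await asyncio.sleep(1)  # both callbacks run here
            except TimeoutError:
                pass                        # inner timeout handled by user code
    except TimeoutError:
        return "outer timed out"
    return "outer finished normally"


if __name__ == "__main__":
    print(asyncio.run(nested_concurrent()))  # outer timed out
```

The check only takes effect when the outer scope exits. Any further await placed inside the outer scope after the inner try/except would still run to completion before the TimeoutError is raised.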

But it looks like there's a variant of the example that isn't fixed by such a check -- for example, we could add a second await after the try/except block (or even in the except clause), inside the outer cancel scope. Since the CancelledError exception has been wholly swallowed by the inner __exit__(), this second await is not interrupted.

I'm guessing the fix for that would be to use a nonce (either using an explicit nonce API or via the cancel message). However, that still depends on the order in which the callbacks run. With the new cancel semantics (where extra cancels are ignored) whoever runs first wins, while with the old cancel semantics (where the most recent cancel gets to set the cancel message / nonce) whoever runs last wins -- but we don't want to rely on the order in which the callbacks run (since I have no idea in which order the current implementation runs them, given that they both become "ready" simultaneously, and we shouldn't depend on that). So that's why Tin is proposing to use timestamps (or a cancel stack, or some other mechanism that lets us arbitrate between the two cancel calls).
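For reference, the "nonce via the cancel message" idea could look roughly like the sketch below for a single scope. The function name `cancel_with_nonce` and the use of a bare `object()` as the nonce are assumptions made for illustration. The sketch only shows how one scope can recognise its own cancel request; it does not arbitrate between two scopes whose callbacks become ready at the same time.

```python
import asyncio


async def cancel_with_nonce():
    """Cancel the current task with a unique message and check it comes back."""
    loop = asyncio.get_running_loop()
    task = asyncio.current_task()
    nonce = object()  # hypothetical per-scope marker

    # Schedule a cancel request that carries the nonce as the cancel message.
    handle = loop.call_later(0.01, lambda: task.cancel(msg=nonce))
    try:
        await asyncio.sleep(1)
    except asyncio.CancelledError as exc:
        # The cancel message arrives as the exception's sole argument.
        return exc.args == (nonce,)
    finally:
        handle.cancel()
    return False


if __name__ == "__main__":
    print(asyncio.run(cancel_with_nonce()))
```

With two scopes each passing their own nonce, which message the task ends up seeing depends on the cancel semantics and on callback order, which is exactly the dependency described above.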

Note that this case is slightly different from the "web server cancels" problem, which was solvable without resorting to timestamps (the cancel scope callback would have to check whether the task was already being cancelled, using t.cancelling()).

I'm not sure yet what the right solution would be, but I'm glad that we're thinking about this scenario carefully.