msg355223 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-10-23 14:44 |
AMD64 FreeBSD Shared 3.x: https://buildbot.python.org/all/#/builders/371/builds/7 This buildbot worker is known to be very slow. test_run_coroutine_threadsafe_with_timeout (test.test_asyncio.test_tasks.RunCoroutineThreadsafeTests) Test coroutine submission from a thread to an event loop ... FAIL FAIL: test_run_coroutine_threadsafe_with_timeout (test.test_asyncio.test_tasks.RunCoroutineThreadsafeTests) Test coroutine submission from a thread to an event loop ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/test/test_asyncio/test_tasks.py", line 3210, in test_run_coroutine_threadsafe_with_timeout self.loop.run_until_complete(future) AssertionError: TimeoutError not raised |
|
|
msg355227 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-10-23 15:01 |
test_run_coroutine_threadsafe_with_timeout() has a race condition. It relies on "await asyncio.sleep(0.05)" synchronization primitive. That's not reliable. async def add(self, a, b, fail=False, cancel=False): """Wait 0.05 second and return a + b.""" await asyncio.sleep(0.05) if fail: raise RuntimeError("Fail!") if cancel: asyncio.current_task(self.loop).cancel() await asyncio.sleep(0) return a + b |
|
|
msg355228 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-10-23 15:03 |
I can reproduce the issue with this patch: diff --git a/Lib/test/test_asyncio/test_tasks.py b/Lib/test/test_asyncio/test_tasks.py index dde84b84b1..c94113712a 100644 --- a/Lib/test/test_asyncio/test_tasks.py +++ b/Lib/test/test_asyncio/test_tasks.py @@ -3160,7 +3160,7 @@ class RunCoroutineThreadsafeTests(test_utils.TestCase): async def add(self, a, b, fail=False, cancel=False): """Wait 0.05 second and return a + b.""" - await asyncio.sleep(0.05) + await asyncio.sleep(1e-9) if fail: raise RuntimeError("Fail!") if cancel: and the command: ./python -m test test_asyncio -m test_run_coroutine_threadsafe_with_timeout -v -F |
|
|
msg355360 - (view) |
Author: Kyle Stanley (aeros) *  |
Date: 2019-10-25 11:53 |
I can confirm Victor's method of reproducing the failure consistently, by using asyncio.sleep(1e-9) within `RunCoroutineThreadsafeTests.add()` instead of the current asyncio.sleep(0.05). I also experimented with adjusting the sleep time, to figure out the "breaking point" where I could no longer run `./python -m test test_asyncio -m test_run_coroutine_threadsafe_with_timeout -v -F` without failures (within ~10,000 tests). From my results, the lowest reliable value was 0.001. At 1e-4, I was able to consistently reproduce the failure reported above: FAIL: test_run_coroutine_threadsafe_with_timeout (test.test_asyncio.test_tasks.RunCoroutineThreadsafeTests) Test coroutine submission from a thread to an event loop ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/aeros/repos/cpython/Lib/test/test_asyncio/test_tasks.py", line 3210, in test_run_coroutine_threadsafe_with_timeout self.loop.run_until_complete(future) AssertionError: TimeoutError not raised The failure becomes increasingly consistent with lowered time as expected, but at 1e-5 I was able to repeatedly reproduce the failure within 10 iterations of the test. At 1e-4 it took around 3000 iterations before failing (across multiple runs). |
|
|
msg355361 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-10-25 12:15 |
IHMO a test must not depend on time. |
|
|
msg355365 - (view) |
Author: Kyle Stanley (aeros) *  |
Date: 2019-10-25 12:53 |
> IHMO a test must not depend on time. I would agree, the tests should optimally not depend on time. My above comment was primarily to confirm that the failure was able to be consistently reproduced, along with the minimum conditions to do so. Yury had been the one to change it to "asyncio.sleep(0.05)" from the previous "asyncio.sleep(1)" in the following commit: https://github.com/python/cpython/commit/abe9625eeb71e40f042ccfccfe6a4489a6dcdf35 (Nov 13, 2015). Perhaps he might have some insight or ideas as to how we could improve the test to use a more reliable means of synchronization that has been implemented since that change was made. I'll add him to the nosy list. |
|
|