Issue 36116: test_multiprocessing_spawn fails on AMD64 Windows8 3.x (original) (raw)

Created on 2019-02-26 07:13 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (12)
msg336625 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-26 07:13
test test_multiprocessing_spawn failed test_import (test.test_multiprocessing_spawn._TestImportStar) ... ok ====================================================================== FAIL: test_mymanager_context (test.test_multiprocessing_spawn.WithManagerTestMyManager) ---------------------------------------------------------------------- Traceback (most recent call last): File "D:\buildarea\3.x.bolen-windows8\build\lib\test\_test_multiprocessing.py", line 2747, in test_mymanager_context self.assertIn(manager._process.exitcode, (0, -signal.SIGTERM)) AssertionError: 3221225477 not found in (0, -15) ---------------------------------------------------------------------- Ran 344 tests in 328.196s FAILED (failures=1, skipped=40) 1 test failed again: test_multiprocessing_spawn == Tests result: FAILURE then FAILURE == https://buildbot.python.org/all/#/builders/32/builds/2204/steps/3/logs/stdio
msg336626 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-26 07:21
It also fails on Windows 7: https://buildbot.python.org/all/#/builders/58/builds/1983/steps/3/logs/stdio
msg336627 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-26 07:28
It seems that return code 3221225477 in Windows is: #define STATUS_ACCESS_VIOLATION ((NTSTATUS)0xC0000005L) so this is a segfault in the manager.
msg336774 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-27 17:30
Adding Ɓukasz, as I think this is a release blocker (the Windows 8 and 7 multiprocessing module may be causing segfaults).
msg336778 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-02-27 19:42
It's also possible that the child process is causing the segfault because of misconfiguration (e.g. broken environment variables). And depending on the OS, abort() calls (via Py_FatalError) sometimes appear to be segfaults, so it could be any number of issues. (Aside - I'd love to replace the abort() calls with specific exit codes for configuration errors - they really mess up the crash data we see on Windows.) I'll try some tests locally to see if this is reproducible, but if anyone can extract the original stdout/stderr from the buildbot, that would be helpful.
msg336782 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-02-27 22:19
> And depending on the OS, abort() calls (via Py_FatalError) sometimes > appear to be segfaults, so it could be any number of issues. > (Aside - I'd love to replace the abort() calls with specific exit > codes for configuration errors - they really mess up the crash data > we see on Windows.) In particular, with the Universal CRT, an unhandled abort() is implemented by a __fastfail intrinsic [1] (int 0x29 instruction in x86) with the argument FAST_FAIL_FATAL_APP_EXIT (7). Prior to Windows 8 this appears as an access violation. In Windows 8+ it's implemented as a second-chance STATUS_STACK_BUFFER_OVERRUN (0xC0000409) exception, which is overloaded from its previous use to support failure codes. (The old usage appears as the failure code FAST_FAIL_LEGACY_GS_VIOLATION, defined to be 0.) It starts as a second-chance exception in order to bypass normal exception handling (i.e. SEH, VEH, UnhandledExceptionFilter). The second-chance exception event is sent to an attached debugger and/or the session server (csrss.exe). Python's normal signal handling for SIGABRT can't prevent this, since the C handler just sets a flag and returns. But enabling faulthandler sets a C signal handler that restores the previous handler and calls raise(SIGABRT). The default SIGABRT handler for the explicit raise() code path simply calls _exit(3). Alternatively, we could prevent the __fastfail call via _set_abort_behavior [2], if implemented in msvcrt. For example: msvcrt.set_abort_behavior(0, msvcrt.CALL_REPORTFAULT). [1]: https://docs.microsoft.com/en-us/cpp/intrinsics/fastfail [2]: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/set-abort-behavior
msg336928 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-01 16:29
https://buildbot.python.org/all/#/builders/32/builds/2219 FAIL: test_mymanager_context_prestarted (test.test_multiprocessing_spawn.WithManagerTestMyManager) Re-running failed tests in verbose mode Re-running test 'test_multiprocessing_spawn' in verbose mode FAIL: test_mymanager_context_prestarted (test.test_multiprocessing_spawn.WithManagerTestMyManager)
msg337039 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-03 17:54
See also https://bugs.python.org/issue36114
msg337072 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:13
> It's also possible that the child process is causing the segfault because of misconfiguration (e.g. broken environment variables). Maybe, but the test also produces core dump on FreeBSD: bpo-36114. It looks more like a real bug. I set the priority again to release blocker to not forget this regression.
msg337077 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:41
test_mymanager and test_mymanager_context of test_multiprocessing_spawn.WithManagerTestMyManager failed in this build: > https://buildbot.python.org/all/#/builders/58/builds/1983/steps/3/logs/stdio ERROR: test_multiprocessing (test.test_venv.BasicTest) FAIL: test_async_gen_asyncio_gc_aclose_09 (test.test_asyncgen.AsyncGenAsyncioTest) FAIL: test_daemon_threads_shutdown_stderr_deadlock (test.test_io.CMiscIOTest) self.assertIn("Fatal Python error: could not acquire lock " AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name=''> at interpreter shutdown, possibly due to daemon threads" not found in (...) FAIL: test_daemon_threads_shutdown_stdout_deadlock (test.test_io.CMiscIOTest) self.assertIn("Fatal Python error: could not acquire lock " AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name=''> at interpreter shutdown, possibly due to daemon threads" not found in '' Re-running failed tests in verbose mode Re-running test 'test_venv' in verbose mode ERROR: test_multiprocessing (test.test_venv.BasicTest) Re-running test 'test_asyncgen' in verbose mode Re-running test 'test_multiprocessing_spawn' in verbose mode FAIL: test_mymanager (test.test_multiprocessing_spawn.WithManagerTestMyManager) FAIL: test_mymanager_context (test.test_multiprocessing_spawn.WithManagerTestMyManager) Re-running test 'test_io' in verbose mode FAIL: test_daemon_threads_shutdown_stderr_deadlock (test.test_io.CMiscIOTest) self.assertIn("Fatal Python error: could not acquire lock " AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name=''> at interpreter shutdown, possibly due to daemon threads" not found in (...) FAIL: test_daemon_threads_shutdown_stdout_deadlock (test.test_io.CMiscIOTest) self.assertIn("Fatal Python error: could not acquire lock " AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name=''> at interpreter shutdown, possibly due to daemon threads" not found in ''
msg337221 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-05 16:16
This is resolved with gh-12159, no?
msg337265 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-06 01:37
> This is resolved with gh-12159, no? I was waiting to see if buildbot workers feel better. It's the case, so I close the issue.
History
Date User Action Args
2022-04-11 14:59:11 admin set github: 80297
2019-03-06 01:37:25 vstinner set status: open -> closedpriority: release blocker -> normalmessages: + resolution: fixedstage: resolved
2019-03-05 16:16:39 eric.snow set nosy: + eric.snowmessages: +
2019-03-04 08:41:01 vstinner set messages: +
2019-03-04 08:13:27 vstinner set priority: high -> release blockermessages: +
2019-03-03 17:54:44 pablogsal set messages: +
2019-03-01 16:29:58 vstinner set messages: +
2019-02-27 22:19:58 eryksun set nosy: + eryksunmessages: +
2019-02-27 19:42:44 steve.dower set messages: +
2019-02-27 17:30:02 pablogsal set priority: normal -> highnosy: + lukasz.langamessages: +
2019-02-26 12:26:22 vstinner set nosy: + vstinner
2019-02-26 07:28:34 pablogsal set nosy: + pitroumessages: +
2019-02-26 07:21:47 pablogsal set messages: +
2019-02-26 07:13:43 pablogsal create