Issue 34771: test_ctypes failing on Linux SPARC64

Python 3.6.6 on Linux 4.16.18 SPARC64 fails test_ctypes. Specifically, the failure appears to come from _testfunc_large_struct_update_value() or _testfunc_reg_struct_update_value():

0:00:44 load avg: 46.24 [137/407/1] test_ctypes failed -- running: test_socket (44 sec), test_subprocess (35 sec), test_venv (43 sec), test_normalization (43 sec), test_signal (43 sec), test_multiprocessing_spawn (43 sec), test_concurrent_futures (43 sec), test_email (34 sec), test_cmd_line_script (43 sec), test_tools (43 sec), test_pickletools (34 sec), test_zipfile (30 sec), test_multiprocessing_fork (33 sec), test_pyclbr (31 sec), test_math (42 sec), test_calendar (35 sec), test_datetime (33 sec), test_distutils (30 sec)
test test_ctypes failed -- Traceback (most recent call last):
  File "/usr/src/dist/new/Python-3.6.6/Lib/ctypes/test/test_structures.py", line 416, in test_pass_by_value
    self.assertEqual(s.first, 0xdeadbeef)
AssertionError: 195948557 != 3735928559

Obviously, the "0xbadf00d" field setting (195948557 in decimal, the value the test got back) is propagating back up through something that's supposed to be passed by value, and the test case quite rightly picks up on it. I suspect this bug exists in 2.7.15 as well (2.7 just doesn't have the test case to catch it).

This is built with gcc-8.2.0, glibc-2.27, kernel 4.16.18, CFLAGS="-O1 -mcpu=v9 -mtune=v9". (FYI I had to turn down optimization to resolve another test failure, hence the "-O1".)

I'm guessing SPARC64 calling conventions are still passing certain large values by reference, and libffi isn't dealing with this? I'm still investigating. I've tried it with and without --with-system-libffi, with no difference (my system libffi is 3.2.1).

Well, after perusing the ctypes callproc.c code, I found the hacks referenced by martin.panter and tried activating them with some SPARC64 #ifdefs:

--- python3.6-3.6.6.orig/Modules/_ctypes/callproc.c
+++ python3.6-3.6.6/Modules/_ctypes/callproc.c
@@ -1041,6 +1041,7 @@ GetComError(HRESULT errcode, GUID *riid,
 #endif

 #if (defined(__x86_64__) && (defined(__MINGW64__) || defined(__CYGWIN__))) || \

 #define CTYPES_PASS_BY_REF_HACK
 #define POW2(x) (((x & ~(x - 1)) == x) ? x : 0)

This is based on #ifdef checks in libffi, but somewhat more generalized. The good news is, this appears to resolve all test_ctypes failures. So I'm guessing this is necessary on Linux/SPARC64, though I can't tell if it's necessary for Solaris/SPARC64. I don't even know what built-in compiler defines get turned on for Solaris, though someone else might.

It might also be advisable to backport this to Python 2.7, but obviously we should also backport the additional ctypes tests if we do that.

My biggest concern is, do these hacks have a purely performance-centric impact, or do they potentially degrade functionality as well?