Issue 3900: ctypes: wrong calling convention for _string_at
Our application server, running on top of Twisted, crashes 1 to 3 times per day. It uses a ctypes binding for libnetfilter_conntrack (to dump the Linux conntrack table), which runs in a dedicated thread. Our environment:
- Python 2.5.2
- Twisted 8.1.0-3
- Linux 2.6.26-1-amd64 SMP x86_64
The crash does not occur in the "ctypes" thread but in the main thread (another CPython thread). The backtrace is incoherent, which points to a multithreading problem. So I used Helgrind (a Valgrind tool) to watch for invalid memory accesses, and here is one:
==30545== Possible data race during write of size 4 at 0x4EC1E60
==30545==    at 0x808F616: PyString_FromStringAndSize (stringobject.c:78)
==30545==    by 0x4D3CBD9: string_at (_ctypes.c:4568)
==30545==    by 0x4D4654E: ffi_call_SYSV (sysv.S:60)
==30545==    by 0x4D46396: ffi_call (ffi.c:221)
==30545==    by 0x4D3E9F1: _call_function_pointer (callproc.c:668)
==30545==    by 0x4D3F147: _CallProc (callproc.c:991)
==30545==    by 0x4D3B0DA: CFuncPtr_call (_ctypes.c:3373)
==30545==    by 0x8060E0A: PyObject_Call (abstract.c:1861)
==30545==    by 0x80CB391: do_call (ceval.c:3784)
==30545==    by 0x80CAD69: call_function (ceval.c:3596)
==30545==    by 0x80C7B6F: PyEval_EvalFrameEx (ceval.c:2272)
==30545==    by 0x80C9329: PyEval_EvalCodeEx (ceval.c:2836)
==30545==  Old state: shared-readonly by threads #1, #4
==30545==  New state: shared-modified by threads #1, #4
==30545==  Reason: this thread, #1, holds no consistent locks
==30545==  Location 0x4EC1E60 has never been protected by any lock
In _CallProc(), the test ((flags & FUNCFLAG_PYTHONAPI) == 0) is true, which means that the GIL is released around the call. That is a bug because, as the trace above shows, string_at() calls PyString_FromStringAndSize(), which requires the GIL!
Finally, the bug comes from the ctypes module, not _ctypes: ctypes just uses the wrong calling convention for _string_at. Using PYFUNCTYPE() instead of CFUNCTYPE(), the Helgrind warning goes away ;-)
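To illustrate the difference, here is a minimal sketch, not the actual patch. It assumes Python 3, where the C function is named PyBytes_FromStringAndSize rather than PyString_FromStringAndSize as in the Python 2.5 trace. The names proto and make_bytes are made up for the example. A prototype built with PYFUNCTYPE keeps the GIL held during the call, which is what a function that creates Python objects needs:

```python
import ctypes

# PYFUNCTYPE prototypes do NOT release the GIL around the foreign call,
# so they are safe for C functions that use the Python C API.
# CFUNCTYPE prototypes release the GIL, which is only safe for plain C code.
# (proto and make_bytes are hypothetical names used for this sketch.)
proto = ctypes.PYFUNCTYPE(
    ctypes.py_object,    # result is a Python object
    ctypes.c_char_p,     # pointer to the source bytes
    ctypes.c_ssize_t,    # number of bytes to copy
)

# Bind the prototype to a Python C API function exported by the interpreter.
# Python 3 name; the Python 2 equivalent is PyString_FromStringAndSize.
make_bytes = proto(("PyBytes_FromStringAndSize", ctypes.pythonapi))

result = make_bytes(b"conntrack", 4)
print(result)

# The public wrapper ctypes.string_at() does the same job from Python code.
buf = ctypes.create_string_buffer(b"conntrack")
print(ctypes.string_at(buf, 4))
```

Swapping PYFUNCTYPE for CFUNCTYPE in this sketch would recreate the situation reported here: an object would be allocated while the GIL is released, and another thread could corrupt the interpreter state at the same time.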
Note about Helgrind: This tool really rocks!!!