The PyMem_Raw* APIs should be thread safe. In debug mode, however, the debug hooks increment a global variable, *serialno*, and that increment is not thread safe.
I looked at _Py_atomic_address to make "serialno++" atomic, but we don't have atomic_fetch_add(). We could implement it using a loop and atomic_compare_exchange_strong()... but we don't have atomic_compare_exchange_strong() either.

I tried to add a mutex, but there are some practical issues:

* bpo-35388: question about calling Py_Initialize() / Py_Finalize() multiple times
* I modified _PyRuntimeState_Init() to initialize the lock. _PyRuntimeState_Init() calls PyThread_acquire_lock(), which calls PyMem_RawMalloc(). Problem: PyMem_RawMalloc() requires the lock. I worked around the issue using "if (_PyRuntime.mem.mutex != NULL) {".
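For illustration, the compare-exchange loop mentioned above could look roughly like the sketch below. It assumes a C11 compiler with <stdatomic.h>, which is not what CPython's internal atomic helpers offered at the time. The function name serialno_next() is hypothetical, and the static counter is only a stand-in for the real global.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for the debug hooks' global counter (assumption: the real
   variable is a plain size_t, not a C11 atomic). */
static _Atomic size_t serialno = 0;

/* Hypothetical helper: emulate fetch-and-add with a compare-exchange
   loop and return the incremented value, like "++serialno". */
static size_t
serialno_next(void)
{
    size_t old = atomic_load(&serialno);

    /* If another thread changed serialno in the meantime, the call
       fails and stores the current value into 'old', so we retry
       with the fresh value. */
    while (!atomic_compare_exchange_strong(&serialno, &old, old + 1)) {
        /* retry */
    }
    return old + 1;
}
```

With C11 atomics available, atomic_fetch_add(&serialno, 1) would do the same thing in a single call; the loop only matters when compare-exchange is the sole primitive.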
See also bpo-35265: "Internal C API: pass the memory allocator in a context". The most complex part is the Python initialization, which changes the memory allocator *and* allocates memory. We have to remember which allocator was used to allocate data, to be able to release the memory at exit.
Python 3.8 has been fixed. I disabled the serialno field by default: PYMEM_DEBUG_SERIALNO is not defined by default. You have to opt in to get this bug :-) I don't see any easy fix for older Python versions. I'm closing the issue.
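As a rough sketch of what "opt-in" means here: the counter only exists when the macro is defined at build time (for example with -DPYMEM_DEBUG_SERIALNO). The helper name record_serialno() is hypothetical and the code is a simplified illustration, not the actual obmalloc.c source.

```c
#include <stddef.h>

#ifdef PYMEM_DEBUG_SERIALNO
/* Only compiled when the macro is defined. The increment below is
   still unsynchronized, so opting in brings the race back. */
static size_t serialno = 0;
#endif

/* Hypothetical helper: return the serial number to record for a new
   block, or 0 when the feature is compiled out. */
static size_t
record_serialno(void)
{
#ifdef PYMEM_DEBUG_SERIALNO
    return ++serialno;
#else
    return 0;
#endif
}
```

In a default build the shared counter is never touched, so there is nothing left to race on.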
title: PyMem_Raw* API in debug mode are not thread safe -> Debug hooks on memory allocators are not thread safe (serialno variable)
versions: + Python 2.7, Python 3.8