Issue 26162: thread error - Python tracker

A memory error at 845 threads is no surprise if you're using 32-bit Python, which is limited to 2 GiB of user-mode address space. python.exe is built with a default stack reservation of 2000000 bytes, so 845 threads reserve a total of about 1.57 GiB. Address space is also consumed by mapped DLLs and files, page tables, private data, the process heap(s), and additional worker threads. On 64-bit Windows, each 32-bit thread also has a 64-bit stack. That stack reserves 256 KiB, which adds about 0.2 GiB for 845 threads. Finally, fragmentation leaves free blocks that are too small to use, which reduces the available space further.
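The reservation figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the constant names and the `reserved_gib` helper are made up for illustration, and the per-thread sizes are the ones quoted in this comment, not values queried from the system.

```python
GIB = 1024 ** 3

# Figures taken from the paragraph above (not measured here).
DEFAULT_RESERVE = 2000000      # bytes reserved per thread stack by python.exe
WOW64_RESERVE = 256 * 1024     # bytes reserved per 64-bit stack on 64-bit Windows
THREADS = 845

def reserved_gib(threads, bytes_per_thread):
    """Return the total stack reservation in GiB."""
    return threads * bytes_per_thread / float(GIB)

print(round(reserved_gib(THREADS, DEFAULT_RESERVE), 2))  # 1.57
print(round(reserved_gib(THREADS, WOW64_RESERVE), 2))    # 0.21
```

Together that is close to 1.8 GiB of a 2 GiB address space before anything else is mapped.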

Why do you need so many threads instead of using a thread pool? If you really need that many threads and can't switch to 64-bit Python, you'll have to step outside what Python's built-in threading support offers. The following demonstrates calling Windows CreateThread via ctypes. I recommend against doing something like this.

import ctypes
from ctypes import wintypes

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

CREATE_SUSPENDED = 0x00000004
STACK_SIZE_PARAM_IS_A_RESERVATION = 0x00010000

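def check_handle(result, func, args):
    # A NULL handle is converted to None and means the call failed.
    if result is None:
        raise ctypes.WinError(ctypes.get_last_error())
    # Returning args unchanged makes ctypes return the result as usual.
    return args

def check_bool(result, func, args):
    # A zero (FALSE) return value means the call failed.
    if not result:
        raise ctypes.WinError(ctypes.get_last_error())
    return None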

wintypes.LPDWORD = ctypes.POINTER(wintypes.DWORD)
wintypes.SIZE_T = ctypes.c_size_t
LPSECURITY_ATTRIBUTES = ctypes.c_void_p
LPTHREAD_START_ROUTINE = ctypes.WINFUNCTYPE(wintypes.DWORD,
                                            wintypes.LPVOID)

kernel32.CreateThread.errcheck = check_handle
kernel32.CreateThread.restype = wintypes.HANDLE
kernel32.CreateThread.argtypes = (
    LPSECURITY_ATTRIBUTES,  # _In_opt_  lpThreadAttributes
    wintypes.SIZE_T,        # _In_      dwStackSize
    LPTHREAD_START_ROUTINE, # _In_      lpStartAddress
    wintypes.LPVOID,        # _In_opt_  lpParameter
    wintypes.DWORD,         # _In_      dwCreationFlags
    wintypes.LPDWORD,       # _Out_opt_ lpThreadId
)

kernel32.CloseHandle.errcheck = check_bool
kernel32.CloseHandle.argtypes = (wintypes.HANDLE,)

def CreateThread(lpStartAddress,
                 lpParameter=None,
                 dwStackSize=0,
                 dwCreationFlags=STACK_SIZE_PARAM_IS_A_RESERVATION,
                 lpThreadAttributes=None,
                 close_handle=True):
    tid = (wintypes.DWORD * 1)()
    h = kernel32.CreateThread(lpThreadAttributes,
                              dwStackSize,
                              lpStartAddress,
                              lpParameter,
                              dwCreationFlags,
                              tid)
    if close_handle:
        kernel32.CloseHandle(h)
        return tid[0]
    return h, tid[0]

For example, the following creates 2000 worker threads that each reserve 256 KiB for the stack. Since I'm running 32-bit Python on 64-bit Windows, each thread has a corresponding 64-bit thread stack that also reserves 256 KiB. In total this reserves about 1 GiB of stack space (2000 threads at 512 KiB each), which I confirmed using Sysinternals VMMap.

import time

@LPTHREAD_START_ROUTINE
def worker(param):
    if param is None:
        param = 0
    # do some work
    time.sleep(60)
    return 0

tids = []
for i in range(2000):
    tid = CreateThread(worker, i, 256*1024)
    tids.append(tid)
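For comparison, here is a minimal sketch of the thread-pool alternative suggested earlier. It assumes Python 3.2 or later, where `concurrent.futures` is in the standard library. The `work` and `run_all` names and the pool size of 32 are hypothetical placeholders, not anything from the original report.

```python
from concurrent.futures import ThreadPoolExecutor

def work(item):
    # Hypothetical stand-in for the real per-item job.
    return item * 2

def run_all(items, max_workers=32):
    # A fixed number of threads services every item, so the stack
    # reservation stays bounded no matter how many items there are.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(work, items))

results = run_all(range(2000))
```

With 32 workers at the default 2000000-byte reservation, the pool reserves roughly 61 MiB of stack instead of well over 1 GiB.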