[Python-Dev] Python initialization and embedded Python

Victor Stinner victor.stinner at gmail.com
Fri Nov 17 19:01:47 EST 2017


Hi,

The CPython internals evolved during the Python 3.7 cycle. I would like to know whether we broke the C API.

Nick Coghlan and Eric Snow are working on cleaning up the Python initialization with the ongoing PEP 432: https://www.python.org/dev/peps/pep-0432/

Many global variables used by the "Python runtime" were moved into a single new "_PyRuntime" variable (a big structure made of sub-structures). See Include/internal/pystate.h.

A side effect of moving variables from individual .c files into header files is that it's no longer possible to fully initialize _PyRuntime at "compilation time". For example, previously it was possible to refer to local C functions (functions declared with "static", so only visible in the current file) in a static initializer. Now a new "initialization function" must be called.

In short, it means that using the "Python runtime" before it's initialized by _PyRuntime_Initialize() is now likely to crash. For example, calling PyMem_RawMalloc() before calling _PyRuntime_Initialize() now calls through a NULL function pointer, which immediately crashes with a segmentation fault.
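To illustrate the failure mode, here is a minimal standalone sketch. It does not use CPython's real structures: toy_runtime, toy_rt, toy_runtime_initialize() and the toy_raw_* functions are all hypothetical names invented for this example. It shows a zero-initialized global whose function pointers stay NULL until an explicit initialization function runs, plus a checked variant that fails cleanly instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a runtime structure; NOT CPython's real layout. */
typedef struct {
    int initialized;
    void *(*raw_malloc)(size_t size);  /* NULL until toy_runtime_initialize() */
    void (*raw_free)(void *ptr);       /* NULL until toy_runtime_initialize() */
} toy_runtime;

/* Static storage duration: every field starts out as zero/NULL. */
static toy_runtime toy_rt;

/* In the situation described above, functions like these are "static" in
   another .c file, so a header cannot name them in a static initializer. */
static void *default_malloc(size_t size) { return malloc(size ? size : 1); }
static void default_free(void *ptr) { free(ptr); }

/* Fill in the function pointers at run time; safe to call more than once. */
void toy_runtime_initialize(void)
{
    if (toy_rt.initialized) {
        return;
    }
    toy_rt.raw_malloc = default_malloc;
    toy_rt.raw_free = default_free;
    toy_rt.initialized = 1;
}

/* Unchecked: before initialization this calls through a NULL pointer,
   which is undefined behavior and typically a segmentation fault. */
void *toy_raw_malloc_unchecked(size_t size)
{
    return toy_rt.raw_malloc(size);
}

/* Checked: reports failure by returning NULL if the runtime is not ready. */
void *toy_raw_malloc(size_t size)
{
    if (!toy_rt.initialized) {
        return NULL;
    }
    return toy_rt.raw_malloc(size);
}

/* Aborts with a diagnostic (in debug builds) if called before initialization. */
void toy_raw_free(void *ptr)
{
    assert(toy_rt.initialized);
    toy_rt.raw_free(ptr);
}
```

With this toy, toy_raw_malloc(16) returns NULL until toy_runtime_initialize() has been called and a usable pointer afterwards, whereas toy_raw_malloc_unchecked(16) called first would jump to address NULL.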

I'm writing this email to ask whether this change is an issue for embedded Python and the Python C API. Is it still possible to call "all" functions of the C API before calling Py_Initialize()?

I was bitten by the bug while reworking the Py_Main() function to split it into subfunctions and clean up the code that handles the command line arguments and environment variables. I fixed the issue in main() by calling _PyRuntime_Initialize() as soon as possible: it's now the first instruction of main() :-) (See Programs/python.c)

To give a more concrete example: Py_DecodeLocale() is the recommended function to decode bytes from the operating system, but this function calls PyMem_RawMalloc(), which crashes if _PyRuntime_Initialize() has not been called yet. Is Py_DecodeLocale() used to initialize Python?

For example, "void Py_SetProgramName(wchar_t *);" expects a text string, whereas main() gives argv as bytes. Calling Py_SetProgramName() with a value from argv requires decoding bytes... so you use Py_DecodeLocale()...
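To make the dependency concrete, here is a rough standalone sketch of what "decode argv bytes to wchar_t" involves. toy_decode_locale() is a hypothetical helper, not Py_DecodeLocale(): it uses plain mbstowcs() and malloc() and ignores the error handling the real function implements. The point is only that decoding has to allocate the result, so the decoder depends on a working allocator before anything else is set up.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: decode a byte string using the current C locale.
   Returns a newly allocated wide string (release it with free()), or NULL
   on decoding or allocation failure. */
wchar_t *toy_decode_locale(const char *arg)
{
    assert(arg != NULL);

    /* First pass: ask how many wide characters are needed (POSIX behavior
       of mbstowcs() when the destination is NULL). */
    size_t len = mbstowcs(NULL, arg, 0);
    if (len == (size_t)-1) {
        return NULL;  /* invalid byte sequence for this locale */
    }

    /* The allocation step: this is where an uninitialized allocator would
       bite in the scenario described above. */
    wchar_t *out = malloc((len + 1) * sizeof(wchar_t));
    if (out == NULL) {
        return NULL;
    }

    /* Second pass: do the actual conversion, including the terminator. */
    if (mbstowcs(out, arg, len + 1) == (size_t)-1) {
        free(out);
        return NULL;
    }
    return out;
}
```

An embedding program would call something like this on argv[0] before the interpreter is initialized, which is exactly the window where the allocator may not be ready.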

Should we do something in Py_DecodeLocale()? Maybe fail with a clear fatal error if _PyRuntime_Initialize() wasn't called yet?

Maybe the minimum change is to expose _PyRuntime_Initialize() in the public C API?

Victor


