bpo-32124: Document C functions safe before init by vstinner · Pull Request #4540 · python/cpython (original) (raw)

vstinner

@vstinner

Explicitly document C functions and C variables that can be set before Py_Initialize().

@vstinner

Change also how functions are listed: one per line.

Mariatta

* :c:func:`PyMem_RawMalloc`
* :c:func:`PyMem_RawRealloc`
:c:func:`PyMem_RawCalloc`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a * in the beginning, like the rest?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, fixed

serhiy-storchaka

* :c:func:`PyObject_Calloc`
* :c:func:`PyObject_Free`
The following flags can also be set:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not this the all flags?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the text.

be called before using any other Python/C API functions; with the exception of
a few functions and flags listed below.
The following functions can be safetely called before Python is initialized:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyImport_AppendInittab(), PyImport_ExtendInittab(), PyInitFrozenExtensions().

* :c:func:`PyObject_SetArenaAllocator`
* :c:func:`Py_DecodeLocale`
* :c:func:`Py_GetProgramName`
* :c:func:`Py_SetPath`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Py_GetProgramName is included, shouldn't other Py_Get* functions be included? Py_GetPath(), Py_GetVersion(), and many others.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Py_GetPath() must not be called before Py_Initialize().

I added other Py_GetXXX() functions.

.. :c:data: Py_BytesWarningFlag
The :option:`-b` option.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong indentation (should be 3 spaces).

Document what is the value if use -bb. And the same for -OO below.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

and options. By default, these flags are controlled by :ref:`command line
options `.
.. :c:data: Py_BytesWarningFlag

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. c:var:: Py_BytesWarningFlag

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@vstinner

@vstinner

vstinner

Private flag used by ``_freeze_importlib`` and ``frozenmain`` programs.
.. c:var:: Py_UseClassExceptionsFlag

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Mariatta

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed a typo and grammar issues :)

be called before using any other Python/C API functions; with the exception of
a few functions and the :ref:`global configuration variables `:
The following functions can be safetely called before Python is initialized:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo :) safetely -> safely

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Crap. "safetely" looked weird, so I used my spell check: "Google safetely", but I found results. Maybe there is an issue with my spell checker? :-)

.. c:var:: Py_DebugFlag
Turn on parser debugging output (for wizards only, depending on compilation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the word wizards and thought of magicians :) Do you mean something like setup wizard?
Or only a real wizard/magician can use this flag? 😅
Maybe this is a common terminology and I'm just too new.. 🤷‍♀️

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Victor meant real wizards, like his and me. 🎩

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok in that case I suggest something less ambiguous, like "for expert" or "for experienced core devs" only.
This sentence also missed a period (.) at the end.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording already is used two times in the documentation. In particularly in the documentation for the -d option. This is very old documentation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In sum, this option is only for those who understand this jargon. 😉

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the doc from the -d option documentation.

Ok, I replaced "for wizards only" with "for expert only".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the original wording. It is used also in the man page since the initial edition in 1994.

Set to ``1`` if the :envvar:`PYTHONLEGACYWINDOWSFSENCODING` environment
variable is set to a non-empty string.
See also the :pep:`529`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I picked: See :pep:529 for more details.

Set to ``1`` if the :envvar:`PYTHONLEGACYWINDOWSSTDIO` environment
variable is set to a non-empty string.
See also the :pep:`528`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "the".

@bedevere-bot

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

serhiy-storchaka

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding an explicit note in the description of all of these functions that they can be called before Py_Initialize()?

Set to ``1`` if the :envvar:`PYTHONLEGACYWINDOWSFSENCODING` environment
variable is set to a non-empty string.
See also the :pep:`529`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See :pep:`529` for more details.
* :c:func:`PyMem_SetupDebugHooks`
* :c:func:`PyObject_GetArenaAllocator`
* :c:func:`PyObject_SetArenaAllocator`
* :c:func:`Py_GetBuildInfo`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would group Py_GetBuildInfo() etc as "informative functions".

.. c:var:: Py_DebugFlag
Turn on parser debugging output (for wizards only, depending on compilation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Victor meant real wizards, like his and me. 🎩

@@ -1092,7 +1315,7 @@ Python-level trace functions in previous versions.
+------------------------------+--------------------------------------+
.. c:var:: int PyTrace_CALL
.. c:var::: int PyTrace_CALL

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the third colon redundant?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, it's a mistake. Fixed.

@vstinner

@vstinner

I have made the requested changes; please review again.

@bedevere-bot

Thanks for making the requested changes!

@Mariatta: please review the changes made to this pull request.

@vstinner

@vstinner

Mariatta

@Mariatta

Thanks @vstinner.
I personally prefer the word "experts" over "wizards". It's not a jargon or idiom, something the general public will understand :)
Thanks for changing it.

@serhiy-storchaka

This option is not for the general public.

If change the wording, you should change also the man page. I think it is better to open a separate issue for changing wizards to experts. This change is not related to this issue.

@vstinner

@vstinner

This option is not for the general public.

"expert only" doesn't mean "generic public", it's more the opposite.

If change the wording, you should change also the man page.

Done.

@vstinner

Thank you for your reviews @serhiy-storchaka and @Mariatta! I'm not confortable when I have to modify the documentation, so your feedback made me more confident in my change :-)

ericsnowcurrently

.. note::
The following functions **should not be called** before

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind putting these into a list?

* :c:func:`PyImport_AppendInittab`
* :c:func:`PyImport_ExtendInittab`
* :c:func:`PyInitFrozenExtensions`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is PyInitFrozenExtensions() in this list?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a request from @serhiy-storchaka. I don't know the rationale :-)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is called before Py_Initialize() in Python/frozenmain.c on Windows.

* :c:func:`PyInitFrozenExtensions`
* :c:func:`PyMem_SetAllocator`
* :c:func:`PyMem_SetupDebugHooks`
* :c:func:`PyObject_SetArenaAllocator`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we promising that PyObject_SetArenaAllocator() (and PyObject_GetArenaAllocator) is available before initialization? What do embedders need it for before init? Can we hold off adding it to this list until necessary?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as for PyMem_SetAllocators: the embedding application needs to be able to specify how memory allocation during Py_Initialize() itself should be handled.

* :c:func:`PyImport_AppendInittab`
* :c:func:`PyImport_ExtendInittab`
* :c:func:`PyInitFrozenExtensions`
* :c:func:`PyMem_SetAllocator`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much of the PyMem_*() API do we really need to promise to embedders? AFAICS, we've only required them to use PyMem_RawFree() before initialization, due to Py_DecodeLocale(). The less we promise pre-init, the better.

Ideally we would provide equivalent pre-init functions, along with pre-init allocator state, that embedders could use if they needed.
If we do end up needing more, could we instead provide equivalent pre-init functions that take explicit allocator state that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the allocators pre-initialization is required, as that allows the embedding application to control exactly how CPython allocates memory during startup.

.. _global-conf-vars:
Global configuration variables

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some ways, I'd rather we not advertise these flags as global variables, and particularly not as stable pre-init state. If the goal is PEP 432 then further publicizing this state helps to entrench and persist the API from which we're trying to move away.

On the other hand, PEP 432 and its API is not accepted (yet) and the status quo suggests that the documentation you've added here for existing state is useful.

From my biased position, I'd favor leaving this part out for now, but only if we're serious about PEP 432. :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PEP 432, I want us to consider "Don't break backwards compatibility with the current embedding API" as a hard constraint. Part of that process is going to be reverse engineering what the current embedding API actually is in practice, since we don't have it fully documented, and we definitely don't have it fully tested.

So in a post-PEP-432 world, we'll end up with a situation where we have a "Legacy Embedding API" that we promise not to break (at least, not prior to Python 4.0 at the very earliest), and then a more fine-grained configuration-struct-based one.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some ways, I'd rather we not advertise these flags as global variables, and particularly not as stable pre-init state.

What do you mean by "not stable"? The inital state of these flags is well defined.

These flags are already actively used in the wild, it's lilely to only way to tune some Python options.

Not documenting them only makes Python harder to embed, it doesn't prevent users to use it.

Moreover, variables are exported as public symbols (prefixed with "Py", not "_Py").

@ericsnowcurrently

Sorry I didn't get my review out sooner.

ncoghlan

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vstinner, this looks good to me, and defines a solid set of backwards compatibility constraints for the PEP 432 refactoring work.

* :c:func:`PyImport_AppendInittab`
* :c:func:`PyImport_ExtendInittab`
* :c:func:`PyInitFrozenExtensions`
* :c:func:`PyMem_SetAllocator`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the allocators pre-initialization is required, as that allows the embedding application to control exactly how CPython allocates memory during startup.

* :c:func:`PyInitFrozenExtensions`
* :c:func:`PyMem_SetAllocator`
* :c:func:`PyMem_SetupDebugHooks`
* :c:func:`PyObject_SetArenaAllocator`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as for PyMem_SetAllocators: the embedding application needs to be able to specify how memory allocation during Py_Initialize() itself should be handled.

@vstinner

@ericsnowcurrently: I created the PR #4621 to change the formatting of the note. I also replied to your comments. Do you think that someone should be changed in the new PR #4621? Maybe comment there directly.