Add a PyThreadState * parameter (almost) everywhere · Issue #132312 · python/cpython

This is part feature request, part performance issue and part a general appeal for assistance.

We should add a new variant taking a PyThreadState * parameter for most C API and internal functions.

There are two motivations for this: performance, and a consistent C API for the future.

Performance

The PyThreadState struct is ubiquitous in the VM: it controls stack usage, holds the freelists, holds the current exception, and so on.

Consequently, many C functions, both API and internal, take a PyThreadState *tstate parameter.
However, for historical reasons, many other C functions, both API and internal, do not take such a parameter.

This leads to some inefficiencies that are fairly easy to fix. Suppose spam() and eggs() take a thread state, but ham() does not. If spam() calls ham(), which calls eggs(), then ham() is forced to load the thread state from thread local storage in order to pass it to eggs().
Adding a PyThreadState *tstate parameter to ham() avoids the need to access thread local storage.
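
The spam()/ham()/eggs() chain can be sketched in plain C. This is only an illustration under stated assumptions: MockThreadState, get_tstate() and the tls_lookups counter are hypothetical stand-ins (they are not CPython names), and the _old/_new suffixes exist only so both versions fit in one file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState; the real struct is much larger. */
typedef struct {
    int depth;
} MockThreadState;

static _Thread_local MockThreadState current_tstate;
static int tls_lookups = 0;  /* counts simulated thread local loads */

/* Hypothetical stand-in for fetching the thread state from thread local storage. */
static MockThreadState *get_tstate(void) {
    tls_lookups++;
    return &current_tstate;
}

/* eggs() already takes a thread state. */
static int eggs(MockThreadState *tstate) {
    assert(tstate != NULL);
    return ++tstate->depth;
}

/* Before: ham() has no tstate parameter, so it must look one up for eggs(). */
static int ham_old(void) {
    return eggs(get_tstate());
}

/* After: ham() simply forwards the pointer it was given. */
static int ham_new(MockThreadState *tstate) {
    return eggs(tstate);
}

/* Before: spam() holds a tstate but has no way to pass it through ham(). */
static int spam_old(MockThreadState *tstate) {
    (void)tstate;
    return ham_old();
}

/* After: the tstate flows through the whole call chain. */
static int spam_new(MockThreadState *tstate) {
    return ham_new(tstate);
}
```

In this sketch, a call to spam_old() costs one extra thread local lookup per call, while spam_new() costs none. How much that matters in the real VM depends on how hot the call path is.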

Consistent, portable, future looking C API

In order to support things like tagged integers, we are going to need a new C API.

It is out of scope to discuss what such an API would look like, but all realistic proposals so far pass a "context" parameter to all, or almost all, API functions. Using a PyThreadState * parameter everywhere would make a mechanical transformation from the old API to the new one much simpler.

Unfortunately, the C API is large and cannot be changed, so we will need many new functions, like ham_tstate(), which replicates ham() but with a thread state parameter.
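
One way the ham()/ham_tstate() pairing might look is sketched below. It assumes the existing function keeps its signature and becomes a thin wrapper around the new variant; the issue does not prescribe this layout. MockThreadState and mock_get_tstate() are hypothetical stand-ins for PyThreadState and the thread local lookup, and the doubling body is a placeholder for real work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState. */
typedef struct {
    int calls;
} MockThreadState;

static _Thread_local MockThreadState current_tstate;

/* Hypothetical stand-in for loading the thread state from thread local storage. */
static MockThreadState *mock_get_tstate(void) {
    return &current_tstate;
}

/* New variant: same behavior as ham(), but the caller supplies the thread state. */
int ham_tstate(MockThreadState *tstate, int x) {
    assert(tstate != NULL);
    tstate->calls++;   /* placeholder for work that needs the thread state */
    return x * 2;      /* placeholder result */
}

/* Existing entry point: signature unchanged, so current callers keep working. */
int ham(int x) {
    return ham_tstate(mock_get_tstate(), x);
}
```

Callers that already hold a thread state can switch to ham_tstate() one at a time, while ham() keeps paying for the lookup on behalf of everyone else.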

This work is largely mechanical, and can be done by inexperienced contributors. Hence the appeal for assistance.