Request for PyModule_GetModuleByDef()

February 26, 2026, 4:53pm 1

In moving my code to use module states I have the need to obtain the module object for the “current” module. All I have in the way of context is the PyModuleSpec structure for the module. The following should work (no error checking)…

PyObject *PyModule_GetModuleByDef(struct PyModuleDef *target_def)
{
    PyObject *modules = PySys_GetAttrString("modules");

    Py_ssize_t pos = 0;  /* PyDict_Next() requires pos to start at 0 */
    PyObject *name, *module;

    while (PyDict_Next(modules, &pos, &name, &module))
    {
        void *token;

        /* PyModule_GetToken() takes a void **, so use an untyped token. */
        PyModule_GetToken(module, &token);

        if (token == (void *)target_def)
        {
            Py_DECREF(modules);
            return module;
        }
    }

    Py_DECREF(modules);

    return NULL;
}

…but is there any chance of this being implemented in the stable ABI (and made more efficient of course)?

philthompson (Phil Thompson) February 26, 2026, 5:10pm 2

…and that should have been PyModule_GetModuleByToken(void *mod_token).

da-woods (Da Woods) February 27, 2026, 8:16am 3

I think the issue is going to be that it’s perfectly possible for a moduledef or token to correspond to multiple modules (although in practice it’s relatively unusual).

Personally I think it’d be fairly useful to have something like this for the common case where there is a 1-to-1 link (I’ve ended up creating something similar).

But I suspect that’ll be the objection.

encukou (Petr Viktorin) February 27, 2026, 9:22am 4

A 1-to-1 link of what to what?
If you’re sure only 1 module is loaded per PyModuleDef per process, stash the module in a global variable when you create it.
If you’re sure there’s 1 module per PyModuleDef per interpreter, you can find the module in sys.modules by name, and check that PyModule_GetToken matches. Iterating a dictionary only helps if someone imported it under a different name (and won’t help if someone used importlib directly to bypass the sys.modules cache).

But if you want to fully move to isolated module states, there’ll not be a global variable or registry to look things up in. Set things up so that you get the module (or something that points to it) as an argument.
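The difference between looking a module up by name and scanning sys.modules can be modelled at the Python level. This is only a rough sketch: the `_token` attribute is a hypothetical stand-in for the value PyModule_GetToken returns in C, and the module names are invented for illustration.

```python
import sys
import types

TOKEN = object()  # hypothetical stand-in for the C-level module token


def _has_token(module, token):
    # sys.modules may hold arbitrary objects, so check the type first.
    return (isinstance(module, types.ModuleType)
            and module.__dict__.get("_token") is token)


def find_module(name, token):
    """Return sys.modules[name] if its token matches, else None."""
    module = sys.modules.get(name)
    return module if _has_token(module, token) else None


def find_module_by_scan(token):
    """Fallback: first module in sys.modules with the token, under any name."""
    for module in list(sys.modules.values()):
        if _has_token(module, token):
            return module
    return None


# Register a toy module under a name other than the expected one.
toy = types.ModuleType("toy_ext_demo")
toy._token = TOKEN
sys.modules["toy_ext_demo_alias"] = toy

by_name = find_module("toy_ext_demo", TOKEN)  # misses: registered under an alias
by_scan = find_module_by_scan(TOKEN)          # finds it through the alias

# A module that never entered sys.modules cannot be found by either lookup.
del sys.modules["toy_ext_demo_alias"]
missing = find_module_by_scan(TOKEN)
```

The scan only adds value in the aliased case, and neither lookup sees a module that was loaded without being cached in sys.modules.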

philthompson (Phil Thompson) February 27, 2026, 10:05am 5

That’s possible in most cases but sometimes it isn’t. The implementation
I posted works fine, I was just wondering if there could be a better
implementation that was able to exploit the interpreter internals.

encukou (Petr Viktorin) February 27, 2026, 10:51am 6

When is it not possible? Can we make it possible in that case?

In the interpreter internals, there’s no concept of “current module”. You can get an approximation based on assumptions. “First entry in sys.modules with the given token”, or the other things I mentioned, will work in practice for the usual case, but naming them GetModuleByDef would be misleading.

We probably can add API to improve things, yes. How can we help you get the module passed down the stack, or stored in an appropriate location?

philthompson (Phil Thompson) February 27, 2026, 3:13pm 7

My use case is that I want to monkey patch a PyCFunction defined in
module B into a Python Enum defined in module A. B imports A, A knows
nothing about B.

The context is the generation of bindings for C++ where enums are
wrapped as Python Enums. Library A (wrapped as module A) defines an enum
E. Library B (wrapped as module B) defines a global operator that takes
E as its first argument. This is then implemented in Python as a new
method of E added when module B is imported.

In general terms it doesn’t matter that it’s a Python Enum - it’s a
standard type that uses the standard descriptor.

The problem is that the PyCFunction will not be called with a useful
defining class and hence the only way it can get B’s module object (and
therefore state) is to search sys.modules for a module with a matching
token.

A question…

How can there ever be more than one module in sys.modules with a
particular token?

encukou (Petr Viktorin) February 27, 2026, 4:28pm 8

Are you stuck with PyCFunction? Can the operator be switched to PyCMethod?

Also: I’m not clear on where the module state is needed – does the global operator need A’s state?

sys.modules is just a dict, users can put anything in there.

>>> import sys
>>> import array
>>> sys.modules['myarray'] = array
>>> del sys.modules['array']
>>> import array

>>> sys.modules['myarray'] == sys.modules['array']
False
>>> sys.modules['myarray']
<module 'array' from '.../array.cpython-314-x86_64-linux-gnu.so'>
>>> sys.modules['array']
<module 'array' from '.../array.cpython-314-x86_64-linux-gnu.so'>

(Of course, things can break if you do tricks like this – but the C API shouldn’t.)

philthompson (Phil Thompson) February 27, 2026, 5:53pm 9

I was using PyCFunction generically. The implementation uses
METH_METHOD so the standard descriptor uses PyCMethod to pass the
(not useful) defining class.

The global operator needs B’s state but it can only get A’s state.

da-woods (Da Woods) February 27, 2026, 8:38pm 10

Aside from what Petr said,

PyImport_AppendInittab("name1", init_func);
PyImport_AppendInittab("name2", init_func);

should generate multiple modules from a single def, I think. And there are definitely other ways of doing similar things. Most modules are produced from a file modulename.something.so with a PyInit_modulename function, but they don’t have to be.

philthompson (Phil Thompson) February 28, 2026, 10:44am 11

How about this as a possible solution…

Currently METH_METHOD means that the method has to be a PyCMethod with a defining_class argument added after self. This behaviour is implemented by the getter of the standard descriptor where it passes the defining class as an argument to PyCMethod_New().

What if this was generalised to say that the object passed was (potentially) any Python object, and would be the defining class only in the normal case where the standard Python descriptor was used? This would allow me to create a custom descriptor type where the getter would pass a module object to the method.

To apply it to my use case, when module B is imported it will create an instance of the custom descriptor which will contain a reference to B’s module object. A reference to the descriptor instance will then be placed in A.E’s type dict. When the descriptor’s getter is called it will call PyCMethod_New() and pass B’s module object as the last argument. The method can then access B’s module state.

The wrinkle is that the defining_class is a PyTypeObject rather than a PyObject but that just reflects the current usage and it’s only the method implementation that cares about the exact type.

This works (with a little casting) and has the advantage of requiring no changes to the core. It would need documentation changes to permit the different behaviour.
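The shape of that custom descriptor can be modelled in pure Python. This is a sketch of the idea, not the C implementation: `ModuleBoundMethod`, `scaled`, and the `state` attribute are hypothetical names, and two plain module objects stand in for the extension modules A and B.

```python
import enum
import functools
import types


class ModuleBoundMethod:
    """Non-data descriptor that passes a stored module object to the function."""

    def __init__(self, module, func):
        self._module = module
        self._func = func

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Bind self, then pass the module where defining_class would normally go.
        return functools.partial(self._func, instance, self._module)


# Stand-ins for the two extension modules.
module_a = types.ModuleType("A")
module_b = types.ModuleType("B")
module_b.state = {"scale": 10}  # plays the role of B's module state


class E(enum.Enum):  # the enum defined by "module A"
    ONE = 1
    TWO = 2


module_a.E = E


def scaled(self, module):
    # Runs as a method of A.E but reads B's state through the passed module.
    return self.value * module.state["scale"]


# What importing B would do: place the descriptor in A.E's type dict.
module_a.E.scaled = ModuleBoundMethod(module_b, scaled)

result = module_a.E.TWO.scaled()
```

The method never has to search sys.modules, because the descriptor instance created during B’s import carries the reference to B’s module object.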

philthompson (Phil Thompson) April 5, 2026, 3:45pm 12

Just to tie this off…

I realised that if the current PyCMethod wasn’t doing quite what I want then I can just create something else that does. Doing so has led to quite a few simplifications and generalisations in my code. It also removed my need for PyModule_GetModuleByDef().