GitHub - jsbueno/extrainterpreters: Utilities for using Python's PEP 554 subinterpreters (original) (raw)

extrainterpreters

Utilities to make use of new "interpreters" module in Python 3.12

Status: beta

Usage: Works with cPython >= 3.12

Just import "extrainterpreters" and use the Interpreter class as a wrapper to the subinterpreter represented by a single integer handle created by the new interpreters module:

import extrainterpreters as ET

import time

def my_function(): time.sleep(1) print("running in other interpreter") return 42

interp = ET.Interpreter(target=my_function) interp.start() print("running in main interpreter") time.sleep(1.5) print(f"computation result: {interp.result()}")

history

PEP 554 Describes a Python interface to make use of Sub-interpreters- a long living feature of cPython, but only available through C code embedding Python. While approved, the PEP is not in its final form, and as of Python 3.13 the interpreters module suggested there is made available as _interpreters. (And before that, in Python 3.13, it is available as _xxsubinterpreters.

With the implementation of PEP 684 before Python 3.12, using subinterpreters became a lot more interesting, because each subinterpreter now has an independent GIL. This means that different interpreters (running in different threads) can actually execute Python code in parallel in multiple CPU cores.

"extrainterpreters" offer a nice API to make use of this feature before PEP 554 becomes final, and hopefully, will keep offering value as a nice API wrapper when it is final. Enjoy!

forms of use

with an extrainterpreter.Interpreter instance, one can call:inter.run(func_name, *args, **kwargs}) to run code there in the same thread: the return value is just returned as long as it is pickleable. The inter.run_in_thread with the same signature, will start a fresh thead and call the target function there - the Interpreter instance then should be pooled in its "done()" method, and once done, the return value will be available by calling "result".

Also working is a threading like interface:

from extrainterpreters import Interpreter

...

    interp = Interpreter(target=myfunc, args=(...,))
    interp.start()
    ...
    interp.join()

    # bonus: we have return values!

    print(interp.result()

It will even work for methods defined on the REPL, so, just give it a try.

There is a new class in the works which will be able to run code in the sub-interpreter without depending of the PEP554 "run_string" (once the setup is done). All processing in the sub- interpreter will automatically take place in another thread, and the roadmap is towards having a Future object and an InterpreterPoolExecutor compatible with the existing executors in concurrent.Futures

A lot of things here are subject to change. For now, all data exchange takes place in a custom memory buffer with 10MB (it is a bytearray on the parent interpreter, and a memoryview for that on the target)

Data is passed to and back the interpreter using pickle - so, no black magic trying to "kill an object here" and "make it be born there" (not even if I intended to, as ctypes is not currently importable on sub-interpreters).

The good news is that pickle will take care of importing whatever is needed on the other side, so that a function can run.

API and Usage:

In beta stage the suggestion is to use this as one would threading.Thread - see the example above - and do not rely on the provided Queue class (seriously it is broken right now), until some notice about it is given.

Roadmap

first the basics:

we should get Lock, Rlock, Queue and some of the concurrency primitives
as existing for threads, async and subprocessing working - at that point
this should be production quality

second, the bells and whistles

I plan to get this compatible with concurrent.futures.Executor, and have an easy way to schedule a subinterpreter function execution as an async task.

Also, I should come up with a Queue object to pass data back and forth. As for the data passing mechanism: we will use pickle, no doubt.

Architecture

The initial implementation used a pre-allocated mmaped file to send/get data from other interpreters, until I found out the mmap call in the sub-interpreter would build a separate mmap object in a different memory region, and a lot of data copying was taking place.

Currently, two C methods retrieve the address of an object implementing the buffer interface, and create a "stand alone" memoryview instance from the address alone. This address is passed to the subinterpreter as a string during the setup stage,