[Python-Dev] Python VM

"Martin v. Löwis" martin at v.loewis.de
Tue Jul 22 00:17:56 CEST 2008


Jakob,

This looks fairly correct. A few comments below.

Control Flow
============

The calling sequence is: main() (in python.c) -> Py_Main() (main.c) -> PyRun_FooFlags() (pythonrun.c) -> run_bar() (pythonrun.c) -> PyEval_EvalCode() (ceval.c) -> PyEval_EvalCodeEx() (ceval.c) -> PyEval_EvalFrameEx() (ceval.c).

What this misses is the compiler stuff, i.e. PyParser_ASTFromFoo and PyAST_Compile, which precede the call to PyEval_ (at least if no byte code file is available).
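The same parse, compile, evaluate ordering can be seen from Python itself. The following is only a rough sketch: ast.parse(), compile() and exec() are the Python-level entry points, and mapping them onto the C functions named above is an approximation rather than an exact one-to-one correspondence. The source string and the "<example>" filename are made up for illustration.

```python
import ast

source = "result = 6 * 7"  # illustrative input, not from the original mail

# Step 1: parse source text into an AST (roughly the PyParser_ASTFrom* stage).
tree = ast.parse(source, filename="<example>", mode="exec")

# Step 2: compile the AST into a code object (roughly the PyAST_Compile stage).
code = compile(tree, filename="<example>", mode="exec")

# Step 3: hand the code object to the evaluator (the PyEval_* stage).
namespace = {}
exec(code, namespace)

print(type(tree).__name__, type(code).__name__, namespace["result"])
```

When a valid byte code file exists, steps 1 and 2 are skipped and the code object is loaded from that file instead.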

Threads
=======

PyEval_InitThreads() initializes the GIL (interpreter_lock) and sets main_thread to the (threading package dependent) ID of the current thread. Thread switching is done using PyThreadState_Swap(), which sets _PyThreadState_Current (both defined in pystate.c) and PyThreadState_GET() (an alias for _PyThreadState_Current) (pystate.h).

True. However, in most cases this is triggered through Py_BEGIN_ALLOW_THREADS, which passes NULL for the new thread. The actual switching occurs by releasing the GIL, not by PyThreadState_Swap.

Actually, Python doesn't dispatch threads at all. It just releases the GIL, giving the operating system permission to wake up a different thread - which the operating system may or may not choose to do. After some time, the original thread will try to reacquire the GIL. Assuming the OS applies fairness, it will not get it back if a different thread was also waiting for it, so our thread will block - and then the OS will dispatch (at the latest).
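The effect of releasing the GIL around a blocking call can be observed from Python. This is a minimal sketch, assuming that time.sleep() releases the GIL while it waits (as blocking calls wrapped in Py_BEGIN_ALLOW_THREADS do); the helper names blocking_call and run_threads are made up for illustration.

```python
import threading
import time

def blocking_call(seconds):
    # time.sleep() gives up the GIL while waiting, so other threads
    # are free to run if the operating system schedules them.
    time.sleep(seconds)

def run_threads(count, seconds):
    """Start `count` threads that each block for `seconds`; return wall time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start

elapsed = run_threads(4, 0.2)
print(f"4 threads x 0.2 s of blocking took {elapsed:.2f} s")
```

If the waits were serialized, the total would be around 0.8 s. Because each thread releases the GIL while blocked, the total should stay close to 0.2 s, although the exact figure depends on the OS scheduler.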

State
=====

The global state is recorded in a (per-process?) PyInterpreterState struct and a per-thread PyThreadState struct.

Yes and no. In principle, multiple interpreter states are supported per process (and the current interpreter is identified by thread). However, there are many limitations and quirks in the multiple-interpreter code.

Each execution frame's state is contained in that frame's PyFrameObject (which includes the instruction stream, the environment (globals, locals, builtins, etc.), the value stack and so forth). EvalFrameEx()'s local variables are initialized from this frame object.

Not only. A lot of stuff also lives on the regular C stack, which exists in parallel to the frame object stack (the latter being a spaghetti stack).
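The linked structure of the frame objects is visible from Python. The sketch below uses sys._getframe(), which is a CPython-specific accessor, and follows each frame's f_back pointer to its caller; the function names frame_chain, inner and outer are made up for illustration. It shows only the frame-object side, not what lives on the C stack.

```python
import sys

def frame_chain():
    """Return the code names on the Python frame stack, innermost first."""
    names = []
    frame = sys._getframe()      # frame object of this very call
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back     # each frame points at its caller's frame
    return names

def inner():
    return frame_chain()

def outer():
    return inner()

chain = outer()
print(chain[:3])
```

The first three entries are frame_chain, inner and outer; what follows depends on how the script was started. Each frame object also exposes f_globals, f_locals and f_builtins, i.e. the environment mentioned above.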

The instruction stream looks as follows (c.f. assemble_emit() in compile.c):

See also dis.py for the inverse operation.
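As a small illustration of going from byte code back to readable instructions, the dis module can be applied to any function. This is just a sketch: add_one is a made-up example, and the exact opcode names and encoding differ between Python versions, so no particular listing should be expected.

```python
import dis

def add_one(x):
    return x + 1

# The raw instruction stream, as stored on the code object.
raw = add_one.__code__.co_code

# The decoded view: one entry per instruction.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]

print(len(raw), "bytes of byte code")
print(opnames)
```

Calling dis.dis(add_one) prints the same information as a formatted listing, including the argument of each instruction.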

Basic structure
---------------

EvalFrameEx() {

Somewhere you need to merge the thread-switching for threads that have been executing a lot of instructions.
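How often a busy thread offers the GIL to others is tunable from Python. Note the assumption here: the interpreter discussed in this thread counted executed instructions (sys.getcheckinterval() / sys.setcheckinterval()), whereas CPython 3.2 and later use a time-based interval, which is what the sketch below reads and changes.

```python
import sys

# Current interval, in seconds, after which a running thread is asked
# to release the GIL so that another thread may take it.
interval = sys.getswitchinterval()
print(f"switch interval: {interval} s")

# Ask for more frequent switching, then put the old value back.
sys.setswitchinterval(interval / 2)
halved = sys.getswitchinterval()
sys.setswitchinterval(interval)
restored = sys.getswitchinterval()

print(f"halved: {halved} s, restored: {restored} s")
```

As with the explicit release around blocking calls, this only gives other threads the opportunity to run; which thread actually runs next is still up to the operating system.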

- Objects are transferred onto the value stack by GETITEM()'ing them from consts or names, or by GETLOCAL()'ing them using oparg as an offset into fastlocals (c.f. LOAD* instructions).

Or, of course, as the result from some operation or function call, or load from a global variable, or import, or ...
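To make the value stack concrete, here is a toy stack machine written in Python. It is not how ceval.c is implemented, only a sketch of the idea: the function run, the four opcode names it accepts and the sample program are all made up for illustration, with consts and fastlocals standing in for the arrays of the same name.

```python
def run(instructions, consts, fastlocals):
    """Execute (opname, oparg) pairs on a value stack and return the result."""
    stack = []
    for opname, oparg in instructions:
        if opname == "LOAD_CONST":
            stack.append(consts[oparg])        # like GETITEM(consts, oparg)
        elif opname == "LOAD_FAST":
            stack.append(fastlocals[oparg])    # like GETLOCAL(oparg)
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)         # result of an operation
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opname}")
    raise RuntimeError("instruction stream ended without RETURN_VALUE")

# Roughly "return x + 1" with x stored in fastlocals[0].
program = [
    ("LOAD_FAST", 0),
    ("LOAD_CONST", 0),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
result = run(program, consts=(1,), fastlocals=[41])
print(result)
```

The two LOAD instructions push operands taken from fastlocals and consts, while BINARY_ADD pushes a value that came from neither, which is the "result from some operation" case.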

Regards, Martin


