[Python-Dev] Python 3 optimizations...

stefan brunthaler stefan at brunthaler.net
Fri Jul 23 20:26:27 CEST 2010


> How do you generate the specialized opcode implementations?

I have a small code generator written in Python that uses Mako templates to generate C files that can be included in the main interpreter. It is a data-driven approach that uses type information gathered by gdb and checks whether a given type implements, for instance, an nb_add method.
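A minimal sketch of such a data-driven generator follows. It is an illustration under stated assumptions, not the actual tool: it uses the standard library's string.Template instead of Mako, and the names TYPE_SLOTS, SLOT_TO_OP, OPCODE_TEMPLATE and generate_opcodes are hypothetical. The type table is hard-coded, illustrative data standing in for what would be gathered with gdb.

```python
from string import Template

# Hypothetical, hand-written type table. In the setup described above this
# information would come from gdb; the entries here are only illustrative.
TYPE_SLOTS = {
    "PyFloat_Type": {"prefix": "FLOAT",
                     "slots": {"nb_add", "nb_subtract", "nb_multiply"}},
    "PyLong_Type": {"prefix": "LONG",
                    "slots": {"nb_add", "nb_subtract", "nb_multiply"}},
    "PyUnicode_Type": {"prefix": "UNICODE",
                       "slots": {"nb_remainder"}},
}

# Number slots we want specialized opcodes for, and the opcode suffix to use.
SLOT_TO_OP = {
    "nb_add": "ADD",
    "nb_subtract": "SUBTRACT",
    "nb_multiply": "MULTIPLY",
}

# C text emitted for one specialized opcode (modelled on the FLOAT_ADD example).
OPCODE_TEMPLATE = Template("""\
TARGET(${prefix}_${op}):
    w = POP();
    v = TOP();
    x = ${ctype}.tp_as_number->${slot}(v, w);
    SET_TOP(x);
    if (x != NULL) FAST_DISPATCH();
    break;
""")


def generate_opcodes(type_slots, slot_to_op):
    """Return C source for every (type, slot) pair the type implements."""
    chunks = []
    for ctype in sorted(type_slots):
        info = type_slots[ctype]
        for slot in sorted(slot_to_op):
            # Only emit an opcode if the type actually provides the slot.
            if slot in info["slots"]:
                chunks.append(OPCODE_TEMPLATE.substitute(
                    prefix=info["prefix"],
                    op=slot_to_op[slot],
                    ctype=ctype,
                    slot=slot,
                ))
    return "\n".join(chunks)


if __name__ == "__main__":
    # The output would be written to a .c file and included in the interpreter.
    print(generate_opcodes(TYPE_SLOTS, SLOT_TO_OP))
```

Because the generator is driven entirely by the table, adding a type or a slot only means adding data, not writing more C by hand.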

> Presumably that is done ahead of time, or you'd have to use a JIT, which is what you're avoiding.

Yes, and yes: I execute the code generator before compiling the Python interpreter, and I am interested in purely interpretative optimization techniques.

> I'm guessing from your comments below about cross-module inlining that you generate a separate .c file with the specialized opcode bodies and then call through to them via a table of function pointers indexed by opcode, but I could be totally wrong.  :)

No, dead on ;) A small example off the top of my head probably illustrates what is going on:

    TARGET(FLOAT_ADD):
        w = POP();
        v = TOP();
        x = PyFloat_Type.tp_as_number->nb_add(v, w);
        SET_TOP(x);
        if (x != NULL) FAST_DISPATCH();
        break;

And I extend the standard indirect threaded code dispatch table to support the FLOAT_ADD operation.
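To make the dispatch-table idea concrete, here is a toy stack machine in Python. It is a sketch under assumptions, not the CPython implementation: the opcode numbers, the VM class and the handler names are all hypothetical. It also assumes that the generic BINARY_ADD rewrites itself to FLOAT_ADD once it sees two floats, and that FLOAT_ADD falls back when its operands are no longer floats. The original message does not describe either mechanism.

```python
# Hypothetical opcode numbers for a toy stack machine.
LOAD, BINARY_ADD, FLOAT_ADD, RETURN = range(4)


def op_load(vm, arg):
    vm.stack.append(arg)


def op_binary_add(vm, arg):
    """Generic add; rewrites itself to FLOAT_ADD when both operands are floats."""
    w = vm.stack.pop()
    v = vm.stack.pop()
    if type(v) is float and type(w) is float:
        # Assumed rewriting step: replace the instruction that just ran.
        vm.code[vm.pc - 1] = (FLOAT_ADD, arg)
    vm.stack.append(v + w)


def op_float_add(vm, arg):
    """Specialized add; calls the float implementation directly."""
    w = vm.stack.pop()
    v = vm.stack.pop()
    if type(v) is not float or type(w) is not float:
        # Assumed guard: operands changed type, so revert to the generic opcode.
        vm.code[vm.pc - 1] = (BINARY_ADD, arg)
        vm.stack.append(v + w)
        return
    vm.stack.append(float.__add__(v, w))


# Dispatch table indexed by opcode, extended with the FLOAT_ADD entry.
DISPATCH = [None] * 4
DISPATCH[LOAD] = op_load
DISPATCH[BINARY_ADD] = op_binary_add
DISPATCH[FLOAT_ADD] = op_float_add


class VM:
    def __init__(self, code):
        self.code = list(code)  # mutable so instructions can be rewritten
        self.stack = []
        self.pc = 0

    def run(self):
        self.stack = []
        self.pc = 0
        while True:
            opcode, arg = self.code[self.pc]
            self.pc += 1
            if opcode == RETURN:
                return self.stack.pop()
            DISPATCH[opcode](self, arg)


if __name__ == "__main__":
    vm = VM([(LOAD, 1.5), (LOAD, 2.25), (BINARY_ADD, None), (RETURN, None)])
    print(vm.run())                     # first run uses the generic opcode
    print(vm.code[2][0] == FLOAT_ADD)   # the add instruction has been rewritten
```

In the C interpreter the table holds label addresses or function pointers rather than Python callables, but the shape is the same: one extra slot per specialized opcode.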

> There are a variety of solutions to getting cross-module inlining these days.  Clang+LLVM support link-time optimization (LTO) via a plugin for gold.  GCC has LTO and LIPO as well.

A PhD colleague from our institute pointed the gold stuff out to me yesterday; I am going to check whether any of these solutions would work. A deeper problem here is that the compilers' heuristics are ill-suited to the needs of compiling an interpreter dispatch routine. I will investigate this further in future research.

> This would be interesting.  We (obviously) have similar instrumentation in unladen swallow to gather type feedback.  We talked with Craig Citro about finding a way to feed that back to Cython for exactly this reason, but we haven't really pursued it.

Ok; I think it would actually be fairly easy to use the type information gathered at runtime by the quickening approach. Several auxiliary functions for dealing with these types could be generated by my code generator as well. It is probably worth looking into this, though my current top priority is my PhD research, so I cannot promise to be able to allocate vast amounts of time to such endeavours.

Best,
--stefan


