[Python-Dev] A new JIT compiler for a faster CPython?

Victor Stinner victor.stinner at gmail.com
Tue Jul 17 20:38:37 CEST 2012


Hi,

I would like to write yet another JIT compiler for CPython. Before writing anything, I would like your opinion, because I don't know other Python compilers well. I also want to prepare for a possible integration into CPython from the beginning of the project, or at least stay very close to the CPython project (and CPython developers!). I did not understand exactly why the Unladen Swallow and psyco projects failed, so please tell me if you think that my project is going to fail too!

== Why? ==

CPython is still the reference implementation: new features are added to it first (e.g. PyPy does not support Python 3 yet, although there is a project to add that support). Some projects still rely on low-level properties of CPython, especially its C API (e.g. numpy; PyPy has a cpyext module to emulate the CPython C API).

A JIT is the most promising solution to speed up the main evaluation loop: using a JIT, it is possible to compile a function for a specific type on the fly and so enable deeper optimizations.

psyco is no longer maintained. It had its own JIT, which is complex to maintain. For example, it is hard to port to new hardware.

LLVM is fast and the next version will be faster. LLVM has a community, documentation and a lot of tools, and it is actively developed.

There are many Python compilers which are very fast, but most of them support only a subset of Python or require modifying the code (e.g. specifying the type of all parameters and variables). For example, you cannot run Django with Shed Skin.

IMO PyPy is complex and hard to maintain. PyPy has a design completely different from CPython's; it is much faster and has a better memory footprint. I don't expect to be as fast as PyPy, just faster than CPython.

== General idea ==

I don't want to replace CPython. This is an important point. All other Python compilers try to write something completely new, which is a huge task and makes it hard to stay compatible with CPython. I would like to reuse as much CPython code as possible and not fight against the GIL or reference counting, but cooperate with them instead.

I would like to use a JIT to generate specialized functions for a combination of argument types. Specialization enables more optimizations. I would like to use LLVM because LLVM is an active project, has many developers and users, is fast and the next version will be faster! LLVM already supports common optimizations like inlining.
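
To make the specialization idea concrete, here is a minimal pure-Python sketch of dispatching on the combination of argument types. It is only an illustration, not the proposed design: the `specialize` decorator, the `fast_paths` table and `add_int_int` are hypothetical names, and a hand-written function stands in for the machine code that a real JIT would emit through LLVM.

```python
import functools


def specialize(func):
    """Dispatch calls to a variant chosen by the argument types (hypothetical helper)."""
    cache = {}  # maps a tuple of argument types to the callable to run

    @functools.wraps(func)
    def wrapper(*args):
        key = tuple(type(arg) for arg in args)
        impl = cache.get(key)
        if impl is None:
            # A real JIT would compile a specialized version here. This
            # sketch only looks up a hand-written fast path and falls
            # back to the generic function when none is registered.
            impl = wrapper.fast_paths.get(key, func)
            cache[key] = impl
        return impl(*args)

    wrapper.fast_paths = {}
    wrapper.cache = cache
    return wrapper


@specialize
def add(a, b):
    # Generic path: works for any types that support "+".
    return a + b


def add_int_int(a, b):
    # Stand-in for code specialized for (int, int); assumed, not generated.
    return int.__add__(a, b)


add.fast_paths[(int, int)] = add_int_int

print(add(2, 3))      # runs the (int, int) variant
print(add("a", "b"))  # runs the generic function
```

The type tuple acts as a guard: a call with types that have no specialized variant still takes the generic path, so behaviour stays the same as the unspecialized function.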

My idea is to emit the same code as ceval.c from the bytecode, to be fully compatible with CPython, and then write a JIT to optimize functions for a specific type.
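
As a small illustration of the input such a compiler would start from, the sketch below lists the bytecode of a function with the standard `dis` module. The `list_opcodes` helper is a hypothetical name, and the exact opcode names printed depend on the CPython version.

```python
import dis


def add(a, b):
    return a + b


def list_opcodes(func):
    """Return (opcode name, argument description) pairs for func's bytecode."""
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]


ops = list_opcodes(add)
for opname, argrepr in ops:
    # A compiler front end would translate each instruction into the same
    # operations that ceval.c performs for that opcode.
    print(opname, argrepr)
```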

== Roadmap ==

-- Milestone 1: Proof of concept --

The pymothoa project can be used as a base to quickly implement such a proof of concept.

-- Milestone 2: Specialized function for the int type --

-- Milestone 3: JIT --

At this step, we can start benchmarking to check whether the (JIT) compiler is faster than CPython.
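
A benchmark at this step could be as simple as the following sketch using the standard `timeit` module. The names `baseline`, `candidate` and `best_time` are hypothetical, and `candidate` is only a placeholder for a function produced by the compiler: here it just calls the built-in `sum`.

```python
import timeit


def baseline(n):
    # Plain Python loop, run by the normal CPython evaluation loop.
    total = 0
    for i in range(n):
        total += i
    return total


def candidate(n):
    # Placeholder for the compiled version (assumption for this sketch).
    return sum(range(n))


def best_time(func, arg, repeat=5, number=1000):
    """Return the best total time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(lambda: func(arg), repeat=repeat, number=number))


# Check that both versions agree before comparing their speed.
assert baseline(1000) == candidate(1000)

print("baseline :", best_time(baseline, 1000))
print("candidate:", best_time(candidate, 1000))
```

Taking the minimum over several repeats reduces noise from other processes on the machine.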

-- Later (unsorted ideas) --

== Other Python VM and compilers ==

-- Fully Python compliant --

-- Subset of Python to C++ --

-- Subset of Python --

-- Language very close to Python --

== See also ==

Victor Stinner


