Goals and Architecture Overview - RPython Documentation


High Level Goals

Traditionally, language interpreters are written in a target platform language such as C/POSIX, Java or C#. Each implementation provides a fundamental mapping between application source code and the target environment. One of the goals of the “all-encompassing” environments, such as the .NET framework and to some extent the Java virtual machine, is to provide standardized, higher-level functionality that supports language implementers in writing language implementations.

PyPy is experimenting with a more ambitious approach. We use a subset of the high-level language Python, called RPython, in which we write languages as simple interpreters with few references to and dependencies on lower-level details. The RPython toolchain produces a concrete virtual machine for the platform of our choice by inserting the appropriate lower-level aspects. The result can be customized by selecting other feature and platform configurations.
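To make "simple interpreter" concrete, here is a minimal sketch of a toy stack-language interpreter written in a deliberately plain style. The opcodes and the `interpret` function are hypothetical and are not part of PyPy or RPython. The code is ordinary Python and has not been run through the RPython toolchain. It only illustrates an interpreter that contains no memory management or platform-specific detail.

```python
# Hypothetical opcodes for a toy stack language (not part of PyPy or RPython).
PUSH, ADD, MUL, HALT = range(4)


def interpret(program):
    """Run a list of (opcode, argument) pairs and return the top of the stack."""
    stack = []
    pc = 0  # program counter
    while pc < len(program):
        opcode, arg = program[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(arg)
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            break
        else:
            raise ValueError("unknown opcode")
    return stack[-1]


# Encodes (2 + 3) * 4; the argument is ignored by every opcode except PUSH.
program = [(PUSH, 2), (PUSH, 3), (ADD, 0), (PUSH, 4), (MUL, 0), (HALT, 0)]
result = interpret(program)
```

Nothing in the sketch says how the stack is allocated, when memory is reclaimed, or how the dispatch loop is compiled. Those are the lower-level aspects the text leaves to the toolchain.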

Our goal is to provide a possible solution to a problem that language implementers face: having to write l * o * p interpreters for l dynamic languages and p platforms with o crucial design decisions. PyPy aims to make it possible to change each of these variables independently.

By contrast, a standardized target environment, say .NET, enforces p=1 as far as it’s concerned. This helps make o a bit smaller by providing a higher-level base to build upon. Still, we believe that enforcing the use of one common environment is not necessary. PyPy’s goal is to give weight to this claim, at least as far as language implementation is concerned, by showing an approach to the l * o * p problem that does not rely on standardization.
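A small piece of arithmetic gives a feel for the size of the l * o * p problem. The numbers below are purely illustrative and are not taken from the PyPy project. The additive model for independently varying axes is a simplifying assumption, not a claim made by the text.

```python
# Illustrative numbers only; they are not taken from the PyPy project.
languages = 3         # l
design_decisions = 4  # o
platforms = 5         # p

# One hand-written interpreter per combination of language,
# design decision and platform.
monolithic = languages * design_decisions * platforms

# Assumption: if each axis can be changed independently, the work grows
# roughly with the number of pieces on each axis, not with their product.
independent = languages + design_decisions + platforms

# Fixing the platform (p = 1), as a standardized environment does,
# removes one factor but keeps the product of the remaining two.
standardized = languages * design_decisions * 1
```

Under these assumed numbers the product is 60 and the sum is 12. Fixing p=1 still leaves 12 combinations, all tied to a single platform.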

The most ambitious part of this goal is to generate Just-In-Time compilers in a language-independent way, instead of only translating the source interpreter into an interpreter for the target platform. This is an area of language implementation that is commonly considered very challenging because of the complexity involved.

Architecture

The job of the RPython toolchain is to translate an RPython program into an efficient version of that program for one of the various target platforms, generally one that is considerably lower-level than Python.

The approach we have taken is to reduce the level of abstraction of the source RPython program in several steps, from the high level down to the level of the target platform, whatever that may be. Currently we support two broad flavours of target platforms: the ones that assume a C-like memory model with structures and pointers, and the ones that assume an object-oriented model with classes, instances and methods (as, for example, the Java and .NET virtual machines do).
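The two flavours of target platform can be illustrated roughly with the same linked list expressed both ways. This is only a sketch: it uses CPython's standard `ctypes` module as a stand-in for a C-like memory model, and the names `CNode`, `Node` and `c_style_sum` are hypothetical. Neither half is output of the RPython toolchain.

```python
import ctypes


# Flavour 1: C-like model. Data is a struct; links are raw pointers.
class CNode(ctypes.Structure):
    pass


# Fields are assigned after the class exists so the struct can point to itself.
CNode._fields_ = [("value", ctypes.c_int), ("next", ctypes.POINTER(CNode))]


def c_style_sum(head):
    """Walk the list by following pointers until a NULL pointer is reached."""
    total = 0
    ptr = head
    while ptr:  # a NULL ctypes pointer is falsy
        total += ptr.contents.value
        ptr = ptr.contents.next
    return total


# Flavour 2: object-oriented model. Data and behaviour live on instances.
class Node(object):
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

    def total(self):
        rest = self.next.total() if self.next is not None else 0
        return self.value + rest


# Build the list 1 -> 2 -> 3 in both representations.
c3 = CNode(3)                      # 'next' defaults to a NULL pointer
c2 = CNode(2, ctypes.pointer(c3))
c1 = CNode(1, ctypes.pointer(c2))
c_sum = c_style_sum(ctypes.pointer(c1))

oo_sum = Node(1, Node(2, Node(3))).total()
```

In the first form the traversal is a free function that dereferences pointers. In the second form it is a method dispatched on an instance. That mirrors the distinction the text draws between the two families of back-end.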

The RPython toolchain never sees the RPython source code or syntax trees, but rather starts with the code objects that define the behaviour of the function objects one gives it as input. It can be considered as “freezing” a pre-imported RPython program into an executable form suitable for the target platform.
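The starting point described above, a code object attached to an already-imported function, can be inspected from ordinary Python. The sketch below uses only the standard `__code__` attribute and the standard-library `dis` module. It does not call any RPython toolchain API, and `TABLE` and `square_lookup` are hypothetical names.

```python
import dis

# Arbitrary Python can run at import time; by the time a function object
# is handed over, this table already exists as a live object.
TABLE = [i * i for i in range(10)]


def square_lookup(i):
    return TABLE[i]


# The code object carries the function's behaviour as bytecode,
# independent of the source text it was compiled from.
code = square_lookup.__code__
argument_names = code.co_varnames   # local variable names, arguments first
raw_bytecode = code.co_code         # bytes; exact contents vary by Python version

# Decode the bytecode into instruction names for inspection.
opnames = [instruction.opname for instruction in dis.get_instructions(square_lookup)]
```

The exact instruction names depend on the Python version, so none are listed here. Everything needed to analyse the function is reachable from the function object itself, without re-reading the source file.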

The steps of the translation process can be summarized as follows:

This process is described in much more detail in the document about the RPython toolchain and in the paper “Compiling dynamic language implementations”.

Further reading