[Python-Dev] New project : Spyke python-to-C compiler

Rahul Garg garg1 at ualberta.ca
Mon Apr 7 02:48:50 CEST 2008


Note: this message has been posted to both numpy-discussion and python-dev. Sorry for the cross-posting, but I thought both Python developers and NumPy users would be interested. If you believe your list should not receive this email, let me know. I also wanted to introduce myself, since I may ask questions about Python and NumPy internals from time to time :)

Hi.

I am a student at the University of Alberta doing my master's in computing science. I am writing a Python-to-C compiler as one part of my thesis. The compiler, named Spyke, will be made available in a couple of weeks. It is geared towards scientific applications and will therefore focus mostly on the needs of scientific application developers.

What is Spyke? In many performance-critical projects, it is often necessary to rewrite parts of the application in C. However, writing C wrappers can be time-consuming. Spyke offers an alternative approach: you add annotations to your Python code as strings. The Python interpreter discards these strings, but the Spyke compiler interprets them as type declarations and uses them to generate C.

Example :

"int -> int"
def f(x):
    return 2*x

In this case the Spyke compiler will treat the string "int -> int" as a declaration that the function accepts an int parameter and returns an int. Spyke will then generate a C function and a wrapper function. This idea is copied directly from the PLW (Python Language Wrapper) project. Once Python3k arrives, many of these declarations will be moved to function annotations and class decorators.
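To make the mechanism concrete, here is a minimal sketch of how a tool could pick up such string declarations while the interpreter ignores them. It is not Spyke's implementation; the helper name find_string_annotations is hypothetical, and the sketch assumes a modern Python with the standard ast module.

```python
import ast

SOURCE = '''
"int -> int"
def f(x):
    return 2*x
'''

def find_string_annotations(source):
    """Map each top-level function name to the bare string literal
    that appears immediately before its def (hypothetical helper)."""
    body = ast.parse(source).body
    found = {}
    for prev, node in zip(body, body[1:]):
        is_string = (
            isinstance(prev, ast.Expr)
            and isinstance(prev.value, ast.Constant)
            and isinstance(prev.value.value, str)
        )
        if is_string and isinstance(node, ast.FunctionDef):
            found[node.name] = prev.value.value
    return found

# A compiler-like tool sees the declaration...
annotations = find_string_annotations(SOURCE)

# ...while the standard interpreter evaluates the string and throws it away.
namespace = {}
exec(SOURCE, namespace)
doubled = namespace["f"](21)
```

The same source therefore runs unchanged under the standard interpreter, which is the property the string-based approach relies on.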

This way you can do all your development and debugging interactively using the standard Python interpreter. When you need to compile to C, you just add type annotations in the places that you want to convert and invoke Spyke on the annotated module. This is different from Pyrex, because Pyrex does not accept plain Python code. With Spyke, your code remains 100% pure Python.

Spyke has basic support for functions and classes. Spyke can do very basic type inference for local variables in function bodies. Spyke also has partial support for homogeneous lists and dictionaries and for fixed-length tuples. One big advantage of Spyke is that it understands at least part of NumPy. NumPy arrays are treated as fundamental types, and Spyke knows what C code to generate for slicing and indexing of NumPy arrays. This should help a lot in scientific applications.

Note that Spyke can handle only a subset of Python. Exceptions, iterators, generators and runtime code generation of any kind are not handled. Nested functions will be added soon. I will definitely add some of these missing features based on what is actually required for real-world Python code. Currently, if Spyke does not understand a function, it just leaves it as Python code. Classes can be handled, but special methods are not currently supported. The support for classes is a little brittle because I am trying to resolve some issues between old-style and new-style classes.

Where is Spyke? Spyke will be available as a binary-only release in a couple of weeks. I intend to make it open source after a few months. Spyke is written in Python and Java and should be platform independent. Right now it is undergoing very rapid development and has a negligible amount of documentation, so the source code is pretty useless to anyone else at the moment anyway.

I need help: I am having a couple of problems :

a) I am finding it hard to get pure Python+NumPy test codes. I need more code to test the compiler. Developing a compiler without a test suite is kind of useless. If you have some pure Python code which needs better performance, please contact me. I guarantee that your code will not be released to the public without your permission, but it might be referenced in academic publications. I can also make the compiler available to you, hopefully after the 10th of April. It's kind of unstable currently. I will also need your help in annotating the provided test codes, since I probably won't know what your application is doing.

b) Libraries which interface with C/C++ : Many projects, SciPy for instance, are written in mixed languages, with part of the code in C/C++. Spyke only knows how to handle annotated Python code. For C/C++ libraries wrapped into Python modules, Spyke will therefore need to know at least 2 things : i) the mapping of a C function name, struct etc. to Python, and ii) the type information of the said C function.

There are many, many ways that people interact with C code. Some people write wrappers manually, some use autogenerated wrappers from SWIG, SIP, Boost.Python etc., some use Pyrex or Cython, and some use ctypes. I don't have the time or resources to support this multitude of methods. I considered trying to parse the C code implementing the wrappers, but it's "non-trivial", to put it mildly. Parsing only SWIG-generated code is another possibility, but it's still hard. Another approach that I am seriously considering is to support a subset of ctypes (with additional restrictions) instead. But my question is : Is ctypes good enough for most of you? ctypes cannot interface with C++ code, but it's pure Python. However, I have not seen too many instances of people using ctypes.
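For readers unfamiliar with ctypes, the sketch below shows how it already records the two pieces of information listed in b): the mapping from a C function name to a Python object, and the function's type information. It assumes a POSIX system, where ctypes.CDLL(None) exposes the symbols already loaded into the process (including the C library's abs); it says nothing about which ctypes subset Spyke would actually support.

```python
import ctypes

# Assumption: POSIX only. CDLL(None) gives access to symbols already
# loaded into the running process, which includes the C library.
libc = ctypes.CDLL(None)

# i) Mapping of a C function name to a Python object.
c_abs = libc.abs

# ii) Type information of the C function: int abs(int).
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

result = c_abs(-5)
```

Because argtypes and restype are ordinary Python attributes, a compiler could in principle read them without parsing any C code.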

c) Strings as type declarations : Do you think I should use decorators instead at least for function type declarations?
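As an illustration of the alternative being asked about, here is a minimal sketch of a decorator that carries the same declaration, alongside the Python 3 function-annotation form. The decorator name typed and the attribute name __spyke_signature__ are hypothetical and are not part of Spyke.

```python
def typed(signature):
    """Hypothetical decorator: attach a type signature string to a function
    without changing its behaviour under the standard interpreter."""
    def wrap(func):
        func.__spyke_signature__ = signature
        return func
    return wrap

@typed("int -> int")
def f(x):
    return 2 * x

# The Python 3 function-annotation form of the same declaration.
def g(x: int) -> int:
    return 2 * x
```

Unlike a bare string, both forms stay attached to the function object at runtime (f.__spyke_signature__ and g.__annotations__), so tools can inspect them without re-parsing the source.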

Thanks for patiently reading this. Comments and inquiries are welcome.

rahul


