[Python-Dev] use co_flags to identify instruction set
Skip Montanaro skip@pobox.com (Skip Montanaro)
Tue, 7 Aug 2001 23:36:06 -0500
All this talk of common backends for Python, Perl, and Ruby goaded me into revisiting the Rattlesnake stuff I laid aside several years ago. Assuming I ever get anywhere with it, it would be nice if code objects could distinguish instruction sets based on a flag in the PyCodeObject struct. The co_flags field seems to have some room and be more-or-less the right place for this stuff.
I propose that it be changed from signed to unsigned int and that three bits be reserved to identify an instruction set. Eight possible instruction sets might seem a bit much, but I'd rather have a little room for growth. If we count the current instruction set, the Rattlesnake (register) stuff I've been playing with, and Armin Rigo's Psyco VM as distinct instruction sets, we've already used three of the possible eight.
I'm still fiddling with a 1.5.2 code base and am currently only using one bit in co_flags to distinguish the instruction set, but I do use it to indirect through a two-element array of function pointers and call the appropriate variant of eval_code2 (now eval_frame).
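To make the flag-plus-indirection idea concrete, here is a minimal Python sketch of it. Everything in it is an assumption for illustration only: the bit position (the top three bits of a 32-bit value), the ISET_* names, and the stand-in evaluator functions are hypothetical, not the layout or code Rattlesnake actually uses.

```python
# Hypothetical layout: three bits of a 32-bit co_flags value hold an
# instruction-set id. The position is invented for this sketch.
ISET_SHIFT = 29
ISET_MASK = 0x7 << ISET_SHIFT

# Hypothetical ids for the instruction sets mentioned in the message.
ISET_STACK, ISET_REGISTER, ISET_PSYCO = 0, 1, 2


def get_iset(co_flags):
    """Extract the instruction-set id from a flags word."""
    return (co_flags & ISET_MASK) >> ISET_SHIFT


def set_iset(co_flags, iset):
    """Return co_flags with the id replaced, leaving other bits alone."""
    if not 0 <= iset <= 7:
        raise ValueError("instruction-set id must fit in three bits")
    return (co_flags & ~ISET_MASK) | (iset << ISET_SHIFT)


# Stand-ins for the interpreter loops; the real ones are C functions.
def eval_stack(code):
    return "stack evaluator"


def eval_register(code):
    return "register evaluator"


# Table of evaluators indexed by instruction-set id, playing the role
# of the array of function pointers.
EVALUATORS = {ISET_STACK: eval_stack, ISET_REGISTER: eval_register}


def eval_code(co_flags, code=None):
    """Pick the evaluator that matches the code object's instruction set."""
    return EVALUATORS[get_iset(co_flags)](code)
```

With this layout, ids 4 through 7 set bit 31, which would be the sign bit of a signed 32-bit int, so an unsigned field keeps the masking straightforward.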
Just to whet people's appetites a bit...
I'm struggling to get conditional opcodes working at the moment, but have had pretty good success eliding unnecessary loads and stores in straight blocks of code. Given this trivial function:
    def f(a):
        b = a + 4
        c = a + b
        return c
The Rattlesnake optimizer can convert it from
    >>    0 LOAD_FAST       0 (a)
          3 LOAD_CONST      1 (4)
          6 BINARY_ADD
          7 STORE_FAST      1 (b)
         10 LOAD_FAST       0 (a)
         13 LOAD_FAST       1 (b)
         16 BINARY_ADD
         17 STORE_FAST      2 (c)
         20 LOAD_FAST       2 (c)
         23 RETURN_VALUE
to
    >>    0 (0336) LOAD_CONST_REG    %r4, 4
          3 (0077) BINARY_ADD_REG    b, a, %r4
          7 (0077) BINARY_ADD_REG    c, a, b
         11 (0075) RETURN_VALUE_REG  c
Needless to say, I expect it to run a bit faster than the original code.
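For readers who want to see how loads and stores can fall out of a straight block, here is a rough Python sketch of one possible approach: walk the stack code while tracking, symbolically, which register holds each stack slot. It is not the Rattlesnake optimizer. The tuple encoding, the %t temporary names, and the MOVE_REG opcode are made up for this example, and the shortcut of pushing a local's name without copying it is only safe in straight-line code where that local is not reassigned before the value is used.

```python
def to_register_code(stack_code):
    """Convert a straight-line block of stack opcodes to register form.

    stack_code is a list of tuples such as ("LOAD_FAST", "a").
    Only the handful of opcodes used by f() above are handled.
    """
    stack = []   # symbolic stack: the register name holding each slot
    out = []     # emitted register instructions, as mutable lists
    ntemps = 0   # number of temporaries allocated so far

    for instr in stack_code:
        op = instr[0]
        if op == "LOAD_FAST":
            # No code emitted: the value already lives in a register.
            stack.append(instr[1])
        elif op == "LOAD_CONST":
            dest = "%t" + str(ntemps)
            ntemps += 1
            out.append(["LOAD_CONST_REG", dest, instr[1]])
            stack.append(dest)
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            dest = "%t" + str(ntemps)
            ntemps += 1
            out.append(["BINARY_ADD_REG", dest, left, right])
            stack.append(dest)
        elif op == "STORE_FAST":
            src = stack.pop()
            last = out[-1] if out else None
            if (last and last[0] != "RETURN_VALUE_REG"
                    and last[1] == src and src.startswith("%t")):
                # The previous instruction produced this temporary:
                # write straight into the local instead.
                last[1] = instr[1]
            else:
                # MOVE_REG is a hypothetical fallback opcode.
                out.append(["MOVE_REG", instr[1], src])
        elif op == "RETURN_VALUE":
            out.append(["RETURN_VALUE_REG", stack.pop()])
        else:
            raise ValueError("unsupported opcode: " + op)
    return out


F_STACK_CODE = [
    ("LOAD_FAST", "a"), ("LOAD_CONST", 4), ("BINARY_ADD",),
    ("STORE_FAST", "b"), ("LOAD_FAST", "a"), ("LOAD_FAST", "b"),
    ("BINARY_ADD",), ("STORE_FAST", "c"), ("LOAD_FAST", "c"),
    ("RETURN_VALUE",),
]

for instr in to_register_code(F_STACK_CODE):
    print(instr[0], ", ".join(str(arg) for arg in instr[1:]))
```

Fed the ten stack instructions for f, the sketch emits four register instructions with the same shape as the listing above, apart from the name of the temporary.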
Rattlesnake takes advantage of a property of frame objects that Tim pointed out to me a long time ago, namely that the frame's locals and its temporary stack space are contiguous and can just be treated as a single register file. In fact, the code above can just as easily be written as
    >>    0 (0336) LOAD_CONST_REG    %r3, 4
          3 (0077) BINARY_ADD_REG    %r1, %r0, %r3
          7 (0077) BINARY_ADD_REG    %r2, %r0, %r1
         11 (0075) RETURN_VALUE_REG  %r2
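The single-register-file view can be modelled in a few lines of Python. This is only a toy, assuming one flat list in which the locals occupy the low indices and the former stack slots follow them. The function name, the tuple encoding of instructions, and the nlocals/stacksize parameters are invented for the sketch.

```python
def run_register_code(code, args, nlocals, stacksize):
    """Interpret the numbered-register listing above over one flat list.

    regs[0:nlocals] plays the role of the frame's locals and the rest
    plays the role of its temporary stack space, so a single index
    addresses either kind of slot.
    """
    regs = list(args) + [None] * (nlocals - len(args) + stacksize)
    for instr in code:
        op = instr[0]
        if op == "LOAD_CONST_REG":
            _, dest, value = instr
            regs[dest] = value
        elif op == "BINARY_ADD_REG":
            _, dest, left, right = instr
            regs[dest] = regs[left] + regs[right]
        elif op == "RETURN_VALUE_REG":
            return regs[instr[1]]
        else:
            raise ValueError("unsupported opcode: " + op)
    return None


# f(a) with a in %r0, b in %r1, c in %r2 and one temporary in %r3.
F_REG_CODE = [
    ("LOAD_CONST_REG", 3, 4),
    ("BINARY_ADD_REG", 1, 0, 3),
    ("BINARY_ADD_REG", 2, 0, 1),
    ("RETURN_VALUE_REG", 2),
]

print(run_register_code(F_REG_CODE, [5], nlocals=3, stacksize=2))  # prints 14
```

The result for an argument of 5 is 14, the same as calling f(5) directly.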
Skip