Message 78628 - Python tracker
I haven't read any papers. Having a jump table in itself isn't special (the compiler does exactly that when compiling the switch() statement). What's special is that a dedicated indirect jump instruction at the end of each opcode lets the CPU make a separate prediction, per opcode, of which opcode comes next. That is not possible with a switch statement, where the jump instruction is shared by all opcodes. I believe that's where most of the speedup comes from.
If you read the patch it will probably be easy to understand.
You are right. It's easier to understand now that I've learned how the opcode_targets table works. Previously I didn't know that one can store the address of a label in an array, so I wondered where the pointers were defined. Is this a special GCC feature? I haven't seen it before.
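For reference, here is a minimal sketch of the technique as I understand it. It is not the code from the patch: the opcodes, the `run` function and the `targets` table are made up for illustration, with `targets` playing the role of opcode_targets. As far as I can tell, `&&label` and `goto *ptr` are the GCC "labels as values" extension (Clang accepts them too) and are not part of standard C.

```c
/* Toy interpreter using computed gotos.
   Needs GCC or Clang: &&label and goto *ptr are an extension, not ISO C.
   Opcode names and run() are hypothetical, not taken from the patch. */

enum { OP_INCR, OP_DOUBLE, OP_HALT };

int run(const unsigned char *code)
{
    /* Table of label addresses, indexed by opcode number. */
    static void *const targets[] = { &&op_incr, &&op_double, &&op_halt };
    int acc = 0;

    /* Fetch the next opcode and jump straight to its handler.
       The macro is expanded at the end of every handler, so each
       handler gets its own indirect jump instruction. */
#define DISPATCH() goto *targets[*code++]

    DISPATCH();

op_incr:
    acc += 1;
    DISPATCH();

op_double:
    acc *= 2;
    DISPATCH();

op_halt:
    return acc;

#undef DISPATCH
}
```

Calling `run` on the byte sequence OP_INCR, OP_INCR, OP_DOUBLE, OP_INCR, OP_HALT walks the handlers in that order. With a switch() inside a loop, all of those transitions would go through the single jump generated for the switch.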
Don't know! Your experiments are welcome. My patch is far simpler to integrate though (it's small, introduces very few changes and does not break any existing tests).
Yes, your patch is much smaller, less intrusive and easier to understand with a little background in CS.