[Python-Dev] Who cares about the performance of these opcodes?
Phillip J. Eby pje at telecommunity.com
Tue Mar 9 08:59:52 EST 2004
At 07:38 AM 3/9/04 -0600, Jeff Epler wrote:
> Recently it was proposed to make a new LIST_APPEND opcode, and several contributors pointed out that adding opcodes to Python is always a dicey business because it may hurt performance for obscure reasons, possibly related to the size of that 'switch' statement.
>
> To that end, I notice that there are several opcodes which could easily be converted into function calls. In my code, these are not typically performance-critical opcodes (with approximate ceval.c line count):
>
>     BUILD_CLASS        # 9 lines
>     MAKE_FUNCTION      # 20 lines
>     MAKE_CLOSURE       # 35 lines
>     PRINT_EXPR         # 21 lines
>     PRINT_ITEM         # 47 lines
>     PRINT_ITEM_TO      # 2 lines + fallthrough
>     PRINT_NEWLINE      # 12 lines
>     PRINT_NEWLINE_TO   # 2 lines + fallthrough
>
> Instead, each of these would be available in the code object's co_consts when necessary. For example, instead of
>
>     LOAD_CONST 1 (<code object g at 0x40165ea0, file "", line 2>)
>     MAKE_FUNCTION 0
>     STORE_FAST 0 (g)
>
> you'd have
>
>     LOAD_CONST 1 (type 'function')
>     LOAD_CONST 2 ()
>     LOAD_GLOBALS       # new opcode, or call globals()
>     LOAD_CONST 1 ("g")
>     CALL_FUNCTION 3
>
> Performance for these specific operations will certainly benchmark worse, but maybe getting rid of something like 150 lines from ceval.c would help other things by magic. The new LOAD_GLOBALS opcode would be less than 10 lines.
>
> No, I don't have a patch. I assume each and every one of these opcodes has a staunch defender who will now come to its aid, and save me the trouble.
If the goal is to remove lines from the switch statement, just move the code of lesser-used opcodes into a C function. There's no need to eliminate the opcodes themselves.
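As a rough illustration of that idea, here is a minimal sketch of a toy stack machine, not CPython's actual ceval.c. The opcode names (OP_PUSH, OP_ADD, OP_PRINT_TOP, OP_HALT) and the helpers do_print_top() and run() are hypothetical, invented only for this example. The rarely used opcode keeps its slot in the switch, but its body lives in a separate C function, so the switch carries only a call for it. The sketch does no stack bounds checking.

```c
#include <stdio.h>

/* Hypothetical opcode numbers for a toy stack machine (not CPython's). */
enum { OP_PUSH, OP_ADD, OP_PRINT_TOP, OP_HALT };

/* Rarely used opcode: its body lives out of line, so the switch
   below only contains a call for it. */
static void
do_print_top(const int *stack, int sp)
{
    if (sp > 0)
        printf("%d\n", stack[sp - 1]);
    else
        printf("<empty stack>\n");
}

/* Runs `code` and returns the top of the stack at OP_HALT (0 if empty).
   Toy code: assumes a well-formed program and does no bounds checks. */
static int
run(const unsigned char *code)
{
    int stack[16];
    int sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH:           /* common opcode: body stays inline */
            stack[sp++] = *code++;
            break;
        case OP_ADD:            /* common opcode: body stays inline */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_PRINT_TOP:      /* uncommon opcode: one call, body elsewhere */
            do_print_top(stack, sp);
            break;
        case OP_HALT:
        default:
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
}
```

Calling run() on the byte sequence { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT_TOP, OP_HALT } would dispatch OP_PRINT_TOP through the helper while the bytecode itself stays unchanged. Whether this helps real-world cache behaviour is a separate question.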
I personally don't think it'll help much, if the goal is to reduce cache misses. After all, the code is all still there. But, it should not do as badly as the approach you're suggesting, because for your case you'll not only have the C-level calls, but also more bytecodes being interpreted.
Hm. Makes me wonder, actually, if a hand-written eval loop in assembly code might not kick some serious butt. Or maybe a bytecode-to-assembly translator, writing loads in-line and using registers as the stack, calling functions where necessary. Ah, if only I were a teenager again, with little need to sleep, and unlimited time to hack... :)