Message 78660 - Python tracker

Hello,

You may want to check out in which a similar patch was provided, but failed to deliver the desired results.

I didn't get the advertised ~15% speed-up, but only 4% on my Intel Core2 laptop and 8% on my AMD Athlon64 X2 desktop. I attached the benchmark results.

Thanks. The machine I got the 15% speedup on is in 64-bit mode with gcc 4.3.2.

If you want to investigate, you can output the assembly code for ceval.c; the command line should be something like:

gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -DPy_BUILD_CORE -S -dA Python/ceval.c

and then count the number of indirect jump instructions in the generated ceval.s:

grep -E "jmp[[:space:]]\*%" ceval.s

There should be roughly 85 to 90 of them. If there are far fewer, then the compiler has tried to optimize them by "sharing" them.
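
To illustrate why the count matters, here is a minimal sketch of computed-goto dispatch in a toy interpreter. It is not the code from ceval.c: the opcodes and the `run_toy` function are made up for this example, and it assumes a compiler that supports GCC's labels-as-values extension. Each handler ends with its own `goto *`, so each one can become a separate indirect jump in the assembly output.

```c
#include <stddef.h>

/* Hypothetical toy opcodes, not CPython's. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Run a toy program on an accumulator that starts at 0.
   Relies on GCC's labels-as-values extension (&&label, goto *ptr). */
static int run_toy(const unsigned char *code)
{
    /* Table of handler addresses, indexed by opcode. */
    static void *const targets[] = {
        &&do_inc, &&do_dec, &&do_double, &&do_halt
    };
    int acc = 0;

/* Every use of DISPATCH() is its own indirect jump in the source. */
#define DISPATCH() goto *targets[*code++]

    DISPATCH();

do_inc:
    acc += 1;
    DISPATCH();
do_dec:
    acc -= 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;

#undef DISPATCH
}
```

If the compiler merges those jumps into a single shared one, the grep count drops and every handler funnels through the same jump again, much as it would with a plain switch.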

First, you should rename opcode_targets.c to opcode_targets.h. This will make it explicit that the file is not compiled, but just included.

Ok.

Also, the macro USE_THREADED_CODE should be renamed to something else; the word "thread" is too tightly associated with multi-threading. Furthermore, threaded code simply refers to code consisting only of function calls. Maybe USE_COMPUTED_GOTO or USE_DIRECT_DISPATCH would be better.

Ok.

Finally, why do you disable your patch when DYNAMIC_EXECUTION_PROFILE or LLTRACE is enabled? I tested your patch with both enabled and I didn't see any test failures.

Because otherwise the measurements these options are meant to make would be meaningless.
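
One way to express that rule is a preprocessor guard that falls back to the switch-based loop whenever either option is defined. The sketch below only shows the idea and is not taken from the patch: the macro name USE_COMPUTED_GOTO is the one proposed above, the `__GNUC__` default is an assumption, and `uses_computed_goto` is a hypothetical helper added so the result can be checked.

```c
/* Assumed default: enable computed gotos on GCC-compatible compilers. */
#if defined(__GNUC__) && !defined(USE_COMPUTED_GOTO)
#define USE_COMPUTED_GOTO 1
#endif

/* Measurements from these options would be meaningless with direct
   dispatch, so force the switch-based loop when either is defined. */
#if defined(DYNAMIC_EXECUTION_PROFILE) || defined(LLTRACE)
#undef USE_COMPUTED_GOTO
#endif

/* Hypothetical helper: reports which dispatch loop was selected. */
static int uses_computed_goto(void)
{
#ifdef USE_COMPUTED_GOTO
    return 1;
#else
    return 0;
#endif
}
```

Building with -DLLTRACE or -DDYNAMIC_EXECUTION_PROFILE would make the helper return 0, regardless of the compiler.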

By the way, SUNCC also supports GCC's syntax for labels as values

I don't have a Sun machine to test on, so I'll leave it to someone else to check and enable it if they want to.