[Python-Dev] RE: [Python-checkins] python/dist/src/Python ceval.c, 2.383, 2.384

Raymond Hettinger python at rcn.com
Sat Mar 20 16:14:58 EST 2004


Modified Files:
	ceval.c
Log Message:
A 2% speed improvement with gcc on little-endian machines.  My guess is
that this new pattern for NEXTARG() is detected and optimized as a
single (*short) loading.

It is possible to verify that guess by looking at the generated assembler.

There are other possible reasons. One is that the negative array offsets don't compile well into a native base+offset*wordsize addressing mode; I have seen, and proven, that to be the case in other parts of the code base. The other possible reason for the speedup is that pre-incrementing the pointer prevented the two byte loads from being done in parallel (i.e. it created a sequential dependency on the updated pointer).
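One way to separate those explanations is to compile a small standalone model of both patterns and inspect the generated assembler. This is only a sketch -- the function names are invented and the bodies merely mimic the old and new styles of reading the argument bytes; they are not the ceval.c source:

    /* oldarg(): mimics the old pattern -- bump the pointer first, then
       read the two argument bytes back through negative offsets, so both
       loads wait on the pointer update. */
    unsigned int
    oldarg(unsigned char **pp)
    {
        unsigned char *p = *pp;
        unsigned int arg;
        p += 2;
        arg = (p[-1] << 8) + p[-2];
        *pp = p;
        return arg;
    }

    /* newarg(): mimics the new pattern -- read through plain offsets and
       bump the pointer afterwards, so the loads are independent of the
       pointer update. */
    unsigned int
    newarg(unsigned char **pp)
    {
        unsigned char *p = *pp;
        unsigned int arg = p[0] + (p[1] << 8);
        *pp = p + 2;
        return arg;
    }

Compiling both with something like "gcc -O2 -S" and comparing the output shows whether the two byte loads in newarg() really do get merged into a single 16-bit load on a little-endian target, and whether oldarg() pays for the negative offsets and the pointer dependency.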

If the latter reason is the true cause, then part of the checkin is counter-productive: the change to PREDICTED_WITH_ARG introduces a pre-increment in addition to the post-increment. Please run another timing with and without the change to PREDICTED_WITH_ARG; I suspect the old way ran faster. Also, the old way will always be faster on big-endian machines and on machines with less sophisticated compilers (the new pattern could even be slower with MSVC++ if it doesn't automatically generate a short load). Another consideration is that loading a short may perform very differently on other architectures, because the two-byte argument is evenly aligned only half of the time.
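On the alignment point: a bytecode argument always follows a one-byte opcode, so which parity its address has depends entirely on what precedes it. A throwaway sketch (the opcode values are only examples of HAS_ARG instructions) that shows the parity alternating:

    #include <stdio.h>

    int
    main(void)
    {
        /* Three instructions of 1 opcode byte + 2 argument bytes each,
           e.g. LOAD_CONST 0; STORE_FAST 1; LOAD_CONST 2. */
        unsigned char code[] = { 100, 0, 0, 125, 1, 0, 100, 2, 0 };
        unsigned int i;

        for (i = 0; i < sizeof(code); i += 3)
            printf("opcode %d at offset %u: argument bytes at offset %u (%s address)\n",
                   code[i], i, i + 1, ((i + 1) % 2 == 0) ? "even" : "odd");
        return 0;
    }

And on a big-endian machine a native 16-bit load of those two bytes would put the first byte in the high half, which is the wrong order for the bytecode's low-byte-first argument encoding, so the compiler has to keep the two byte loads (or add a swap) no matter how clever it is.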

Summary: +1 on the changes to NEXT_ARG and EXTENDED_ARG; -1 on the change to PREDICTED_WITH_ARG.

Raymond Hettinger

  #define PREDICTED(op)           PRED_##op: next_instr++
! #define PREDICTED_WITH_ARG(op)  PRED_##op: oparg = (next_instr[2]<<8) + \
!                                 next_instr[1]; next_instr += 3

  /* Stack manipulation macros */
--- 660,664 ----
  #define PREDICTED(op)           PRED_##op: next_instr++
! #define PREDICTED_WITH_ARG(op)  PRED_##op: next_instr++; oparg = OPARG(); next_instr += OPARG_SIZE
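To make the excerpt concrete: assuming OPARG() reads the two argument bytes at next_instr[0] and next_instr[1] and OPARG_SIZE is 2 (neither definition is shown above, so those are guesses), the two versions behave roughly like the standalone functions below. The point is simply where the extra increment shows up:

    /* Simplified model of the old PREDICTED_WITH_ARG body: one read of the
       argument bytes relative to the opcode, one pointer update of +3
       covering the opcode byte and both argument bytes. */
    int
    old_predicted_with_arg(unsigned char **pp)
    {
        unsigned char *next_instr = *pp;
        int oparg = (next_instr[2] << 8) + next_instr[1];
        next_instr += 3;
        *pp = next_instr;
        return oparg;
    }

    /* Simplified model of the new body: the opcode byte is skipped first,
       so the argument read now depends on the freshly updated pointer and
       the pointer is updated twice.  The OPARG()/OPARG_SIZE bodies here are
       assumptions, not quotes from the checkin. */
    int
    new_predicted_with_arg(unsigned char **pp)
    {
        unsigned char *next_instr = *pp;
        int oparg;
        next_instr++;
        oparg = next_instr[0] + (next_instr[1] << 8);  /* assumed OPARG() */
        next_instr += 2;                               /* assumed OPARG_SIZE */
        *pp = next_instr;
        return oparg;
    }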


