Issue 9155: Reserve COMPARE_OP for rich comparisons (original) (raw)

Created on 2010-07-04 19:53 by serprex, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
cmpoppatch.diff serprex,2010-07-04 19:53
cmpoppatch2.diff serprex,2010-07-04 20:26 Without whitespace amendments
cmpoprotdup.diff serprex,2010-07-05 01:02
cmpoprotdup2.diff serprex,2010-07-05 01:26
cmpoprotdupalluny.diff serprex,2010-07-05 14:07
cmpoprotdupalluny2.diff serprex,2010-07-05 22:47
cmpoprotdupalluny3.diff serprex,2010-07-08 18:10
pybench.txt pitrou,2010-07-08 18:31
Messages (13)
msg109259 - (view) Author: Demur Rumed (serprex) Date: 2010-07-04 19:53
Currently, COMPARE_OP is burdened by a needless, and unorthogonal, extra layer of indirection. I've modified it to only handle the rich comparison case, moving the other five cases to BINARY_IN, BINARY_NOT_IN, BINARY_IS, BINARY_IS_NOT, and BINARY_EXC_MATCH To consider is inlining the POP_JUMP_IF_FALSE POP_TOP which always follow BINARY_EXC_MATCH
msg109261 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-04 20:09
Any chance you could regenerate the patch without all the unnecessary whitespace changes? It's kinda hard to read right now.
msg109279 - (view) Author: Demur Rumed (serprex) Date: 2010-07-05 01:02
I've attached the original patch without whitespace, and am also including modifications to this portion of the interpreter which remove ROT_FOUR, DUP_TOPX while adding ROT_THREE_TWO, DUP_TOP_TWO, DUP_ROT_THREE. I've seen a 5% speed increase with timeit.Timer("a[0]+=0","a=[0]") Also modified the optimizer to reject code with a size over 32767 instead of 32760. That's most likely a useless modification which may be preferred ignored
msg109282 - (view) Author: Demur Rumed (serprex) Date: 2010-07-05 01:26
Fixed regression in ROT_THREE_TWO and a missed DUP_TOP2 -> DUP_TOP_TWO. Now there are only the odd segfaults
msg109319 - (view) Author: Demur Rumed (serprex) Date: 2010-07-05 14:07
I've fixed the segfaulting in IN and NOT_IN, though the code requires cleaning due to the obtusely cargo culted nature of those opcodes I'm also including an incrementation which completely removes COMPARE_OP, using the opcode to determine the comparison type. As a result, I found it safe to remove the PyCmp_ enum, though cpickle seems to disagree. It's also lacking performance wise
msg109360 - (view) Author: Demur Rumed (serprex) Date: 2010-07-05 22:47
It seems the lack of benefits from replacing COMPARE_OP with a set of aliased bytecodes was repairable through TARGET_WITH_IMPL. Also fixed was the lack of amendments to dis
msg109514 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-07-08 05:56
Be sure to consider the impact on the peephole optimizer. Also consider that adding volume inside the ceval.c loop may have unintended slowing effects on various compilers -- supposedly, this has been a serious issue in that past and there has been concern about growing the body of the loop. I presume that the thrust of the efforts is directed at eliminating the an unpredictable switch branch inside the COMPARE_OP opcode. If so, there's not much fat to be shed here. The cost of the opcode switch is small relative the time it takes to dispatch to an object's rich comparison methods.
msg109538 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-07-08 13:14
I would like any such change to show what it improves through benchmarks.
msg109553 - (view) Author: Demur Rumed (serprex) Date: 2010-07-08 16:25
The seperation of COMPARE_OP into multiple opcodes shouldn't affect cache size since the opcodes are aliased. Spreading out the switch statement shouldn't cause issue from flattening, since GCC would inline the single use of cmp_outcome. Thus the only bloat is the wrapper overhead, which I believe the C code is relatively large with respect to the macros. As for DUP_TOP_TWO, that's smaller than DUP_TOPX It may be suitable to benchmark on other systems besides my own due to the subversive nature of caches; I've seen improvement in any code that uses comparisons. To consider, however, is that ceval.o dropped from 268592B to 267448B. This is reflected in that the interpreter's executable dropped from 6721842B to 6719866B Raymond, what do you mean with respect to the peepholer? I believe my changes there have shown it to be a simplification of the code
msg109557 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-08 17:33
The patch fails compiling with --with-computed-gotos: Python/ceval.c: In function ‘PyEval_EvalFrameEx’: Python/opcode_targets.h:101: erreur: label ‘TARGET_DUP_TOPX’ used but not defined make: *** [Python/ceval.o] Erreur 1
msg109561 - (view) Author: Demur Rumed (serprex) Date: 2010-07-08 18:10
Replaced TARGET_DUP_TOPX with _unknown_opcode I recompiled with --with-computed-gotos, and the results remain promising, though the interpreter instead grows from 6756023B to 6765167B
msg109564 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-08 18:31
Here, your patch actually show a slowdown on pybench with computed gotos. Results attached.
msg115345 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-09-02 00:23
I'm marking this one as rejected. The timings have shown mixed results (some favorable, some not). In general, there is a bias against expanding the number of opcodes because 1) there aren't that many codes available, 2) it grows the size of the switch-case (which has been problematic from some compilers on various systems), and because it complicates other optimization efforts such as the peephole optimizer or in other tools that depend on bytecode (such as the Unladen Swallow LLVM project, Psyco, or ByteCodeHacks). Also, the potential improvement is very small because it removes only one indirection out of long chain of events involved in a comparison. Most of the time is consumed in abstract dispatch to a concrete comparison method and it that method itself.
History
Date User Action Args
2022-04-11 14:57:03 admin set github: 53401
2010-09-02 00:23:45 rhettinger set status: open -> closedresolution: rejectedmessages: +
2010-09-01 21:37:13 stutzbach set nosy: + stutzbach
2010-07-08 18:31:58 pitrou set files: + pybench.txtmessages: +
2010-07-08 18:10:14 serprex set files: + cmpoprotdupalluny3.diffmessages: +
2010-07-08 17:33:05 pitrou set messages: +
2010-07-08 16:25:40 serprex set messages: +
2010-07-08 13:24:32 pitrou set nosy: + pitrou, jyasskin
2010-07-08 13:14:51 benjamin.peterson set nosy: + benjamin.petersonmessages: +
2010-07-08 05:56:16 rhettinger set messages: +
2010-07-05 22:47:45 serprex set files: + cmpoprotdupalluny2.diffmessages: + versions: - Python 3.3
2010-07-05 14:07:39 serprex set files: + cmpoprotdupalluny.diffmessages: +
2010-07-05 11:50:09 eric.araujo set messages: -
2010-07-05 02:20:12 rhettinger set assignee: rhettingernosy: + rhettinger
2010-07-05 01:26:57 serprex set files: + cmpoprotdup2.diffmessages: +
2010-07-05 01:03:37 serprex set files: - cmpoprotdup.diff
2010-07-05 01:03:09 serprex set files: + cmpoprotdup.diffmessages: +
2010-07-05 01:03:00 serprex set files: + cmpoprotdup.diffmessages: +
2010-07-04 20:26:06 serprex set files: + cmpoppatch2.diff
2010-07-04 20:13:38 belopolsky set nosy: + belopolsky
2010-07-04 20:09:55 mark.dickinson set nosy: + mark.dickinsonmessages: +
2010-07-04 19:54:18 serprex set title: Reserve COMPARE_OP for RichComparisons -> Reserve COMPARE_OP for rich comparisons
2010-07-04 19:53:59 serprex create