Issue 9866: Inconsistencies in tracing list comprehensions

Attached test script, tracetest.py, prints disassembly followed by a trace of the following function:

1  def f():
2      return [i
3              for i
4              in range(2)]
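The attachment itself is not reproduced in this report. The following is only a minimal sketch of what such a script could look like, assuming it relies on `sys.settrace` and `dis`; the helper name `collect_line_events` is hypothetical. Line numbers in its output are relative to the sketch's own file, and on interpreters newer than 3.2 the bytecode and the recorded events will differ from the ones quoted here.

```python
import dis
import sys


def f():
    return [i
            for i
            in range(2)]


def collect_line_events(func):
    """Run func under sys.settrace and record every 'line' event.

    Each entry has the form "<line> <bytecode offset> <opcode name>".
    (Hypothetical helper, not the code from the attachment.)
    """
    events = []

    def tracer(frame, event, arg):
        if event == "line":
            offset = frame.f_lasti
            if offset >= 0:
                opname = dis.opname[frame.f_code.co_code[offset]]
            else:
                opname = "?"  # no instruction has been executed yet
            events.append("%d %d %s" % (frame.f_lineno, offset, opname))
        # Returning the tracer keeps local tracing on in nested frames too.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)
    return events


if __name__ == "__main__":
    dis.dis(f)
    print(collect_line_events(f))
```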

With default configuration, the output is

  2           0 LOAD_CONST               1 (<code object at 0x100484e00, file "tracetest.py", line 2>)
              3 MAKE_FUNCTION            0

  4           6 LOAD_GLOBAL              0 (range)
              9 LOAD_CONST               2 (2)
             12 CALL_FUNCTION            1
             15 GET_ITER
             16 CALL_FUNCTION            1
             19 RETURN_VALUE
listcomp:
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)

  3           9 STORE_FAST               1 (i)
             12 LOAD_FAST                1 (i)
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE

['2 0 LOAD_CONST', '4 6 LOAD_GLOBAL', '2 0 BUILD_LIST', '3 9 STORE_FAST', '2 6 FOR_ITER', '3 9 STORE_FAST', '2 6 FOR_ITER']

but when Python is configured with the --without-computed-gotos option, the disassembly is the same while the trace is different:

['2 0 LOAD_CONST', '4 6 LOAD_GLOBAL', '2 0 BUILD_LIST', '2 6 FOR_ITER', '2 6 FOR_ITER']

This behavior changed between 3.1 and 3.2 (likely in r74132), but it is inconsistent in both versions. Since the r74132 changes were not backported to 3.1, I am classifying this as 3.2 only, even though the problem is present in 3.1 as well.

See also issues #6042 and #9315.

On Wed, Sep 15, 2010 at 5:33 PM, Antoine Pitrou <report@bugs.python.org> wrote: ..

> As I said in #9315, I think this kind of thing (bytecode traces) is an implementation detail; the changes in results shouldn't be regarded as breaking compatibility.

In r74132, an attempt was made to rationalize and document line tracing. While it is an implementation detail, I don't think people expect it to be platform dependent within the CPython implementation. "With computed gotos" was supposed to be a pure optimization. I find it very surprising that it changes tracing behavior.

> The only problem I could see would be if a whole line of code would be "forgotten".

This is exactly what my example demonstrates.
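To make the "forgotten" line concrete, here is a small sketch that compares the two traces quoted in this report and lists the source lines that appear in one but not the other. The helper name `lines_seen` is hypothetical; the trace entries are copied verbatim from above.

```python
# Trace with the default configuration (computed gotos enabled).
default_trace = [
    '2 0 LOAD_CONST', '4 6 LOAD_GLOBAL', '2 0 BUILD_LIST',
    '3 9 STORE_FAST', '2 6 FOR_ITER', '3 9 STORE_FAST', '2 6 FOR_ITER',
]
# Trace from a --without-computed-gotos build.
no_computed_gotos_trace = [
    '2 0 LOAD_CONST', '4 6 LOAD_GLOBAL', '2 0 BUILD_LIST',
    '2 6 FOR_ITER', '2 6 FOR_ITER',
]


def lines_seen(trace):
    """Return the set of source line numbers that occur in a trace."""
    return {int(entry.split()[0]) for entry in trace}


missing = lines_seen(default_trace) - lines_seen(no_computed_gotos_trace)
print(sorted(missing))  # [3]
```

Line 3 (`for i`) is never reported in the second trace, even though its bytecode runs on every iteration.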

I have found the root cause of these differences. The trace function is not called when the opcode is successfully predicted. When computed gotos are enabled, opcode prediction is disabled as explained in the following comment in ceval.c:

Opcode prediction is disabled with threaded code, since the latter allows
the CPU to record separate branch prediction information for each
opcode.
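The effect of prediction on tracing can be illustrated with a toy model of the dispatch loop. This is a sketch under stated assumptions, not the code in ceval.c: it assumes that a line event fires when an instruction is the first one of its source line or is reached by a backward jump, and that FOR_ITER predicts STORE_FAST while LIST_APPEND predicts JUMP_ABSOLUTE. All names (`LISTCOMP`, `PREDICTED`, `simulate`) are hypothetical; the instruction table is copied from the listcomp disassembly above.

```python
# (offset, source line, opcode name) for the listcomp code object.
LISTCOMP = [
    (0, 2, "BUILD_LIST"),
    (3, 2, "LOAD_FAST"),
    (6, 2, "FOR_ITER"),
    (9, 3, "STORE_FAST"),
    (12, 3, "LOAD_FAST"),
    (15, 3, "LIST_APPEND"),
    (18, 3, "JUMP_ABSOLUTE"),
    (21, 3, "RETURN_VALUE"),
]
LINE_STARTS = {0, 9}  # first bytecode offset of each source line
# Assumed prediction pairs: after the first opcode, a correctly predicted
# second opcode is entered directly, bypassing the trace check.
PREDICTED = {("FOR_ITER", "STORE_FAST"), ("LIST_APPEND", "JUMP_ABSOLUTE")}


def simulate(n_items, prediction):
    """Return the line events the toy loop reports for n_items iterations."""
    by_offset = {off: (line, op) for off, line, op in LISTCOMP}
    events = []
    prev_checked = -1  # offset of the last instruction that was trace-checked
    prev_op = None
    remaining = n_items
    offset = 0
    while True:
        line, op = by_offset[offset]
        skipped = prediction and (prev_op, op) in PREDICTED
        if not skipped:
            # Simplified line-trace rule: new line start or backward jump.
            if offset in LINE_STARTS or offset < prev_checked:
                events.append("%d %d %s" % (line, offset, op))
            prev_checked = offset
        # "Execute" the instruction: only control flow matters here.
        if op == "FOR_ITER":
            if remaining:
                remaining -= 1
                next_offset = 9
            else:
                next_offset = 21  # iterator exhausted
        elif op == "JUMP_ABSOLUTE":
            next_offset = 6
        elif op == "RETURN_VALUE":
            return events
        else:
            next_offset = offset + 3
        prev_op = op
        offset = next_offset


print(simulate(2, prediction=False))
print(simulate(2, prediction=True))
```

Under these assumptions, the run without prediction yields the listcomp part of the default trace, and the run with prediction yields the listcomp part of the --without-computed-gotos trace, where the line 3 events are absent.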

Note that this issue is similar to #884022 which was resolved by disabling opcode prediction in dynamic profile builds.

Given that opcode prediction is off by default, I don't see much reason to try to improve tracing of predicted opcodes.