[Python-Dev] PyCharm debugger became 40x faster on Python 3.6 thanks to PEP 523

Xavier de Gaye xdegaye at gmail.com
Mon Mar 27 15:48:08 EDT 2017


On 03/25/2017 08:57 PM, Terry Reedy wrote:

On 3/25/2017 8:56 AM, Serhiy Storchaka wrote:

On 25.03.17 12:04, Victor Stinner wrote:

https://blog.jetbrains.com/pycharm/2017/03/inside-the-debugger-interview-with-elizaveta-shashkova/

"What changed in Python 3.6 to allow this?

The new frame evaluation API was introduced to CPython in PEP 523 and it allows to specify a per-interpreter function pointer to handle the evaluation of frames."

Nice!

Awesome! Any chance that pdb can utilize a similar technique? Or does this not make sense for pdb?

According to the bdb.Bdb docstring, pdb implements a command-line user interface on top of bdb, while bdb.Bdb "takes care of the details of the trace facility". idlelib.debugger similarly implements a GUI user interface on top of bdb. I am sure that there are other debuggers that build directly or indirectly (via pdb) on bdb. So the question is whether bdb can be enhanced or augmented with a C-coded _bdb or other new module.
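To make the layering concrete, here is a minimal sketch of a bdb client. The class name LineRecorder and the function sample are hypothetical, made up for illustration; only bdb.Bdb, user_line(), set_step() and runcall() come from the standard library. Instead of a user interface, it just records every line where bdb decides to stop.

```python
import bdb


class LineRecorder(bdb.Bdb):
    """Hypothetical bdb client: records each line where bdb stops."""

    def __init__(self):
        super().__init__()
        self.stops = []

    def user_line(self, frame):
        # bdb calls this hook when it decides the debugger should stop
        # on a line. A real front end (pdb, idlelib) would interact with
        # the user here; this sketch only records the location.
        self.stops.append((frame.f_code.co_name, frame.f_lineno))
        # Ask bdb to stop again at the next line (the 'step' command).
        self.set_step()


def sample():
    a = 1
    b = 2
    return a + b


recorder = LineRecorder()
result = recorder.runcall(sample)  # runs sample() under bdb's trace function
print(result, recorder.stops)
```

Everything that decides whether to stop happens inside bdb.Bdb, which is why a faster bdb would benefit every front end built on it.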

As I understand it, sys.settrace results in an execution break and function call at each point in the bytecode corresponding to the beginning of a (logical?) line. This adds much overhead. In return, a trace-based debugger allows one to flexibly control stop-and-go execution either with preset breakpoints* or with interactive commands: step (one line), step into (a function frame), step over (a function frame), or go to next breakpoint. The last is implemented by the debugger automatically stepping at each break call unless the line is in the existing breakpoint list.
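A small sketch of what that looks like from Python. The names tracer and work are made up for illustration. The trace function is called once when the frame is entered, once per line, and once on return, so its cost grows with the number of lines executed.

```python
import sys
from collections import Counter

events = Counter()


def tracer(frame, event, arg):
    # Called by the interpreter for every trace event in traced frames.
    events[event] += 1
    # Returning the function keeps line tracing active in this frame.
    return tracer


def work():
    x = 1
    y = 2
    return x + y


previous = sys.gettrace()
sys.settrace(tracer)
try:
    work()
finally:
    sys.settrace(previous)  # restore whatever was installed before

print(dict(events))
```

A debugger's 'go to next breakpoint' pays for one such Python-level call per executed line, even when no breakpoint is anywhere near.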

* Breakpoints can be defined either in an associated editor or with breakpoint commands in the debugger when execution is stopped.

PEP 523 envisioned an alternate non-trace implementation of 'go to next breakpoint' by a debugger going "as far as to dynamically rewrite bytecode prior to execution to inject e.g. breakpoints in the bytecode." https://www.python.org/dev/peps/pep-0523/#debugging
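As a rough illustration of where such injected breakpoints would go, the sketch below uses dis.findlinestarts() to map source lines to the bytecode offsets where they begin. It only locates candidate offsets; it does not rewrite any bytecode, and it is not taken from PyCharm or from PEP 523. The names line_start_offsets and sample are hypothetical.

```python
import dis


def line_start_offsets(func):
    """Map each source line of func to the first bytecode offset on that line."""
    offsets = {}
    for offset, lineno in dis.findlinestarts(func.__code__):
        # Some Python versions can report entries without a line number.
        if lineno is not None:
            offsets.setdefault(lineno, offset)
    return offsets


def sample(n):
    total = 0
    total += n
    return total


offsets = line_start_offsets(sample)
for lineno, offset in sorted(offsets.items()):
    print(f"line {lineno} starts at bytecode offset {offset}")
```

A bytecode-rewriting debugger would insert its breakpoint call at the offset of each line that carries a breakpoint and leave every other line untouched, so lines without breakpoints run at full speed.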

A debugger doing this could either eliminate the other 'go' commands (easiest) or implement them by either setting temporary breakpoints or temporarily turning tracing on.

I presume it should be possible to make bdb.Bdb use bytecode breakpoints or add a new class with a similar API. Then any bdb-based debugger could be modified to make the speedup available.

pdb-clone, an extension to pdb, gets about the same performance gains over pdb while still using sys.settrace(). A program under pdb-clone runs less than 2 times slower than under the bare interpreter, whereas under pdb it runs about 80 times slower. See some performance measurements at https://bitbucket.org/xdegaye/pdb-clone/wiki/Performances.md

Given those results, it is not clear how implementing PEP 523 for pdb would yield a boost of a factor 40: pdb could already run very close to the speed of the interpreter, mostly by implementing in a C extension module the bdb.Bdb methods that check whether the debugger should take control.
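To illustrate the kind of check involved, here is a pure-Python sketch (not pdb-clone's code; make_tracer, helper and target are hypothetical names). The global trace function is called on each 'call' event and decides whether the frame is of interest. Returning None switches off the Python-level line callbacks for that frame, so only frames that may contain a breakpoint pay for them.

```python
import sys


def make_tracer(functions_with_breakpoints, hits):
    def local_trace(frame, event, arg):
        if event == "line":
            hits.append((frame.f_code.co_name, frame.f_lineno))
        return local_trace

    def global_trace(frame, event, arg):
        # Called once per function call. This is the "should the debugger
        # take control here?" decision; bdb does a similar check in Python.
        if frame.f_code.co_name in functions_with_breakpoints:
            return local_trace
        return None  # no line callbacks for this frame

    return global_trace


def helper():
    s = 0
    for i in range(100):
        s += i
    return s


def target():
    value = helper()
    return value + 1


hits = []
previous = sys.gettrace()
sys.settrace(make_tracer({"target"}, hits))
try:
    result = target()
finally:
    sys.settrace(previous)

print(result, hits)
```

The loop in helper() produces no Python-level callbacks at all here. The decision itself still runs in Python on every call, which is the part that a C extension module would make cheap.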

Setting a trace function with sys.settrace() adds the following incompressible overhead:

* 15-20 % overhead: computed gotos are not used in the ceval loop when tracing is active.
* The trace function receives all the PyTrace_LINE events (even when the frame f_trace is NULL :(). The interpreter calls _PyCode_CheckLineNumber() for each of these events, and the processing in this function is the costly part. pdb-clone optimizes this by swapping the trace function with a profiler function whenever possible (i.e. when there is no need to trace the lines of the function) to avoid calling _PyCode_CheckLineNumber(). The profiler still gets PyTrace_C_CALL events, but these events carry no such overhead. The performance gain obtained with this scheme is about 30%.
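The difference between the two hooks can be seen from Python with sys.settrace() and sys.setprofile(). The sketch below is only an illustration of the event streams, not pdb-clone's swapping logic; count_events and work are hypothetical names. A profile function sees call, return and C-call events but never line events, which is what makes it the cheaper hook when no line in the function needs tracing.

```python
import sys
from collections import Counter

HOOKS = {
    "trace": (sys.gettrace, sys.settrace),
    "profile": (sys.getprofile, sys.setprofile),
}


def count_events(kind, func):
    """Run func with a counting callback installed as a trace or profile hook."""
    getter, setter = HOOKS[kind]
    seen = Counter()

    def callback(frame, event, arg):
        seen[event] += 1
        return callback  # only meaningful for trace functions

    previous = getter()
    setter(callback)
    try:
        func()
    finally:
        setter(previous)  # restore the previously installed hook
    return seen


def work():
    data = [3, 1, 2]
    return len(data)  # len() is implemented in C


trace_events = count_events("trace", work)
profile_events = count_events("profile", work)
print("trace:  ", dict(trace_events))
print("profile:", dict(profile_events))
```

The exact number of C-call events reported to the profile function depends on the Python version, since installing and removing the hook are themselves C calls.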

I think that the main point here is not whether to switch from sys.settrace() to PEP 523, but first to implement the stop_here() bdb.Bdb method in a C extension module.

Xavier
