[RFC] lldb-dap refactoring to support async operations and cancellation
Abstract
Using lldb-dap today on a large binary can be sluggish (for example, I have been debugging an iOS application built for debugging that is ~650MB + 1.3GB of dSYMs). If you hit a breakpoint and try to perform an operation like stepping or inspecting a variable, it is not uncommon for lldb-dap to block and appear to hang for a bit.
This is primarily due to the sequential nature of how lldb-dap handles requests.
For example, when you hit a breakpoint the DAP flow is:
event 'stopped'
request 'threads'
request 'stackTrace' { levels: 1, threadId: '<stopped-thread>' }
request 'breakpointLocations' { '<stackTrace[0]>' }
request 'stackTrace' { levels: 19, startFrame: 1, threadId: '<stopped-thread>' }
request 'scopes' { frameId: '<stopped-thread-frame>' }
request 'variables' { variablesReference: 1 }
This chain of events alone took ~3s on the app I am debugging, during which you cannot step or continue the process. If you step, there is another delay of ~3s as the same data is fetched for the new stopped event.
Additionally, if you accidentally hover over a very large or complex variable (say a UIViewController with many properties) and then try to step, your step request will be blocked by the hover request.
To get a sense of how this is affecting lldb-dap, I did some profiling while stopping and stepping through the large app, which includes Swift, Objective-C and C++ code. Here is a breakdown of the time spent in lldb-dap:
- request_variables (35% of my trace)
- request_stackTrace (27%)
- request_threads (16%)
- request_breakpointLocations (8%)
- (other…)
Part of the Debug Adapter Protocol that helps address this is support for the cancel request. For example, if you hover over a variable and then move the cursor off it before the response has been sent, a cancel request will be made.
Debug Adapter Protocol Cancel Spec
Proposal
To address the responsiveness and improve the overall debugging experience, I think we can refactor lldb-dap to support cancelling requests.
To support the ability to cancel a request we will need to refactor our IO handling and protocol support to break requests into smaller operations that we can try to process as they arrive and support a mechanism for queuing operations.
There is a very similar problem, and an implementation of a similar protocol, elsewhere in the LLVM project: clangd, which handles the Language Server Protocol. In clangd, there is a task dispatcher that queues operations on a per-document basis. We don’t have a precise equivalent in the debugger world.
I started a prototype here: [lldb-dap] Creating a new mechanism for registering async request han… · ashgti/llvm-project@5248b79 · GitHub. It splits lldb_dap::DAP::Loop into a reader thread and a queue. The queue does not yet support cancellation, but this would be a first step toward an async flow.
Additionally, for handling requests, I updated the request handler registration to take a callback, for example:
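void request_foo(
    DAP &dap,
    const FooArguments &Args,
    Callback<FooResponseBody> Reply) {
  // ... do some blocking operation, then schedule an async reply ...
  // Move Reply into the lambda: it runs after request_foo has returned,
  // so capturing the parameter by reference would leave it dangling.
  dap.PerformAsyncWork([Reply = std::move(Reply)](auto result) mutable {
    // ... Reply off the request evaluation thread ...
    Reply(result);
  });
}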
With this change, we can more easily break up calls into async operations that then call Reply(resultOrError).
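To make the callback shape concrete, here is a minimal standalone sketch of a handler that replies from another thread. It does not use the prototype's types: `Result`, `Error`, the `Callback` alias, and the struct fields are all stand-ins I made up for illustration, and a plain `std::thread` takes the place of whatever scheduling mechanism lldb-dap would actually use.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <variant>

// Hypothetical stand-ins; none of these definitions come from lldb-dap.
struct FooArguments { int value; };
struct FooResponseBody { int doubled; };
struct Error { std::string message; };

// A reply carries either a response body or an error.
template <typename T> using Result = std::variant<T, Error>;
template <typename T> using Callback = std::function<void(Result<T>)>;

// Starts the work on another thread and returns immediately. The callback
// is moved into the lambda so it stays valid after request_foo returns.
std::thread request_foo(const FooArguments &args,
                        Callback<FooResponseBody> reply) {
  assert(reply && "reply callback must be set");
  int value = args.value; // copy what is needed; args may not outlive the call
  return std::thread([value, reply = std::move(reply)] {
    if (value < 0)
      reply(Error{"negative input"});
    else
      reply(FooResponseBody{value * 2});
  });
}
```

The caller only has to keep the returned thread (or, in a real implementation, the task handle) alive until the reply has been delivered; the handler itself never blocks the thread that dispatched it.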
The basic flow with this change would be to read requests as they come in, append them to a queue, and have a task management mechanism dequeue and dispatch them. For the first iteration of this change, I think we can have just a single worker evaluating requests. SBDebugger does have a mechanism for requesting an interrupt, SBDebugger::RequestInterrupt(), which attempts to interrupt a running operation. We may be able to use the RequestInterrupt call to stop some in-flight operations, but I think that is something to consider on a case-by-case basis (e.g. we may not want to cancel an in-flight evaluate request, but an in-flight variables request may be okay to cancel).
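The queue side of that flow could look roughly like the sketch below. It is not the prototype's code: `Request` and `RequestQueue` are hypothetical names, and it only covers the easy case of cancelling a request that is still waiting in the queue. A request that the worker has already picked up would need something else, such as the interrupt mechanism mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical request record; seq mirrors the DAP sequence number.
struct Request {
  std::int64_t seq;
  std::string command;
};

class RequestQueue {
public:
  // Called by the reader thread for every decoded request.
  void push(Request r) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!closed_ && "push after close");
      pending_.push_back(std::move(r));
    }
    ready_.notify_one();
  }

  // Drops a request that has not started yet. Returns false if it is no
  // longer (or never was) in the queue.
  bool cancel(std::int64_t seq) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = std::find_if(pending_.begin(), pending_.end(),
                           [seq](const Request &r) { return r.seq == seq; });
    if (it == pending_.end())
      return false;
    pending_.erase(it);
    return true;
  }

  // Signals that no more requests will arrive (e.g. the input hit EOF).
  void close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    ready_.notify_all();
  }

  // Called by the single worker. Blocks until a request is available and
  // returns std::nullopt once the queue is closed and drained.
  std::optional<Request> pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return closed_ || !pending_.empty(); });
    if (pending_.empty())
      return std::nullopt;
    Request r = std::move(pending_.front());
    pending_.pop_front();
    return r;
  }

private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::deque<Request> pending_;
  bool closed_ = false;
};
```

With this shape the reader thread can handle a cancel request itself by calling `cancel(seq)` as soon as it is decoded, so a stale hover never reaches the worker, while the worker simply loops on `pop()` until it returns `std::nullopt`.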
One major challenge I’ve come across while prototyping is needing some way to either cancel a read or perform a select-style call on the input stream. The existing SelectHelper only supports selecting on sockets at the moment, not pipes or file handles. This may be something to investigate improving, but I’m not as familiar with Windows APIs.