<regex>: Perform simplified stack unwinding for lookahead assertions when the asserted pattern matches by muellerj2 · Pull Request #5835 · microsoft/STL

This implements simplified backtracking for the case when the pattern of a lookahead assertion matches. It's kind of the equivalent of #5828 for lookahead assertions, though it's more complicated while being much less practically relevant. But I need this for the next PRs that will greatly reduce the number of allocations the matcher performs.

When the pattern in a lookahead assertion matches, we already know the outcome of the assertion as a whole: a positive assertion succeeded and a negative one failed. We can then mostly skip the stack unwinding up to the stack frame that was pushed at the start of the lookahead assertion, except for the effects these stack frames have on the stack counter. This works because no stack unwinding opcode does any other work when a pattern matched in ECMAScript mode (and ECMAScript is the only regex grammar that supports lookahead assertions). Much of the work at the end of a lookahead assertion is now also handled when processing the _N_end_assert node rather than when processing the unwinding opcodes _After_assert and _After_neg_assert.
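To illustrate the idea, here is a minimal standalone sketch. It is not the code in this PR. `Frame`, `FrameKind`, `Matcher`, `stackCost` and `endAssert` are hypothetical names, and I'm assuming for illustration that every frame carries a fixed contribution to the stack counter. The loop pops frames without running their unwinding logic, only undoes their effect on the counter, and stops at the frame pushed at the start of the assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame kinds. Only the two assertion markers correspond to
// opcodes named in the description; "Other" stands in for everything else.
enum class FrameKind { Other, AfterAssert, AfterNegAssert };

struct Frame {
    FrameKind kind;
    std::size_t stackCost; // assumed: this frame's contribution to the stack counter
};

struct Matcher {
    std::vector<Frame> frames;
    std::size_t stackUsage = 0;

    void push(FrameKind kind, std::size_t cost) {
        frames.push_back({kind, cost});
        stackUsage += cost;
    }

    // Called when the asserted pattern has matched.
    // Returns true if the assertion as a whole succeeded (positive lookahead)
    // and false if it failed (negative lookahead).
    bool endAssert() {
        while (!frames.empty()) {
            const Frame f = frames.back();
            frames.pop_back();
            stackUsage -= f.stackCost; // the only per-frame effect we still apply

            if (f.kind == FrameKind::AfterAssert) {
                return true; // pattern matched, so the positive assertion holds
            }
            if (f.kind == FrameKind::AfterNegAssert) {
                return false; // pattern matched, so the negative assertion fails
            }
            // Any other frame: skip its normal unwinding work entirely.
        }
        assert(false && "no assertion frame on the stack");
        return false;
    }
};
```

In this sketch, a nested assertion needs no special handling: by the time the outer pattern has matched, any inner assertion has already removed its own marker frame, so the first marker found belongs to the assertion being ended.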

You might notice that we could actually avoid the new loop in _N_end_assert if we kept track of the stack usage counts and the positions of the _After_assert and _After_neg_assert stack frames. But I will have to add a variant of this loop in the PR after the next one anyway, so it doesn't seem worth it to spend much effort on avoiding this loop.