STL (original) (raw)

The parser likes to generate single-branch _N_if nodes in the NFA for no good reason. These nodes have a noticeable runtime cost in the matcher. This PR avoids generating these nodes.

This change has side effects:

It strictly reduces the necessary recursion in the matcher to match a specific input string to a specific regex. This means that some strings that ran into a stack overflow or an error_stack exception before will match successfully after this PR.
Conversely, this means that some _Matcher2::_Do_rep() calls can now take the place previously occupied by recursive calls of _Matcher2::_Do_if(). _Matcher2::_Do_rep() takes more bytes from the stack (I didn't check recently, but I remember it was about 100 bytes per call?), so there will be a very small set of input strings that will run into an actual stack overflow now instead of an error_stack exception.

Without further changes, there would be more side effects because some loops would get classified as simple now. However, I preempted this reclassification by adjusting _Parser2::_Calculate_loop_simplicity(). There are two reasons for this.

The simple loop optimization, as implemented, reduces the required amount of stack space but in exchange can square the running time when matching some input strings to a regex. I think, though, that we could take this risk because the input strings couldn't be longer than a few hundred characters anyway before running into a stack overflow, so it's unlikely to make a major runtime difference in practice when matching didn't run into a stack overflow or error_stack exception before, and it's more important to make more regexes resistant against stack overflow. For example, if we were to allow the reclassification, this would fix the stack overflow for the regex in : Grouping within repetition causes regex stack error #997.
The more important reason is that I'm not sure yet what the best subset of simple loops is for a non-recursive matcher, so I would like to retain more wiggle room here until the requirements become clearer while rewriting the matcher.

I added no new tests because (a) there is no obvious additional set of regexes to test and (b) this change affects the generated NFAs for all regexes without exception (all of them contained at least one single-branch _N_if node before), so the whole existing test suite serves as test coverage for this change.

Benchmark

Benchmark	Before	After	Speedup
bm_lorem_search/"bibe"/2	31145 ns	28878 ns	1.08
bm_lorem_search/"bibe"/3	61384 ns	57199 ns	1.07
bm_lorem_search/"bibe"/4	122768 ns	114397 ns	1.07
bm_lorem_search/"(bibe)"/2	57812 ns	43945 ns	1.32
bm_lorem_search/"(bibe)"/3	141246 ns	92072 ns	1.53
bm_lorem_search/"(bibe)"/4	284934 ns	187976 ns	1.52
bm_lorem_search/"(bibe)+"/2	92072 ns	62779 ns	1.47
bm_lorem_search/"(bibe)+"/3	179983 ns	122768 ns	1.47
bm_lorem_search/"(bibe)+"/4	376607 ns	256319 ns	1.47
bm_lorem_search/"(?:bibe)+"/2	55804 ns	48652 ns	1.15
bm_lorem_search/"(?:bibe)+"/3	112305 ns	96257 ns	1.17
bm_lorem_search/"(?:bibe)+"/4	224609 ns	196725 ns	1.14

<regex>: Avoid generating unnecessary single-branch _N_if nodes by muellerj2 · Pull Request #5539 · microsoft/STL (original) (raw)

Benchmark

`<regex>`: Avoid generating unnecessary single-branch `_N_if` nodes by muellerj2 · Pull Request #5539 · microsoft/STL (original) (raw)