Resolve routing RegEx constraint dependency size issue · Issue #46142 · dotnet/aspnetcore

Routing has a feature named "Route Constraints". One way to define a constraint is to add a regular expression to the route, for example: app.MapGet("/posts/{id:regex(^[a-z0-9]+$)}", …). Because these route constraints are written inline in the route string, the Regex code always needs to be in the application, in case any route happens to use a regex constraint.

In .NET 7, we added a new feature to Regex: the NonBacktracking engine. This added a considerable amount of code. When the Regex constructor overload that takes RegexOptions is used (the overload ASP.NET Routing uses), this new feature's code is left in the app, even if the NonBacktracking engine is never used.

ASP.NET Routing uses the CultureInvariant and IgnoreCase options when constructing Regex route constraints.
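
As a rough illustration of the construction described above, here is a minimal sketch. The class and method names are hypothetical and this is not the actual ASP.NET Core source; the real constraint may pass additional arguments such as a match timeout. The point is only that going through the RegexOptions overload is what keeps the NonBacktracking code reachable.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not the routing implementation.
public static class RouteRegexSketch
{
    // Builds a Regex the way the issue describes: via the overload that
    // takes RegexOptions, with CultureInvariant and IgnoreCase set.
    public static Regex Create(string pattern) =>
        new Regex(pattern, RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);

    // Checks a route value against the example pattern from the issue.
    public static bool IsValidId(string value) =>
        Create("^[a-z0-9]+$").IsMatch(value);
}
```

With this sketch, RouteRegexSketch.IsValidId("ABC123") matches because of IgnoreCase, while a value such as "abc-123" does not.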

Testing locally, removing the NonBacktracking engine can cut about 0.8 MB of the 1.0 MB of Regex code out of the app size.

UPDATE 11/30/2022

With the latest NativeAOT compiler changes, here are updated numbers for linux-x64 NativeAOT:

Hello World 3.22 MB (3,381,112 bytes)
new Regex("").IsMatch 3.50 MB (3,680,200 bytes)
new Regex("", RegexOptions).IsMatch 4.33 MB (4,545,960 bytes)

UPDATE 1/18/2023

With the latest NativeAOT compiler changes, here are updated numbers for linux-x64 NativeAOT:

Hello World 2.79 MB (2,925,616 bytes)
new Regex("").IsMatch 3.04 MB (3,193,344 bytes)
new Regex("", RegexOptions).IsMatch 3.82 MB (4,008,288 bytes)
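
To make the overhead of each step explicit, the byte counts from both updates can be subtracted directly. The helper below is a small sketch with hypothetical names; it uses only the numbers listed above and treats 1 MB as 1024 × 1024 bytes, which is what the listed MB values correspond to.

```csharp
// Hypothetical helper for illustration; the inputs come from the tables above.
public static class SizeDelta
{
    // Difference in bytes between two published app sizes.
    public static long Bytes(long larger, long smaller) => larger - smaller;

    // Converts bytes to MB, using 1 MB = 1024 * 1024 bytes.
    public static double ToMB(long bytes) => bytes / (1024.0 * 1024.0);
}
```

For the 11/30/2022 numbers, SizeDelta.Bytes(3680200, 3381112) is 299,088 bytes for using Regex at all, and SizeDelta.Bytes(4545960, 3680200) is 865,760 bytes (about 0.83 MB) for switching to the RegexOptions overload. For the 1/18/2023 numbers the same two steps are 267,728 bytes and 814,944 bytes (about 0.78 MB). In both cases the RegexOptions overload accounts for most of the Regex-related size.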

Options

  1. Add new Regex construction APIs that allow some RegexOptions to be used while still allowing the NonBacktracking engine to be trimmed. For example: Regex.CreateCompiled(pattern, RegexOptions). This API would throw an exception if RegexOptions.NonBacktracking was passed.
  2. Remove the use of RegexOptions. The IgnoreCase option can be specified as part of the pattern as a Pattern Modifier: (?i). However, CultureInvariant cannot be specified this way.
    • One option is to drop support for CultureInvariant from regex route constraints. This affects the Turkish 'i' handling.
    • Another option could be to add a CultureInvariant Pattern Modifier to .NET Regex, so this could be specified without using RegexOptions.
  3. When using the new "slim" hosting API (see Allow minimal host to be created without default HostBuilder behavior #32485), we could remove the Regex route constraints feature by default and have an API that adds it back. Apps that use inline regex route constraints with the slim hosting API would call this new API to opt in to the feature.
  4. Add a feature to the linker/NativeAOT compiler that can see which RegexOptions values are used in the app and trim the code branches behind the enum values that aren't used (i.e. the NonBacktracking code). This would accrue more size savings elsewhere as well.
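
Option 2 can be sketched as follows. This is only an illustration with hypothetical names, not a proposed implementation: it shows that prefixing the pattern with the inline (?i) modifier gives case-insensitive matching through the pattern-only constructor, so RegexOptions is not needed for IgnoreCase. It does not cover CultureInvariant, which is the open question in that option.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class InlineIgnoreCaseSketch
{
    // Current style: case-insensitivity requested through RegexOptions.
    public static Regex WithOptions(string pattern) =>
        new Regex(pattern, RegexOptions.IgnoreCase);

    // Option 2 style: case-insensitivity requested inside the pattern,
    // so the constructor without RegexOptions can be used.
    public static Regex WithInlineModifier(string pattern) =>
        new Regex("(?i)" + pattern);
}
```

Both InlineIgnoreCaseSketch.WithOptions("^[a-z0-9]+$") and InlineIgnoreCaseSketch.WithInlineModifier("^[a-z0-9]+$") match "ABC123" and reject "abc-123". Whether the second form actually lets the NonBacktracking code be trimmed depends on the linker/NativeAOT behavior measured above.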