Implement simple version of On Stack Replacement (OSR) by AndyAyersMS · Pull Request #32969 · dotnet/runtime

cc @dotnet/jit-contrib @jkotas @noahfalk

Not "in plan" for 5.0, but would like to get this into the main code base so we can refine and gain experience over time, similar to what we did with tiered compilation.

Currently built by default (for x64), but not enabled by default. There should be minimal impact when not enabled (one extra byte in debug info per method). The plan is to create a special CI leg to enable ongoing testing.

To enable OSR, set both COMPlus_TC_QuickJitForLoops=1 and COMPlus_TC_OnStackReplacement=1.

When enabled, methods with loops are jitted at Tier0 and get extra code for patchpoints (about 2% of Tier0 code size) along with patchpoint info. The net size impact is estimated to be around 10 bytes of code and data per Tier0 method (~100 bytes for each method with patchpoints).

Many of the file changes are boilerplate. Interesting areas to focus on:

More thinking about trigger policies is needed; what's there now is plausible but has some quirks.

I have run through scenarios reported against 3.0 where QJFL=1 (QuickJitForLoops enabled) caused steady-state perf losses, and OSR recovers those. I have also run through scenarios where QJFL=0 caused startup losses, and OSR recovers those as well.

NYI for non-x64. arm64 would probably come next, and might be challenging given the diversity of stack frame styles.

Passes Tier1 tests on x64 in "aggressive" OSR mode (produce OSR method and transition the first time a patchpoint is reached).
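
For readers new to the idea, here is a rough sketch of what a patchpoint trigger policy could look like. It is an illustration only: the names `Patchpoint`, `shouldTransition`, and `simulateLoop` are hypothetical and do not come from this PR, and the per-patchpoint hit counter is an assumption about one plausible policy, not a description of the actual implementation. A threshold of 1 models the "aggressive" mode described above, where the transition happens the first time a patchpoint is reached.

```cpp
#include <cassert>

// Hypothetical model of a patchpoint placed in a Tier0 method's loop.
// None of these names are taken from the runtime sources.
struct Patchpoint {
    int hitCount = 0;          // how many times execution has reached this patchpoint
    bool transitioned = false; // true once control has moved to the OSR method
};

// Assumed policy: transition once the patchpoint has been hit `threshold` times.
// threshold == 1 corresponds to "aggressive" mode (transition on the first hit).
bool shouldTransition(Patchpoint& pp, int threshold) {
    assert(threshold >= 1);
    if (pp.transitioned) {
        return false; // already running the OSR method
    }
    pp.hitCount++;
    if (pp.hitCount >= threshold) {
        pp.transitioned = true;
        return true;
    }
    return false;
}

// Simulates a Tier0 loop with one patchpoint per iteration.
// Returns the zero-based iteration at which the transition would occur,
// or -1 if the loop finishes before the threshold is reached.
int simulateLoop(int iterations, int threshold) {
    Patchpoint pp;
    for (int i = 0; i < iterations; i++) {
        if (shouldTransition(pp, threshold)) {
            return i;
        }
    }
    return -1;
}
```

Under these assumptions, `simulateLoop(100, 1)` transitions on the very first iteration, while a larger threshold leaves short loops in Tier0 code for their whole lifetime. That trade-off is the kind of quirk a trigger policy has to balance.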