Improve ContinueWith perf with NotOn* options by stephentoub · Pull Request #35575 · dotnet/runtime

This came about while looking at allocations and CPU costs in some HttpClient code. When ContinueWith is used with TaskContinuationOptions.NotOn* options (including the OnlyOn* combinations built from them), we compare the antecedent task's final state against those options when it completes: if the options permit it, the continuation is queued/invoked, and if they don't, the continuation is canceled. That cancellation ends up being common when ContinueWith is used, for example, to log exceptions from a faulted task, e.g.

private static void Log(Task task) =>
    task.ContinueWith(
        t => Log(t.Exception),
        CancellationToken.None,
        TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
        TaskScheduler.Default);
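The cancellation path described above can be observed from the outside with a small sketch. It assumes nothing about the runtime internals; `NotOnDemo` and `ContinuationStatusFor` are hypothetical names introduced only for illustration. `OnlyOnFaulted` is defined as `NotOnRanToCompletion | NotOnCanceled`, so a successfully completed antecedent should leave the continuation in the `Canceled` state, while a faulted antecedent lets it run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class NotOnDemo
{
    // Hypothetical helper: attaches an OnlyOnFaulted continuation to the
    // antecedent and returns the final status of the continuation task.
    public static TaskStatus ContinuationStatusFor(Task antecedent)
    {
        Task continuation = antecedent.ContinueWith(
            t => { _ = t.Exception; }, // touch the exception so it counts as observed
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);

        try
        {
            // Wait throws an AggregateException when the continuation was canceled.
            continuation.Wait();
        }
        catch (AggregateException)
        {
            // Expected on the canceled path; the status is inspected below.
        }

        return continuation.Status;
    }

    public static void Main()
    {
        // Antecedent succeeded: the options forbid running, so the continuation is canceled.
        Console.WriteLine(ContinuationStatusFor(Task.CompletedTask)); // Canceled

        // Antecedent faulted: the options permit running, so the continuation executes.
        Console.WriteLine(ContinuationStatusFor(
            Task.FromException(new InvalidOperationException("boom")))); // RanToCompletion
    }
}
```

In a logging pattern like the one above, the first case is the common one, which is why the cost of the canceled path matters.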

We can handle that cancellation much more efficiently than we do today. Today it results in us inflating the ContingentProperties inside of the task just to record internally that cancellation was requested, but that's not actually necessary. It also results in several atomic transitions via interlocked operations, but those are only necessary if the task could have otherwise transitioned in any way, which is only possible if a cancelable token is provided.

This PR removes that overhead. It also shrinks the size of the object created by ContinueWith by a field, seals and renames it, and removes some dead code from related code paths.
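As a rough illustration of the "cancelable token" condition mentioned above, the public `CancellationToken.CanBeCanceled` property reports whether a token could ever request cancellation. This sketch does not reproduce the runtime's internal check; `TokenDemo` and `CouldBeCanceled` are hypothetical names used only here.

```csharp
using System;
using System.Threading;

public static class TokenDemo
{
    // Hypothetical helper, not a runtime API: a token that cannot be canceled
    // can never trigger a competing state transition on the continuation task.
    public static bool CouldBeCanceled(CancellationToken token) => token.CanBeCanceled;

    public static void Main()
    {
        // CancellationToken.None and default(CancellationToken) have no source behind them.
        Console.WriteLine(CouldBeCanceled(CancellationToken.None)); // False
        Console.WriteLine(CouldBeCanceled(default));                // False

        // A token backed by a CancellationTokenSource can be canceled at any time.
        using (var cts = new CancellationTokenSource())
        {
            Console.WriteLine(CouldBeCanceled(cts.Token));          // True
        }
    }
}
```

Both the logging example and the benchmark in this description pass `CancellationToken.None`, so they fall into the non-cancelable case.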

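| Method  | Toolchain | Mean     | Error   | StdDev  | Ratio | Gen 0  | Gen 1  | Allocated |
|---------|-----------|----------|---------|---------|-------|--------|--------|-----------|
| Cancel  | \master   | 140.8 ns | 0.49 ns | 0.46 ns | 1.00  | 0.0300 | 0.0010 | 192 B     |
| Cancel  | \pr       | 106.4 ns | 0.95 ns | 0.79 ns | 0.76  | 0.0160 | 0.0010 | 104 B     |
| Execute | \master   | 157.2 ns | 0.45 ns | 0.38 ns | 1.00  | 0.0170 | 0.0010 | 112 B     |
| Execute | \pr       | 156.0 ns | 0.60 ns | 0.53 ns | 0.99  | 0.0160 | 0.0010 | 104 B     |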

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

[MemoryDiagnoser]
public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssemblies(new[] { typeof(Program).Assembly }).Run(args);

const int Iters = 1_000_000;

private AsyncTaskMethodBuilder[] tasks = new AsyncTaskMethodBuilder[Iters];

[IterationSetup]
public void Setup()
{
    Array.Clear(tasks, 0, tasks.Length);
    for (int i = 0; i < tasks.Length; i++)
        _ = tasks[i].Task;
}

[Benchmark(OperationsPerInvoke = Iters)]
public void Cancel()
{
    for (int i = 0; i < tasks.Length; i++)
    {
        tasks[i].Task.ContinueWith(_ => { }, CancellationToken.None, TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
        tasks[i].SetResult();
    }
}

[Benchmark(OperationsPerInvoke = Iters)]
public void Execute()
{
    for (int i = 0; i < tasks.Length; i++)
    {
        tasks[i].Task.ContinueWith(_ => { }, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
        tasks[i].SetResult();
    }
}

}

cc: @kouvel, @tarekgh