Improve ContinueWith perf with NotOn* options by stephentoub · Pull Request #35575 · dotnet/runtime (original) (raw)

This came about while looking at allocations and CPU costs in some HttpClient code. When ContinueWith is used with TaskContinuationOptions.NotOn* options, when the antecedent task completes we compare the state of that antecedent task against the NotOn* options: if the options permit it, the continuation is queued/invoked, and if they don't, the continuation is canceled. That cancellation then ends up being common in cases where a ContinueWith is used, for example, to log exceptions that result from a faulted task, e.g.

private static void Log(Task task) => task.ContinueWith(t => Log(t.Exception), CancellationToken.None, TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);

We can handle that cancellation much more efficiently than we are today. Today that's resulting in us expanding the ContingentProperties inside of the task, in order to store that cancellation was requested internally, but that's not actually necessary. It's also resulting in us doing several atomic transitions via interlockeds, but that's only necessary if the task could have transitioned in any way, which is only possible if a cancelable token is provided.

This PR removes that overhead. It also shrinks the size of the object created by ContinueWith by a field, seals and renames it, and removes some dead code from related code paths.

Method	Toolchain	Mean	Error	StdDev	Ratio	Gen 0	Gen 1	Allocated
Cancel	\master	140.8 ns	0.49 ns	0.46 ns	1.00	0.0300	0.0010	192 B
Cancel	\pr	106.4 ns	0.95 ns	0.79 ns	0.76	0.0160	0.0010	104 B
Execute	\master	157.2 ns	0.45 ns	0.38 ns	1.00	0.0170	0.0010	112 B
Execute	\pr	156.0 ns	0.60 ns	0.53 ns	0.99	0.0160	0.0010	104 B

using BenchmarkDotNet.Attributes; using BenchmarkDotNet.Running; using System; using System.Runtime.CompilerServices; using System.Threading; using System.Threading.Tasks;

[MemoryDiagnoser] public class Program { static void Main(string[] args) => BenchmarkSwitcher.FromAssemblies(new[] { typeof(Program).Assembly }).Run(args);

const int Iters = 1_000_000;

private AsyncTaskMethodBuilder[] tasks = new AsyncTaskMethodBuilder[Iters];

[IterationSetup]
public void Setup()
{
    Array.Clear(tasks, 0, tasks.Length);
    for (int i = 0; i < tasks.Length; i++)
        _ = tasks[i].Task;
}

[Benchmark(OperationsPerInvoke = Iters)]
public void Cancel()
{
    for (int i = 0; i < tasks.Length; i++)
    {
        tasks[i].Task.ContinueWith(_ => { }, CancellationToken.None, TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
        tasks[i].SetResult();
    }
}

[Benchmark(OperationsPerInvoke = Iters)]
public void Execute()
{
    for (int i = 0; i < tasks.Length; i++)
    {
        tasks[i].Task.ContinueWith(_ => { }, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
        tasks[i].SetResult();
    }
}

}

cc: @kouvel, @tarekgh