[Core] Enable multiple profiler consumers and add a timeline/tracing profiler by froce · Pull Request #1788 · stride3d/stride

I am basically done with my first pass. If the general approach is okay to merge, I will add more tests and fix the comments/docs. Everything else can probably be addressed during review or in follow-up PRs.
The remaining problems are mainly around GPU events, but none of it should affect existing functionality.
Currently I map all GPU events to a single thread id, which is certainly not correct, but I also don't see an easy fix for now.
Syncing CPU and GPU timestamps also isn't working correctly. I need to investigate whether it's just my logic that's wrong or something else:
[screenshot]
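For reference, one common way to place GPU timestamps on the CPU timeline is to take a single calibration sample of both clocks and scale the GPU delta by the ratio of the two clock frequencies. The sketch below only illustrates that idea. `GpuClockCalibration` and its members are hypothetical names, not part of Stride or of this PR, and it assumes both frequencies are known and stay constant between calibrations.

```csharp
using System;

// Hypothetical helper (not part of Stride): maps GPU timestamps onto the CPU
// timeline using one calibration sample taken at roughly the same instant
// on both clocks.
public readonly struct GpuClockCalibration
{
    private readonly long cpuTicksAtSync;
    private readonly long gpuTicksAtSync;
    private readonly double cpuTicksPerGpuTick;

    public GpuClockCalibration(long cpuTicksAtSync, long gpuTicksAtSync, long cpuFrequency, long gpuFrequency)
    {
        if (cpuFrequency <= 0) throw new ArgumentOutOfRangeException(nameof(cpuFrequency));
        if (gpuFrequency <= 0) throw new ArgumentOutOfRangeException(nameof(gpuFrequency));

        this.cpuTicksAtSync = cpuTicksAtSync;
        this.gpuTicksAtSync = gpuTicksAtSync;
        // Assumption: both clocks run at a constant rate between calibrations.
        this.cpuTicksPerGpuTick = (double)cpuFrequency / gpuFrequency;
    }

    // Converts a GPU timestamp to CPU ticks relative to the calibration point.
    public long ToCpuTicks(long gpuTicks)
    {
        double gpuDelta = gpuTicks - gpuTicksAtSync;
        return cpuTicksAtSync + (long)Math.Round(gpuDelta * cpuTicksPerGpuTick);
    }
}
```

If the clocks drift, a single sample will not stay accurate for long, so recalibrating periodically (for example once per frame) would likely be needed. Whether drift or a logic error is the cause of the offset shown above is still open.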

Then there's this:
[screenshot]
Total time is 16 ms, but threadpool work takes 123 ms combined ;) Should I add sorting by avg time?

Attributes should be used to get variants of the same profiling key. And for things like the GC counts, we may need a counter primitive instead, as you mentioned.

Do you have an example? I still don't think I get it.

ProfilingKeys are particularly awkward for me when profiling around polymorphism, but I don't have a suggestion how to improve it:

    // Before:
    using (Profiler.Begin($"{child.GetType().Name}.Draw"))
    {
        child.Draw(drawContext);
    } // Good. Every subclass gets its own key.

    // Now:
    using (Profiler.Begin(DrawChildKey, $"{child.GetType().Name}.Draw"))
    {
        child.Draw(drawContext);
    } // Sadness.

Doing it properly now requires some Dictionary<Type, ProfilingKey>, which seems like a lot of effort for something that should be quick to add and remove again. I guess I'll do it for these base in-engine cases where it makes sense, but it's not ideal.
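A minimal sketch of what such a per-type key cache could look like, so the cost of the workaround is concrete. `PerTypeProfilingKeys` is a hypothetical helper, and the `ProfilingKey` class here is only a stand-in so the snippet compiles on its own. It assumes the real `ProfilingKey` can be constructed from a parent key and a name.

```csharp
using System;
using System.Collections.Concurrent;

// Stand-in for Stride's ProfilingKey so the sketch is self-contained.
// Assumption: the real type offers a (parent, name) constructor.
public sealed class ProfilingKey
{
    public ProfilingKey Parent { get; }
    public string Name { get; }

    public ProfilingKey(string name)
    {
        Name = name;
    }

    public ProfilingKey(ProfilingKey parent, string name)
    {
        Parent = parent;
        Name = parent.Name + "." + name;
    }
}

// Hypothetical helper: lazily creates and caches one child key per
// (parent key, runtime type) pair, so no string is formatted on the hot path.
public static class PerTypeProfilingKeys
{
    private static readonly ConcurrentDictionary<(ProfilingKey Parent, Type Type), ProfilingKey> Cache =
        new ConcurrentDictionary<(ProfilingKey Parent, Type Type), ProfilingKey>();

    public static ProfilingKey For(ProfilingKey parent, Type type)
    {
        return Cache.GetOrAdd((parent, type), k => new ProfilingKey(k.Parent, k.Type.Name));
    }
}
```

With something like this, the call site would become `Profiler.Begin(PerTypeProfilingKeys.For(DrawChildKey, child.GetType()))`, assuming `Profiler.Begin` accepts a key on its own. It works, but it is still more ceremony than the old string-based version.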

I think we might be able to separate the message logging from the profiler. There isn't really a good reason for it to be tied together (except being enabled when profiling is enabled) and it will simplify things a bit.

It doesn't have to be tied together, but having an easy way to log some specific (e.g. asset loading time) profiling events to a file/console is nice. I'd just like to do it without bloating all other profiling in the process.

First look at performance:

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19045.3448)
AMD Ryzen 5 3600, 1 CPU, 12 logical and 6 physical cores
.NET SDK=8.0.100-preview.7.23376.3
  [Host]     : .NET 6.0.14 (6.0.1423.7309), X64 RyuJIT AVX2
  Job-MBWYOB : .NET 6.0.14 (6.0.1423.7309), X64 RyuJIT AVX2
  Job-YPSZPM : .NET 6.0.14 (6.0.1423.7309), X64 RyuJIT AVX2

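| Method | Job        | NuGetReferences     | NUM_ENTITIES | WithProfiling | Mean      | Error     | StdDev    | Median    | Allocated |
|--------|------------|---------------------|--------------|---------------|-----------|-----------|-----------|-----------|-----------|
| Run    | Job-MBWYOB | Stride.Core 4.1.1.1 | 2000         | False         | 6.412 ms  | 0.1267 ms | 0.2645 ms | 6.387 ms  | 12.08 KB  |
| Run    | Job-YPSZPM | Stride.Core 4.1.2.1 | 2000         | False         | 6.348 ms  | 0.1249 ms | 0.1870 ms | 6.262 ms  | 12.08 KB  |
| Run    | Job-MBWYOB | Stride.Core 4.1.1.1 | 2000         | True          | 18.580 ms | 0.3451 ms | 0.3059 ms | 18.606 ms | 12.57 KB  |
| Run    | Job-YPSZPM | Stride.Core 4.1.2.1 | 2000         | True          | 7.184 ms  | 0.1436 ms | 0.2766 ms | 7.052 ms  | 12.39 KB  |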

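4.1.1.1 is master with the added profiling keys, 4.1.2.1 is this PR. With profiling enabled, the mean overhead drops from roughly 12.2 ms (18.580 ms vs 6.412 ms) on master to about 0.8 ms (7.184 ms vs 6.348 ms) with this PR, so we can now add profiling to work happening on the threadpool without trashing performance (on desktop at least...).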