Add profiler to bootstrap command by Shourya742 · Pull Request #143525 · rust-lang/rust
Conversation
This PR adds command profiling to bootstrap. It tracks the total execution time and records cache hits for each command, and it can export the execution results to a JSON file. Integrating this with Chrome tracing could further enhance observability.
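As a rough illustration of the kind of profiler described above, here is a minimal sketch in Rust. All names (`CommandProfiler`, `CommandStats`, `record_execution`, etc.) are hypothetical and do not reflect bootstrap's actual API; the sketch only shows the idea of keying stats by a command fingerprint and recording durations and cache hits.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-command profiler (illustrative names, not bootstrap's API).
#[derive(Default)]
struct CommandProfiler {
    // Keyed by a command "fingerprint" (e.g. program + arguments).
    entries: HashMap<String, CommandStats>,
}

#[derive(Default)]
struct CommandStats {
    durations: Vec<Duration>,
    cache_hits: u32,
}

impl CommandProfiler {
    /// Record one real execution of a command and how long it took.
    fn record_execution(&mut self, fingerprint: &str, duration: Duration) {
        self.entries
            .entry(fingerprint.to_string())
            .or_default()
            .durations
            .push(duration);
    }

    /// Record that a command was served from the cache instead of re-run.
    fn record_cache_hit(&mut self, fingerprint: &str) {
        self.entries.entry(fingerprint.to_string()).or_default().cache_hits += 1;
    }

    /// Sum of all recorded execution durations for one command.
    fn total_duration(&self, fingerprint: &str) -> Duration {
        self.entries
            .get(fingerprint)
            .map(|s| s.durations.iter().sum())
            .unwrap_or_default()
    }
}

fn main() {
    let mut profiler = CommandProfiler::default();
    let start = Instant::now();
    // ... run the command here ...
    profiler.record_execution("cargo build", start.elapsed());
    profiler.record_cache_hit("cargo build");
    println!("total: {:?}", profiler.total_duration("cargo build"));
}
```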
r? @Kobzol
rustbot added the T-bootstrap label (Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap))
Thank you! I don't think we'll need JSON output for now; I think the main output should be some lightly formatted text that is human-interpretable.
Gonna try it tomorrow.
Kobzol left a comment
Ok, I tried it locally. Good job, it seems to work as expected! I can already see this becoming quite useful when dealing with reports of bootstrap being slow.
I have some suggestions/feedback:
- Streaming Cargo commands (which are the most expensive ones) are not recorded. Could you also add support for profiling these?
- The execution and cache hit timestamps are not very useful. I thought they would be, but in this aggregated output they aren't. Also, outputting them as UNIX timestamps is not very useful, and we'd have to depend on e.g. `time` or `chrono` to improve that, which isn't worth it. You were right when you said that Chrome will be better for the trace itself; I agree now. Sorry! So let's just remove the timestamps and only remember the cache hit counts and the execution duration(s).
- Some aggregated stats at the end of the report would be nice. For example: the total number of unique commands (fingerprints) that were recorded, the total number of executions (and the aggregated duration of all executions), the total time of bootstrap spent outside of command executions (so that we can see how large a % of bootstrap time was spent on commands specifically: just take a timestamp at the start of bootstrap and, when writing the summary, compute the duration and subtract the sum of command execution durations), the total number of cache hits, and the estimated amount of time saved due to cache hits (take the avg. duration of each cached command, multiply it by the number of cache hits it had, and sum that up).
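The "time saved by caching" heuristic suggested above could be sketched like this in Rust. The function name and input shape (`(durations, cache_hits)` pairs) are illustrative assumptions, not bootstrap's code; the sketch just shows the arithmetic: average the recorded durations of each command, multiply by its cache-hit count, and sum.

```rust
use std::time::Duration;

/// Estimate time saved by caching: for each command, the average of its
/// recorded execution durations multiplied by its cache-hit count, summed
/// over all commands. (Sketch of the review's heuristic, not bootstrap code.)
fn estimated_time_saved(commands: &[(Vec<Duration>, u32)]) -> Duration {
    commands
        .iter()
        // Commands with no recorded execution have no average to speak of.
        .filter(|(durations, _)| !durations.is_empty())
        .map(|(durations, cache_hits)| {
            let total: Duration = durations.iter().sum();
            let avg = total / durations.len() as u32;
            avg * *cache_hits
        })
        .sum()
}

fn main() {
    // One command ran twice (2s and 4s, avg 3s) and had 3 cache hits,
    // so the estimate is roughly 9s saved.
    let commands = vec![(vec![Duration::from_secs(2), Duration::from_secs(4)], 3)];
    println!("estimated time saved: {:?}", estimated_time_saved(&commands));
}
```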
Taking this a bit further, it would be nice to add I/O operations tracing:
- Add `trace!` calls to I/O operations (there are a lot of methods that do file/dir copies, symlinking, network downloads, etc.), so that they can be visualized in Chrome. Also, ideally find places where I/O is done ad-hoc and port them to the I/O helper functions on `Builder`.
- Add these I/O operations to the profile summary, so that we can compare the total durations of I/O and command execution.
But that should be left for a follow-up PR.
Thanks! Now the output is much cleaner. Some more thoughts:
- In addition to `max_duration`, also print `total_duration` per command (in most cases, it will be the same as `max_duration`), and print the % of total bootstrap execution time. So e.g. if bootstrap ran for 5s, and a single command was executed three times and in total it took 3s, it should print 60%. E.g. `max_duration=2.06s, total_duration=4.5s (60% of total)`.
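The percentage suggested above is just the command's total duration over the bootstrap wall time. A minimal sketch (the helper name `percent_of_total` is an assumption, not bootstrap's code):

```rust
use std::time::Duration;

/// Percentage of total bootstrap wall time spent in one command.
/// (Illustrative helper, not the actual bootstrap implementation.)
fn percent_of_total(command_total: Duration, bootstrap_total: Duration) -> f64 {
    if bootstrap_total.is_zero() {
        // Avoid dividing by zero on a degenerate (empty) run.
        return 0.0;
    }
    command_total.as_secs_f64() / bootstrap_total.as_secs_f64() * 100.0
}

fn main() {
    // The review's example: 3s of command time out of a 5s bootstrap run -> 60%.
    let pct = percent_of_total(Duration::from_secs(3), Duration::from_secs(5));
    println!("total_duration=3s ({pct:.0}% of total)");
}
```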
Looks like some of our command parallelization can make the aggregated summary a bit weird :D I thought about this, but hadn't expected it to be an issue in practice.
Total time spent in command executions: 2.45s
Total bootstrap time: 2.33s
Time spent outside command executions: 0.00ns
I think it's fine though, no need to handle concurrency here.
Looks good! Please squash and we can merge this. Great work!
…d start time to deferred execution
…port under bootstrap profiling section
📌 Commit 7de1174 has been approved by Kobzol
It is now in the queue for this repository.
bors added the S-waiting-on-bors label (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.)
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing 78a6e13 (parent) -> a9f2aad (this PR)
Test differences
2 doctest diffs were found. These are ignored, as they are noisy.
Test dashboard
Run `cargo run --manifest-path src/ci/citool/Cargo.toml -- test-dashboard a9f2aad0454ef1a06de6588d012517b534540765 --output-dir test-dashboard` and then open `test-dashboard/index.html` in your browser to see an overview of all executed tests.
Job duration changes
- x86_64-apple-2: 3638.3s -> 6118.2s (68.2%)
- pr-check-2: 2672.4s -> 2200.5s (-17.7%)
- pr-check-1: 1825.9s -> 1507.7s (-17.4%)
- x86_64-rust-for-linux: 2994.3s -> 2496.1s (-16.6%)
- x86_64-apple-1: 7582.6s -> 8762.2s (15.6%)
- x86_64-gnu-llvm-19-1: 3885.0s -> 3371.7s (-13.2%)
- dist-apple-various: 5976.0s -> 6751.4s (13.0%)
- i686-gnu-2: 6240.6s -> 5491.8s (-12.0%)
- i686-gnu-1: 8172.5s -> 7200.2s (-11.9%)
- x86_64-gnu-llvm-20-1: 3763.4s -> 3380.9s (-10.2%)

How to interpret the job duration changes?
Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.
Finished benchmarking commit (a9f2aad): comparison URL.
Overall result: no relevant changes - no action needed
@rustbot label: -perf-regression
Instruction count
This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage)
Results (primary 4.8%, secondary 5.1%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.8% | [4.8%, 4.8%] | 1 |
| Regressions ❌ (secondary) | 5.1% | [2.3%, 7.2%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.8% | [4.8%, 4.8%] | 1 |
Cycles
Results (secondary -2.3%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.3% | [-2.4%, -2.2%] | 2 |
| All ❌✅ (primary) | - | - | 0 |
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 465.766s -> 465.348s (-0.09%)
Artifact size: 374.54 MiB -> 374.58 MiB (0.01%)
Labels
- Area: rustc-dev-guide
- This PR was explicitly merged by bors.
- Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
- Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)