Use BOLT in CI to optimize librustc_driver
by Kobzol · Pull Request #102487 · rust-lang/rust (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation93 Commits4 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
rustbot added T-bootstrap
Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Relevant to the infrastructure team, which will review and decide on the PR/issue.
labels
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
⌛ Trying commit 543c574ff7b86b3e8641a701d73f0b0c90b85a5d with merge 23a2a394654d7fe2fe4228351ed8485dca6aec98...
bors added the S-waiting-on-author
Status: This is awaiting some action (such as code changes or more information) from the author.
label
This comment has been minimized.
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
⌛ Trying commit 799dd5fc0c85929b600f677e5a1df07c7df3343c with merge 9f86f5f6789c609097d1af2b3c7fb1762d175dc7...
☀️ Try build successful - checks-actions
Build commit: 9f86f5f6789c609097d1af2b3c7fb1762d175dc7 (9f86f5f6789c609097d1af2b3c7fb1762d175dc7
)
Finished benchmarking commit (9f86f5f6789c609097d1af2b3c7fb1762d175dc7): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged
along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.
@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
mean1 | range | count2 | |
---|---|---|---|
Regressions ❌ (primary) | 2.2% | [0.2%, 6.9%] | 13 |
Regressions ❌ (secondary) | 1.8% | [0.2%, 6.2%] | 75 |
Improvements ✅ (primary) | - | - | 0 |
Improvements ✅ (secondary) | -2.0% | [-2.0%, -2.0%] | 1 |
All ❌✅ (primary) | 2.2% | [0.2%, 6.9%] | 13 |
Max RSS (memory usage)
Results
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
mean1 | range | count2 | |
---|---|---|---|
Regressions ❌ (primary) | 3.1% | [1.0%, 6.4%] | 137 |
Regressions ❌ (secondary) | 4.0% | [0.8%, 7.7%] | 141 |
Improvements ✅ (primary) | -3.7% | [-9.7%, -0.8%] | 29 |
Improvements ✅ (secondary) | -5.1% | [-12.6%, -1.5%] | 61 |
All ❌✅ (primary) | 1.9% | [-9.7%, 6.4%] | 166 |
Cycles
Results
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
mean1 | range | count2 | |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | 5.3% | [4.9%, 5.6%] | 3 |
Improvements ✅ (primary) | -2.6% | [-3.8%, -1.5%] | 28 |
Improvements ✅ (secondary) | -3.1% | [-5.8%, -2.2%] | 9 |
All ❌✅ (primary) | -2.6% | [-3.8%, -1.5%] | 28 |
Footnotes
The cycle results are not bad, but nothing too impresive sadly. Maybe this isn't really worth it. I think that the Google team that tried this has similar experiences. I wonder if BOLT is less effective for Rust libraries vs C/C++ in general.
Another reason might be that BOLT is more effective for binaries than for shared libraries. I'll try to statically link rustc and apply BOLT to it, to see what happens.
Another reason might be that BOLT is more effective for binaries than for shared libraries. I'll try to statically link rustc and apply BOLT to it, to see what happens.
I'd love to see results for statically linking rustc in general, to see how much benefit that provides.
There are results of a few different variants involving static linking in #97154
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
bors added a commit to rust-lang-ci/rust that referenced this pull request
Use BOLT in CI to optimize librustc_driver
Based on rust-lang#94381.
r? @ghost
☀️ Try build successful - checks-actions
Build commit: 8debfc8 (8debfc87c56cd261062f8dc42431dbc18d6f9472
)
This comment has been minimized.
Finished benchmarking commit (8debfc8): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression
Instruction count
This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage)
Results
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
mean | range | count | |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | - | - | 0 |
Improvements ✅ (primary) | -3.5% | [-3.5%, -3.5%] | 1 |
Improvements ✅ (secondary) | -1.7% | [-2.0%, -1.4%] | 2 |
All ❌✅ (primary) | -3.5% | [-3.5%, -3.5%] | 1 |
Cycles
This benchmark run did not return any relevant results for this metric.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 629.545s -> 630.771s (0.19%)
Artifact size: 317.31 MiB -> 354.81 MiB (11.82%)
This comment has been minimized.
bors added a commit to rust-lang-ci/rust that referenced this pull request
Use BOLT in CI to optimize librustc_driver
Based on rust-lang#94381.
r? @ghost
☀️ Try build successful - checks-actions
Build commit: 9928903 (992890374d3eb8d36c30c7d4eea8fe911163d2f6
)
This comment has been minimized.
Finished benchmarking commit (9928903): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged
along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.
@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
mean | range | count | |
---|---|---|---|
Regressions ❌ (primary) | 5.9% | [5.6%, 6.2%] | 4 |
Regressions ❌ (secondary) | 2.2% | [0.3%, 5.4%] | 60 |
Improvements ✅ (primary) | - | - | 0 |
Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 1 |
All ❌✅ (primary) | 5.9% | [5.6%, 6.2%] | 4 |
Max RSS (memory usage)
Results
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
mean | range | count | |
---|---|---|---|
Regressions ❌ (primary) | 1.6% | [0.7%, 2.0%] | 11 |
Regressions ❌ (secondary) | 2.9% | [2.4%, 3.7%] | 13 |
Improvements ✅ (primary) | -2.7% | [-6.4%, -0.4%] | 44 |
Improvements ✅ (secondary) | -3.5% | [-8.9%, -0.5%] | 132 |
All ❌✅ (primary) | -1.8% | [-6.4%, 2.0%] | 55 |
Cycles
Results
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
mean | range | count | |
---|---|---|---|
Regressions ❌ (primary) | - | - | 0 |
Regressions ❌ (secondary) | - | - | 0 |
Improvements ✅ (primary) | -1.8% | [-2.6%, -0.9%] | 61 |
Improvements ✅ (secondary) | -2.1% | [-2.4%, -1.7%] | 9 |
All ❌✅ (primary) | -1.8% | [-2.6%, -0.9%] | 61 |
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 628.193s -> 626.601s (-0.25%)
Artifact size: 273.37 MiB -> 309.08 MiB (13.06%)
bors added a commit to rust-lang-ci/rust that referenced this pull request
Optimize librustc_driver.so
with BOLT
This PR optimizes librustc_driver.so
on 64-bit Linux CI with BOLT.
Code
One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to:
- Only pass it when we actually plan to use BOLT. How to best do that?
config.toml
entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done byopt-dist
, therefore bootstrap doesn't know about it by default. - Only pass it to
librustc_driver.so
(see performance below).
Some discussion of this flag already happened on Zulip.
Performance
Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?).
Summary:
- ✔️
-1.8%
mean improvement in cycle counts across many primary benchmarks. - ✔️
-1.8%
mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of
librustc_driver.so
.- This is caused by building
librustc_driver.so
with relocations (which are required for BOLT). Hopefully, it will be fixed in the future with BOLT improvements, but now trying to reduce this size increase is tricky. - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year.
- This is caused by building
- ✖️ 1.4 MiB (+53%) artifact size regression of
rustc
.- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
librustc_driver.so
(where they are needed) and forrustc
(where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstraprustc
shim to only apply the relocation flag for the shared library and not forrustc
.
- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
CI time
CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and librustc_driver
BOLT profile gathering at the same time (now they are gathered separately for LLVM and librustc_driver
).
r? @Mark-Simulacrum
Also CC @onur-ozkan,
primarily for the bootstrap linker flag issue.
bors added a commit to rust-lang-ci/rust that referenced this pull request
Optimize librustc_driver.so
with BOLT
This PR optimizes librustc_driver.so
on 64-bit Linux CI with BOLT.
Code
One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to:
- Only pass it when we actually plan to use BOLT. How to best do that?
config.toml
entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done byopt-dist
, therefore bootstrap doesn't know about it by default. - Only pass it to
librustc_driver.so
(see performance below).
Some discussion of this flag already happened on Zulip.
Performance
Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?).
Summary:
- ✔️
-1.8%
mean improvement in cycle counts across many primary benchmarks. - ✔️
-1.8%
mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of
librustc_driver.so
.- This is caused by building
librustc_driver.so
with relocations (which are required for BOLT). Hopefully, it will be fixed in the future with BOLT improvements, but now trying to reduce this size increase is tricky. - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year.
- This is caused by building
- ✖️ 1.4 MiB (+53%) artifact size regression of
rustc
.- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
librustc_driver.so
(where they are needed) and forrustc
(where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstraprustc
shim to only apply the relocation flag for the shared library and not forrustc
.
- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
CI time
CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and librustc_driver
BOLT profile gathering at the same time (now they are gathered separately for LLVM and librustc_driver
).
r? @Mark-Simulacrum
Also CC @onur-ozkan,
primarily for the bootstrap linker flag issue.
bors added a commit to rust-lang-ci/rust that referenced this pull request
…ulacrum
Optimize librustc_driver.so
with BOLT
This PR optimizes librustc_driver.so
on 64-bit Linux CI with BOLT.
Code
One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to:
- Only pass it when we actually plan to use BOLT. How to best do that?
config.toml
entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done byopt-dist
, therefore bootstrap doesn't know about it by default. - Only pass it to
librustc_driver.so
(see performance below).
Some discussion of this flag already happened on Zulip.
Performance
Latest perf. results can be found [here](rust-lang#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?).
Summary:
- ✔️
-1.8%
mean improvement in cycle counts across many primary benchmarks. - ✔️
-1.8%
mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of
librustc_driver.so
.- This is caused by building
librustc_driver.so
with relocations (which are required for BOLT). Hopefully, it will be fixed in the future with BOLT improvements, but now trying to reduce this size increase is tricky. - Note that the size of this file was recently reduced in rust-lang#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year.
- This is caused by building
- ✖️ 1.4 MiB (+53%) artifact size regression of
rustc
.- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
librustc_driver.so
(where they are needed) and forrustc
(where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstraprustc
shim to only apply the relocation flag for the shared library and not forrustc
.
- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
CI time
CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and librustc_driver
BOLT profile gathering at the same time (now they are gathered separately for LLVM and librustc_driver
).
r? @Mark-Simulacrum
Also CC @onur-ozkan,
primarily for the bootstrap linker flag issue.
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request
Optimize librustc_driver.so
with BOLT
This PR optimizes librustc_driver.so
on 64-bit Linux CI with BOLT.
Code
One thing that's not clear yet to me how to resolve is how to best pass a linker flag that we need for BOLT (the second commit). It is currently passed unconditionally, which is not a good idea. We somehow have to:
- Only pass it when we actually plan to use BOLT. How to best do that?
config.toml
entry? Environment variable? CLI flag for bootstrap? BOLT optimization is done byopt-dist
, therefore bootstrap doesn't know about it by default. - Only pass it to
librustc_driver.so
(see performance below).
Some discussion of this flag already happened on Zulip.
Performance
Latest perf. results can be found [here](rust-lang/rust#102487 (comment)). Note that instruction counts are not very interesting here, there are only regressions on hello world programs. Probably caused by a larger C++ libstd (?).
Summary:
- ✔️
-1.8%
mean improvement in cycle counts across many primary benchmarks. - ✔️
-1.8%
mean Max-RSS improvement. - ✖️ 34 MiB (+48%) artifact size regression of
librustc_driver.so
.- This is caused by building
librustc_driver.so
with relocations (which are required for BOLT). Hopefully, it will be fixed in the future with BOLT improvements, but now trying to reduce this size increase is tricky. - Note that the size of this file was recently reduced in rust-lang/rust#115554 by pretty much the same amount (33 MiB). So the size after this PR is basically the same as it was for the last ~year.
- This is caused by building
- ✖️ 1.4 MiB (+53%) artifact size regression of
rustc
.- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
librustc_driver.so
(where they are needed) and forrustc
(where they aren't needed), since both are built with a single cargo invocation. We might need e.g. some tricks in the bootstraprustc
shim to only apply the relocation flag for the shared library and not forrustc
.
- This is annoying and pretty much unnecessary. It is caused by the way relocations are currently applied in this PR, because they are applied both to
CI time
CI (try build) got slower by ~5 minutes, which is fine, IMO. It can be further reduced by running LLVM and librustc_driver
BOLT profile gathering at the same time (now they are gathered separately for LLVM and librustc_driver
).
r? @Mark-Simulacrum
Also CC @onur-ozkan,
primarily for the bootstrap linker flag issue.
Labels
Area: The testsuite used to check the correctness of rustc
Performance regression.
Status: This is awaiting some action (such as code changes or more information) from the author.
Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Relevant to the infrastructure team, which will review and decide on the PR/issue.