CI: Add LTO support to clang in dist-x86_64-linux by clubby789 · Pull Request #134690 · rust-lang/rust


clubby789

After rust-lang/cc-rs#1279, we attempt to pass -flto=thin to clang. In dist-x86_64-linux, we don't build clang with the LLVMgold.so library, so this fails. This PR attempts to resolve that.
First, pass the binutils plugin include directory to the Clang build, which makes it build the library.
Second, this library depends on the version of libstdc++ that we built specifically. However, despite both the RPATH and LD_LIBRARY_PATH pointing to /rustroot/lib, the loader incorrectly resolves to the system libstdc++, which fails to load.

# LD_DEBUG=libs,files
      2219:    file=libstdc++.so.6 [0];  needed by /rustroot/bin/../lib/LLVMgold.so [0]
      2219:    find library=libstdc++.so.6 [0]; searching
      2219:     search path=/rustroot/bin/../lib/../lib        (RPATH from file /rustroot/bin/../lib/LLVMgold.so)
      2219:      trying file=/rustroot/bin/../lib/../lib/libstdc++.so.6
      2219:     search path=/usr/lib64/tls:/usr/lib64        (system search path)
      2219:      trying file=/usr/lib64/tls/libstdc++.so.6
      2219:      trying file=/usr/lib64/libstdc++.so.6

Using LD_PRELOAD causes the correct library to be loaded.

I think this is probably not the most maintainable approach, so I'm opening this to see whether it is desired and whether there is a better way of doing it.
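For readers unfamiliar with LD_DEBUG traces, here is a minimal sketch of how the output above can be read mechanically. It only parses the lines quoted in this description; the helper name `tried_paths` is hypothetical and not part of the PR.

```python
# Sample lines copied from the LD_DEBUG=libs,files trace in the PR description.
TRACE = """\
      2219:    file=libstdc++.so.6 [0];  needed by /rustroot/bin/../lib/LLVMgold.so [0]
      2219:    find library=libstdc++.so.6 [0]; searching
      2219:     search path=/rustroot/bin/../lib/../lib        (RPATH from file /rustroot/bin/../lib/LLVMgold.so)
      2219:      trying file=/rustroot/bin/../lib/../lib/libstdc++.so.6
      2219:     search path=/usr/lib64/tls:/usr/lib64        (system search path)
      2219:      trying file=/usr/lib64/tls/libstdc++.so.6
      2219:      trying file=/usr/lib64/libstdc++.so.6
"""


def tried_paths(trace: str, library: str) -> list[str]:
    """Return the candidate files the loader tried for `library`, in order."""
    paths = []
    searching = False
    for line in trace.splitlines():
        # Drop the "<pid>:" prefix that the loader puts on every debug line.
        _, _, body = line.partition(":")
        body = body.strip()
        if body.startswith("find library="):
            # Only collect candidates for the library we were asked about.
            searching = body.startswith(f"find library={library} ")
        elif searching and body.startswith("trying file="):
            paths.append(body[len("trying file="):])
    return paths


for path in tried_paths(TRACE, "libstdc++.so.6"):
    print(path)
```

The RPATH candidate is listed first, but the search continues into the system search path, and the last candidate is under /usr/lib64, which matches the behaviour described above.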

@rustbot

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added labels on Dec 23, 2024:

A-testsuite (Area: The testsuite used to check the correctness of rustc)
S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
T-infra (Relevant to the infrastructure team, which will review and decide on the PR/issue.)

@clubby789

(just to make sure this works on GH too)
@bors try

@bors

bors added a commit to rust-lang-ci/rust that referenced this pull request

Dec 23, 2024

@bors

CI: Add LTO support to clang in dist-x86_64-linux


@bors

☀️ Try build successful - checks-actions
Build commit: 489bb36 (489bb36d8e95310103206a99218960c6bb55bd35)

klensy

@Kobzol

So if I understand it correctly, the cc detection that checks whether the used C/C++ compiler supports LTO says that it doesn't on our CI, and therefore LTO isn't passed to the C/C++ code that we compile? What does gold have to do with that? And also why does it matter if we modify the compilation of the host LLVM? Don't we build our own in-tree LLVM using cmake, instead of cc? Or is this for building other C/C++ code in bootstrap?

It's interesting why the cc bump had such perf. wins if LTO e.g. for jemalloc didn't work. Could it actually be PGO that caused the wins? 🤔

@clubby789

The link in the PR explains some of this. Essentially, we use our compiled clang to build a few things, e.g. the C bindings for rustc_llvm.
As of the recent update, cc attempts to pass our LTO flags to clang, but currently the clang we build doesn't support LTO.

To enable it, we need to build LLVMgold.so, which is a plugin for gold. We do that by passing the binutils plugin include dir when building clang/llvm.

Now, I'm not sure why clang is using gold for LTO, given that I'd assume lld supports it. Perhaps some other clang configuration option is needed. But if you run clang -flto=thin test.c in the Docker container without this patch, it errors while searching for LLVMgold.
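As a rough illustration of the check described here, the sketch below tests whether a gold plugin sits next to a given clang install. It assumes the plugin lives at `<prefix>/lib/LLVMgold.so` relative to `<prefix>/bin/clang`, which is what the `/rustroot/bin/../lib/LLVMgold.so` path in the trace suggests; the function names are hypothetical.

```python
from pathlib import Path


def gold_plugin_path(clang_binary: str) -> Path:
    """Path where LLVMgold.so is assumed to sit for a clang at <prefix>/bin/clang."""
    # <prefix>/bin/clang -> <prefix>/lib/LLVMgold.so (assumed layout).
    return Path(clang_binary).parent.parent / "lib" / "LLVMgold.so"


def has_gold_plugin(clang_binary: str) -> bool:
    """True if the assumed plugin file exists on disk."""
    return gold_plugin_path(clang_binary).is_file()


print(gold_plugin_path("/rustroot/bin/clang"))
```

If `has_gold_plugin` returns False for the CI clang, that would be consistent with the LLVMgold error described above; it does not by itself confirm that LTO works once the file exists.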

@Kobzol

@rust-timer

Finished benchmarking commit (489bb36): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-5.6%, -0.3%] | 277 |
| Improvements ✅ (secondary) | -0.9% | [-2.9%, -0.2%] | 238 |
| All ❌✅ (primary) | -0.7% | [-5.6%, -0.3%] | 277 |

Max RSS (memory usage)

Results (primary -2.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.5% | [-3.3%, -1.6%] | 3 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -2.5% | [-3.3%, -1.6%] | 3 |

Cycles

Results (primary -4.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -4.1% | [-7.6%, -1.2%] | 11 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -4.1% | [-7.6%, -1.2%] | 11 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 763.12s -> 761.61s (-0.20%)
Artifact size: 330.55 MiB -> 336.31 MiB (1.74%)
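The two summary percentages above are plain relative changes. A minimal sketch of the arithmetic, using only the numbers quoted in this comment (the helper name `percent_change` is hypothetical):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


bootstrap = percent_change(763.12, 761.61)  # seconds
artifact = percent_change(330.55, 336.31)   # MiB

print(f"Bootstrap: {bootstrap:+.2f}%")      # Bootstrap: -0.20%
print(f"Artifact size: {artifact:+.2f}%")   # Artifact size: +1.74%
```

So the bootstrap time is essentially unchanged, while the artifact size grows by a little under 6 MiB.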

@clubby789

@Kobzol

Well, it certainly seems like it works, and LTO is now enabled for something.

@clubby789

Profiling locally, it seemed like it was mostly jemalloc.

@Kobzol

That seems consistent with the last perf. result, and it also makes the most sense: it's probably the most perf. sensitive C/C++ thing that we build (outside of LLVM, of course, but that shouldn't be affected by cc).

@klensy

rustc ? | Binary | 2.16 MiB | 5.03 MiB | 2.87 MiB | 133.336%
rustdoc ? | Binary | 15.52 MiB | 18.39 MiB | 2.87 MiB | 18.506%

Almost +3 MB in size for rustc/rustdoc is weird. Are they stripped? (No.)

The .text section is twice as big, and the debug sections grew too.

Weird: the new rustc binary doesn't have our _rjem_-prefixed jemalloc symbols, but it still has the correct version string 5.3.0-1-ge13ca993e8ccb9ba9847cc330696e02839f328f7.

Okay, but why were prefixes used before?

features = ['unprefixed_malloc_on_supported_platforms']

Kobzol

@Kobzol

┌─────────────────────┬───────────────┬──────────────┬──────────┬──────────┐
│ Sections            │ Size (before) │ Size (after) │     Diff │ Diff (%) │
├─────────────────────┼───────────────┼──────────────┼──────────┼──────────┤
│ .debug_info         │    800.97 KiB │     2.12 MiB │ +1399669 │  +170.7% │
│ .debug_loclists     │    384.40 KiB │     1.14 MiB │  +799812 │  +203.2% │
│ .debug_line         │    236.25 KiB │   559.95 KiB │  +331469 │  +137.0% │
│ .text               │    216.98 KiB │   477.87 KiB │  +267157 │  +120.2% │
│ .debug_rnglists     │     62.10 KiB │   187.06 KiB │  +127953 │  +201.2% │
│ .debug_addr         │     82.10 KiB │   174.95 KiB │   +95080 │  +113.1% │
│ .debug_str          │     85.41 KiB │    74.69 KiB │   -10979 │   -12.6% │
│ .debug_str_offsets  │    119.21 KiB │   129.04 KiB │   +10060 │    +8.2% │
│ .debug_abbrev       │     49.41 KiB │    55.84 KiB │    +6587 │   +13.0% │
│ .strtab             │     35.85 KiB │    30.68 KiB │    -5300 │   -14.4% │
│ .symtab             │     31.27 KiB │    28.05 KiB │    -3288 │   -10.3% │
│ .eh_frame           │     30.61 KiB │    27.80 KiB │    -2868 │    -9.2% │
│ .relro_padding      │      1.28 KiB │        320 B │     -992 │   -75.6% │
│ .eh_frame_hdr       │      4.98 KiB │     4.21 KiB │     -792 │   -15.5% │
│ .debug_line_str     │      2.43 KiB │     2.73 KiB │     +309 │   +12.4% │
│ .rodata             │     10.16 KiB │     9.99 KiB │     -176 │    -1.7% │
│ .rela.dyn           │     23.30 KiB │    23.16 KiB │     -144 │    -0.6% │
│ .bss                │      2.04 MiB │     2.04 MiB │      -79 │    -0.0% │
│ .data               │         568 B │        528 B │      -40 │    -7.0% │
│ .data.rel.ro        │     18.59 KiB │    18.57 KiB │      -16 │    -0.1% │
│ .got                │         104 B │         88 B │      -16 │   -15.4% │
│ <19 unchanged rows> │      9.82 KiB │     9.82 KiB │        0 │     0.0% │
│─────────────────────│───────────────│──────────────│──────────│──────────│
│ Total               │      4.19 MiB │     7.07 MiB │ +3013406 │   +68.5% │
└─────────────────────┴───────────────┴──────────────┴──────────┴──────────┘

There are a few weird things indeed. It really looks like the _rjem prefix is gone, but we shouldn't have been using it even before. The increase in .text seems kind of expected: in theory, the code can be bigger after LTO, and a bunch of stuff apparently got inlined into the jemalloc allocation functions.

It also seems like debuginfo somehow gets duplicated, but I have no idea why that is. That probably warrants further investigation.

The perf. results are nice enough that I don't think that we need to block this PR on that investigation, though.
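To put the section table in perspective, here is a small sketch that sums the byte diffs by category. The numbers are copied from the "Diff" column of the table above; the grouping into "debug" versus everything else is my own and only illustrative.

```python
# Byte diffs copied from the "Diff" column of the section table.
SECTION_DIFFS = {
    ".debug_info": 1399669,
    ".debug_loclists": 799812,
    ".debug_line": 331469,
    ".text": 267157,
    ".debug_rnglists": 127953,
    ".debug_addr": 95080,
    ".debug_str": -10979,
    ".debug_str_offsets": 10060,
    ".debug_abbrev": 6587,
    ".strtab": -5300,
    ".symtab": -3288,
    ".eh_frame": -2868,
    ".relro_padding": -992,
    ".eh_frame_hdr": -792,
    ".debug_line_str": 309,
    ".rodata": -176,
    ".rela.dyn": -144,
    ".bss": -79,
    ".data": -40,
    ".data.rel.ro": -16,
    ".got": -16,
}

total = sum(SECTION_DIFFS.values())
debug = sum(v for k, v in SECTION_DIFFS.items() if k.startswith(".debug_"))
debug_share = debug / total * 100.0
text_share = SECTION_DIFFS[".text"] / total * 100.0

print(f"total growth: {total} bytes")          # total growth: 3013406 bytes
print(f"debug sections: {debug_share:.1f}%")   # debug sections: 91.6%
print(f".text: {text_share:.1f}%")             # .text: 8.9%
```

The per-section diffs add up to the table's Total row, and the .debug_* sections account for roughly nine tenths of the growth, with .text making up most of the rest.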

@Kobzol

It would be nice to find out where these binary size regressions come from, but I don't think that we need to hold this PR on that. So unless you want to investigate further, you can r=me.

@clubby789

@lqd

Weren’t we stripping debuginfo from the driver?

@Kobzol

The debuginfo also seems to be in rustc, and I'm not sure whether that is expected. I tried stripping debuginfo from rustc and the resulting binary was 600 KiB! Maybe we could just strip it, but we'd need to check that ICEs don't regress.

@lqd

Dec 26, 2024

Maybe we didn't strip it because there wasn't a lot of debuginfo there then? I'm pretty sure we discussed this before but it wasn't that big of a win, and maybe these bins have ballooned in size since then (another use case for artifact size history and graph in rustc-perf). (update: it seems not, they've started at around 2.9 or so when we started recording artifact sizes, and have reduced since then...)

4 MB is huge for a single function + allocator override. But also I'm not sure what you were looking at: https://perf.rust-lang.org/compare.html?tab=artifact-size shows rustc at 2.5MiB (which is still huge) and this PR's at 5MiB. Maybe rerun your binary size command on its CI artifacts.

rustc's main is where we override the allocator to jemalloc, so it's not crazy that more of jemalloc's code and data would show up in rustc's binary -- like rustdoc, but the code is weirdly inside librustc and not the binary launcher; miri should be set up like rustc and will likely also see the same size increase. I don't know about clippy.

It could be a new difference due to the cc PR (which was a 10% size increase for rustc) that this PR would surface more, e.g. some config expectation mismatch. We should:

@klensy

Stats in jemalloc are weird: the config option disables only part of them, while other parts will still be present in the binary.

@clubby789

It seems like jemalloc (the C library) is unconditionally built with -g3: jemalloc/jemalloc#2333. If I modify the jemalloc configuration to not provide a -g3 flag, the produced static library drops from ~20 MB to ~1 MB.
I guess we could either make a PR to tikv-jemalloc-sys to make this configurable, or just strip -d our final artifacts (it looks like pretty much all the debuginfo is indeed from jemalloc).
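A minimal sketch of the second option, assuming a GNU binutils `strip` is on PATH. The wrapper functions are hypothetical and this is not how the follow-up PR necessarily does it; `-d` removes debugging symbols only.

```python
import shutil
import subprocess


def strip_debug_command(binary: str) -> list[str]:
    """Command line for removing only debug sections from `binary`."""
    return ["strip", "-d", binary]


def strip_debug(binary: str) -> None:
    """Run `strip -d` on `binary` in place; raises if strip is unavailable or fails."""
    if shutil.which("strip") is None:
        raise RuntimeError("strip not found on PATH")
    subprocess.run(strip_debug_command(binary), check=True)


print(" ".join(strip_debug_command("rustc-main")))
```

Stripping only debuginfo keeps the symbol table, which matters for the concern raised earlier about ICE backtraces, though whether backtraces stay useful would still need to be checked.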


@Kobzol

Ah, the build-gcc.sh script is used in more jobs than the x64 dist, I thought that we reverted that, but this refactoring was done sooner. Could you please add the environment variable to all Dockerfiles that use the script?

bors added a commit to rust-lang-ci/rust that referenced this pull request

Dec 27, 2024

@bors

Strip debuginfo from rustc-main and rustdoc

r? @Kobzol Split from rust-lang#134690

@clubby789

@clubby789

@bors

bors added a commit to rust-lang-ci/rust that referenced this pull request

Dec 27, 2024

@bors

CI: Add LTO support to clang in dist-x86_64-linux


try-job: dist-i686-linux

@bors

☀️ Try build successful - checks-actions
Build commit: 4ce5e49 (4ce5e497c590e4e03fea30dbd3612b609ed336a8)

@clubby789

Looks like i686 is okay now


@Kobzol

@bors

📌 Commit 9e57593 has been approved by Kobzol

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) on Dec 27, 2024.

@bors

@bors

@rust-timer

Finished benchmarking commit (ecc1899): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-5.5%, -0.3%] | 275 |
| Improvements ✅ (secondary) | -0.9% | [-2.6%, -0.2%] | 235 |
| All ❌✅ (primary) | -0.7% | [-5.5%, -0.3%] | 275 |

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (primary -2.4%, secondary -2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.4% | [-4.1%, -0.7%] | 2 |
| Improvements ✅ (secondary) | -2.4% | [-2.4%, -2.4%] | 1 |
| All ❌✅ (primary) | -2.4% | [-4.1%, -0.7%] | 2 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 763.214s -> 761.661s (-0.20%)
Artifact size: 325.15 MiB -> 325.61 MiB (0.14%)

poliorcetics pushed a commit to poliorcetics/rust that referenced this pull request

Dec 28, 2024

@bors @poliorcetics

Strip debuginfo from rustc-main and rustdoc

r? @Kobzol Split from rust-lang#134690

poliorcetics pushed a commit to poliorcetics/rust that referenced this pull request

Dec 28, 2024

@bors @poliorcetics

CI: Add LTO support to clang in dist-x86_64-linux


wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request

Feb 23, 2025

@he32

Pkgsrc changes relative to rust184:

Version 1.85.0 (2025-02-20)


tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request

Feb 26, 2025

@tmeijn

This MR contains the following updates:

Package Update Change
rust minor 1.84.1 -> 1.85.0


Labels

A-testsuite

Area: The testsuite used to check the correctness of rustc

merged-by-bors

This PR was explicitly merged by bors.

relnotes-perf

Performance improvements that should be mentioned in the release notes.

S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

T-bootstrap

Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)

T-infra

Relevant to the infrastructure team, which will review and decide on the PR/issue.