Implement -Zgcc-ld=lld stabilization MCP by lqd · Pull Request #96401 · rust-lang/rust
Use lld by default on x86_64-unknown-linux-gnu stable
This PR and stabilization report is joint work with @Kobzol.
Use LLD on x86_64-unknown-linux-gnu by default, and stabilize -Clinker-features=-lld and -Clink-self-contained=-linker
This PR proposes making LLD the default linker on the x86_64-unknown-linux-gnu target for the artifacts we distribute, and also stabilizing the -Clinker-features=-lld and -Clink-self-contained=-linker codegen options to make it possible to opt out.
LLD has been used as the default linker on nightly and CI on this target since May 2024 (PR, blog post), and it seems like it is working fine, so we would like to propose stabilizing it.
The main motivation for using LLD instead of the default BFD linker is improving compilation times. For example, in the linked benchmark, it makes incremental recompilation of ripgrep in debug mode more than twice as fast. Another benefit is that Rust compilation becomes more consistent and self-contained, because we will use a known version of the LLD linker, rather than "whatever GNU ld version is on the user's system".
Due to the performance benefit being so huge, many people already opt into using LLD (or other fast linkers, such as mold) using various approaches (1, 2, 3, 4). By making LLD the default linker on the x86_64-unknown-linux-gnu target, we will be able to speed up Rust compilation out of the box, without users having to opt in or know about it.
You can find an extended version of this stabilization report which includes analysis of crater results and more data here.
What is being stabilized
- rust-lld being used as the default linker on the x86_64-unknown-linux-gnu target.
  - Note that rust-lld is being enabled by default in the compiler artifacts distributed by our CI/rustup. It is still possible to use the system linker by default using rust.lld = false in bootstrap.toml, which can be useful e.g. for some Linux distros that might not want to use the LLD we distribute.
  - This is done by activating the LLD linker feature and using the self-contained linker on that target. Both are also usable on the CLI, if some opt-outs are necessary, as described below.
- -Clinker-features=-lld on the x86_64-unknown-linux-gnu target. This codegen option tells rustc to disable using the LLD linker.
  - Note that other options for this codegen flag (cc) remain unstable.
  - Note that only the opt-out is being stabilized, and only for x86_64-unknown-linux-gnu: opting in, or using the flag on other targets would still need to pass -Zunstable-options.
  - This flag is being stabilized so that users can opt out of LLD on stable, which would in turn also opt out of using the self-contained linker (since it's an LLD).
- -Clink-self-contained=-linker. This codegen option controls whether rustc uses the self-contained linker (the -linker form disables it). It's not particularly useful to turn it on by itself, but when enabled and combined with -Clinker-features=+lld, rustc will use the rust-lld linker wrapper shipped with the compiler toolchain, instead of some lld binary that the linker driver will find in the PATH.
  - Note that other options for this codegen flag (other than the previously stable y/yes/n/no) remain unstable.
  - Note that only the opt-out is being stabilized, and only for x86_64-unknown-linux-gnu: opting in, or using this flag on other targets would still need to pass -Zunstable-options.
  - This flag is being stabilized so that users can opt out of using self-contained linking on stable. Doing this would then fall back to using the system lld.
To opt out of using LLD, RUSTFLAGS="-Clinker-features=-lld" would be used. To opt out of using rust-lld, falling back to the LLD installed on the system, RUSTFLAGS="-Clink-self-contained=-linker" would be used.
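As a minimal sketch of the two opt-outs, the snippet below appends a flag to RUSTFLAGS instead of overwriting it, so that flags which are already set survive. The helper name append_rustflag is hypothetical; only the two flags themselves come from this report.

```shell
# Hypothetical helper: append one flag to RUSTFLAGS, keeping existing flags.
append_rustflag() {
    RUSTFLAGS="${RUSTFLAGS:+$RUSTFLAGS }$1"
    export RUSTFLAGS
}

# Opt out of LLD entirely and go back to the default system linker.
append_rustflag "-Clinker-features=-lld"

# Alternative: keep LLD, but use the system's lld instead of the bundled rust-lld.
# append_rustflag "-Clink-self-contained=-linker"

echo "RUSTFLAGS=$RUSTFLAGS"
# Then build as usual inside a Cargo project, e.g.: cargo build
```

Appending rather than assigning matters when a project or CI setup already exports RUSTFLAGS, since a plain assignment would silently drop those flags.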
Tests
When enabling rust-lld on nightly, we also switched x64 Linux to use it at stage >= 1, meaning that all tests have been running with lld since May 2024, on CI as well as on contributors' machines. (Post opt-dist tests had also been using it when running their test subset earlier than that.)
There are also a few tests dedicated to the CLI behavior, or ensuring the default linker is indeed the one we expect:
- link-self-contained-consistency: Checks that -Clink-self-contained options are not inconsistent (i.e. that passing both +linker and -linker is an error).
- link-self-contained-unstable: Checks that only the -linker and y/yes/n/no options for -Clink-self-contained are stable.
- linker-features-unstable-cc: Checks that only the non-lld options of -Clinker-features are unstable.
- linker-features-lld-disallowed: Checks that -Clinker-features=-lld is only stable on x86_64-unknown-linux-gnu.
- link-self-contained-linker-disallowed: Checks that -Clink-self-contained=-linker is only stable on x86_64-unknown-linux-gnu.
- no-gc-encapsulation-symbols: Checks that linker encapsulation symbols are not garbage collected by LLD, so that crates like linkme still work.
- rust-lld: Checks that LLD is actually used when enabled with -Clinker-features=+lld and -Clink-self-contained=+linker.
- rust-lld-x86_64-unknown-linux-gnu: Checks that LLD is used by default on x86_64-unknown-linux-gnu when the bootstrap rust.lld config option is true.
- rust-lld-x86_64-unknown-linux-gnu-dist: Dist test that checks that our distributed x86_64-unknown-linux-gnu archives actually use LLD by default.
Ecosystem impact
As already stated, LLD has been used as the default linker on x64 Linux on nightly for almost a year, and we haven't seen any blockers to stabilization in that time. A handful of issues were reported; these are discussed below.
Furthermore, two crater runs (November 2023, February 2025) were performed to test the impact of using LLD as the default linker. A triage of the earlier crater run was previously done here, but all the important findings from both crater runs are reported below.
Below is a list of compatibility differences between BFD and LLD that we have encountered. There is a more thorough list of differences in this post from the current LLD maintainer. From that post, "99.9% pieces of software work with ld.lld without a change".
.ctors/.dtors sections
#128286 reported an issue where LLD was unable to link a certain CUDA library that was using the .ctors/.dtors ELF sections. These were deprecated a long time ago (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770) and replaced with the more modern .init_array/.fini_array sections. LLD doesn't (and won't) support these sections (1, 2), so if they appear in input object files, the linked artifact might behave incorrectly, because e.g. some global variables might not get initialized properly.
However, the usage of .ctors/.dtors should be very rare in practice. We performed a crater run to test this, and it identified only 8 crates where a .ctors/.dtors section occurs in the final linked artifact. These cases were caused by a few crates using the .ctors link section manually, and by the use of a very old (~6 years) version of the ctor crate.
Possible workaround
It is possible to detect whether a .ctors/.dtors section is present in the final linked artifact (LLD will keep it there, but it won't be populated), and warn users about it. This check is very cheap and doesn't even appear on perf (see the comment on #112049). We have benchmarked the check on a 240 MiB Chrome binary, where it took 0.8ms with page cache flushed, and 0.06ms with page cache primed (which should be the common case, as the linked artifact is written to disk just before the check is performed).
In theory, this could also be solved with a linker script that moves .ctors to .init_array.
We think that these sections should be so rare that it is not worth it to implement any workarounds for now.
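For users who want to check an artifact by hand, a rough approximation of such a detection can be sketched with readelf from binutils. This is not the check that was benchmarked above, only an illustration; has_legacy_ctor_sections is a hypothetical helper name and the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: succeed if the ELF file lists a .ctors or .dtors section.
# Relies on `readelf` (binutils) being installed.
has_legacy_ctor_sections() {
    readelf -S -W "$1" | grep -E '[[:space:]]\.(ctors|dtors)[[:space:]]' > /dev/null
}

# Usage (the path is a placeholder):
#   if has_legacy_ctor_sections target/debug/app; then
#       echo "warning: .ctors/.dtors section found in the linked artifact" >&2
#   fi
```

The pattern only matches the exact section names as printed in the section header table, so names such as .init_array are not flagged.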
Different garbage collection behavior
#130397 reported an issue where LLD prunes a local symbol, so it is missing in the linked artifact. However, BFD keeps the same symbol, so it is a regression. This is caused by a difference in linker garbage collection.
Rust uses --gc-sections and puts each function into a separate linker section, which lets the linker prune unused code. There is some code (specifically the somewhat popular linkme crate) that (arguably ab-)uses so-called linker encapsulation symbols to achieve distributed slices.
BFD (2.37+) uses a conservative linking mode that works as intended with this behavior, but it might slightly increase binary size of the linked artifact. LLD does not use this workaround by default, which causes the sections to be eliminated, but it can be made to use the conservative mode using -z nostart-stop-gc.
To avoid this issue, we told LLD to use the conservative mode, which maintains backwards compatibility with BFD. We found that it has no effect (see the comment on #112049) on compilation performance and binary size in our benchmark suite. With this change, linkme works. Since then, #140872 removed linkme distributed slice's dependence on conservative GC behavior, so this PR also removes that conservative mode: no transition period is necessary, as the PR immediately fixed the crate with no source changes.
Various uncommon issues
A small number of issues that only occurred in a handful of instances were found in crater, and it is unclear if LLD is at fault or if there is some other issue that was not detected with BFD.
You can examine these here.
Missing jobserver support
LLD doesn't support the jobserver protocol for limiting the number of threads used; it simply defaults to using all available cores, which is one of the reasons why it's faster than BFD. However, this should mostly be a non-issue, because most of the linking done during high parallelism sections of cargo build is linking of build scripts and proc macros, which are typically very fast to link (e.g. ~50ms), and a potential oversubscription of cores thus doesn't hurt that much.
When the final artifact is linked (which typically takes the most time), there should be no other sources of parallelism conflicts from compiling other code, so LLD should be able to use all available threads.
That being said, it is a difference in behavior: previously, a build with a -j flag generally did not use more CPU than the specified limit. It can be impactful on some resource-constrained systems, but to be clear, that is already the case today due to cargo parallelism. This could be one reason to opt out of using rust-lld on some systems.
LLD has support for limiting the number of threads to use, so in theory rustc could try to get all the jobserver tokens available and use that as lld's thread limit. It'd still be suboptimal as new tokens would not be dynamically detected, and we could be using fewer threads than available.
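Until something like that exists, a user on a resource-constrained system could cap LLD's threads manually. The sketch below assumes a typical setup where rustc links through a cc-style driver, so the linker option is forwarded with -Clink-arg and the -Wl, prefix; --threads is ld.lld's own option and is not mentioned elsewhere in this report, and lld_threads_flag is a hypothetical helper name.

```shell
# Hypothetical helper: print a rustc flag that caps LLD's thread count.
# Rejects anything that is not a positive integer without leading zeros.
lld_threads_flag() {
    case "$1" in
        ''|*[!0-9]*|0*)
            echo "thread count must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "-Clink-arg=-Wl,--threads=$1"
}

# Example: limit LLD to 2 threads and show the resulting build command.
flag=$(lld_threads_flag 2) && echo "RUSTFLAGS=\"$flag\" cargo build"
```

This is a static limit, so it shares the drawback described above: it does not adapt to how many jobserver tokens are actually free at link time.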
We did a benchmark on a real-world crate that shows that using multiple LLD threads for intermediate artifacts doesn't seem to have a performance effect. You can find it here.
Opting out of LLD in the ecosystem
We have also examined repositories where people opted out of LLD on nightly, using this GitHub query. The summary can be found below:
Summary of LLD opt outs
This examination was performed on 2025-03-09.
Here we briefly examine the most common reasons why people use -Zlinker-features=-lld, based on comments and git history.
- Nix/NixOS (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
- There was an issue with LLD, which seems to have been fixed with NixOS/nixpkgs#314268. It's unclear whether that fixed all the Nix issues though.
- Issues with linkme (1, 2, 3, 4, 5, 6, 7)
- These should be resolved with the conservative garbage collection (#137685).
- Bazel (1), WASM (1, 2), uncategorized (2, 3, 4, 5, 6)
- Reason unclear.
History
The idea to use a faster linker by default has been on the radar for quite some time (#39915, #71515). There were very early attempts to use the gold linker by default, but these had to be reverted because of compatibility issues. Support for LLD was implemented back in 2017, but it has not been made default yet, except for some more niche targets, such as WASM, ARM Cortex or RISC-V.
It took quite some time to figure out what the interface for selecting the linker (and the way it is invoked) should look like, as it differs a lot between different platforms, linkers and compiler drivers. During that time, LLD has matured and achieved almost perfect compatibility with the default Linux linker (BFD).
- #56351 stabilized -Clinker-flavor, which is used to determine how to invoke the linker. It is especially useful on targets where selecting the linker directly with -Clinker is not possible or is impractical.
  - December 2018, author @davidtwco, reviewer @nagisa
- #76158 stabilized -Clink-self-contained=[y|n], which allows overriding the compiler's heuristic for deciding whether it should use self-contained or external tools (linker, sanitizers, libc, etc.). It only allowed using the self-contained mode either for everything (y) or nothing (n), but did not allow granular choice.
  - September 2020, author @mati864, reviewer @petrochenkov
- #85961 implemented the -Zgcc-ld flag, which was a hacky way of opting into LLD usage.
  - June 2021, author @sledgehammervampire, reviewer @petrochenkov
- MCP 510 proposed stabilizing the behavior of -Zgcc-ld using more granular flags (-Clink-self-contained=linker -Clinker-flavor=gcc-lld).
  - Initially implemented in #96827, but @petrochenkov suggested a slightly different approach (see the comment on #96827).
  - The PR was split into #96884, where it was decided what will be the individual components of -Clink-self-contained=linker.
  - And #96401, which implemented the -Clinker-flavor part.
  - The MCP was finally implemented in #112910.
  - #116514 then removed -Zgcc-ld, as it was replaced by -Clinker-flavor=gnu-lld-cc + -Clink-self-contained=linker.
  - April 2022 - October 2023, author @lqd, reviewer @petrochenkov
- Various linker handling refactorings were performed in the meantime: #97375, #98212, #100126, #100552, #102836, #110807, #101988, #116515
- The implementation of linker flavors with LLD was causing a sort of a combinatorial explosion of various options. #119906 suggested a different approach for linker flavors (described in a comment on #119906), where the individual flavors could be enabled separately using +/- (e.g. +lld).
  - After some back and forth, this idea was moved to -Clinker-features (see comment 1 and comment 2 on #119906), which was implemented in #123656.
  - April 2024, author @lqd, reviewer @petrochenkov
- #124129 enabled LLD by default on nightly.
  - April 2024, author @lqd, reviewer @petrochenkov
- #137685, #137926 enabled the conservative garbage collection mode (-znostart-stop-gc) to improve compatibility with BFD.
  - February 2025, author @lqd, reviewer @petrochenkov (implementation), author @kobzol, reviewer @lqd (test)
- #96025 (April 2022), #117684 (November 2023), #137044 (February 2025): crater runs.
Unresolved questions/concerns
- Is changing the linker considered a breaking change? In (hopefully very rare) cases, it might break some existing code. It should mostly only affect the final linked artifact, so it should be easy to opt out.
- Similarly, is the single-threaded behavior of such tools encompassed in our stability guarantee? It can be observed via the -j job limit (though I believe we have/had some open issues on sometimes using more CPU resources than the job count limit implied). As mentioned above, LLD does not support the jobserver protocol.
- A concern was raised (see the comment on #71515) about increased memory usage of LLD. We should probably let users know about the possibly increased memory usage, and jobserver incompatibility: we did so when announcing this landing on nightly.
- LLD seems to produce slightly larger binary artifacts. This can be partially clawed back using Identical Code Folding (-Clink-args=-Wl,--icf=all).
- Should we detect the outdated .ctors/.dtors sections to provide a better error message, even if that should be rare in practice?
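Regarding the binary size item above, one way to check whether Identical Code Folding actually claws back size for a given project is to compare the artifact sizes of two builds. A rough sketch follows; file_size and size_delta are hypothetical helpers, and the binary name and paths in the usage comment are placeholders.

```shell
# Hypothetical helper: print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d '[:space:]'
}

# Print (size of $2) - (size of $1) in bytes; negative means $2 is smaller.
size_delta() {
    echo $(( $(file_size "$2") - $(file_size "$1") ))
}

# Usage inside a Cargo project (binary name "app" is a placeholder):
#   cargo build --release && cp target/release/app /tmp/app-default
#   RUSTFLAGS="-Clink-args=-Wl,--icf=all" cargo build --release
#   size_delta /tmp/app-default target/release/app
```

A negative result in the usage above would mean the build with ICF enabled produced a smaller artifact than the default LLD build.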
Next steps
After the FCP completes:
- we should land this PR at the beginning of a beta cycle, to maximize time for testing
- keep an eye on the beta crater run results for possible linker issues (or do a dedicated beta crater run with only this change)
- release a blog post announcing the change, and asking for testing feedback of the appropriate beta
- depending on feedback, or if a period of testing of 6 weeks is not long enough, we could keep this change on beta for another cycle
Development, testing, try builds were done in #138645.
r? @petrochenkov
@rustbot label +needs-fcp +T-compiler
try-job: aarch64-gnu
try-job: i686-gnu-*