Formal support for linking rlibs using a non-Rust linker · Issue #73632 · rust-lang/rust
I'm working on a major existing C++ project which hopes to dip its toes into the Rusty waters. We:
- Use a non-Cargo build system with static dependency rules (and 40000+ targets)
- Sometimes build a single big binary; sometimes lots of shared objects, unit test executables, etc. - each containing various parts of our dependency tree.
- Perform final linking using an existing C++ toolchain (based on LLVM 11 as it happens)
- Want to have a few Rust components scattered throughout a very deep dependency tree, which may eventually roll up into one or multiple binaries
We can't:
- Switch from our existing linker to `rustc` for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
- Create a Rust `staticlib` for each of our Rust components. This works if we're using Rust in only one place. For any binary containing several Rust components, there would be binary bloat and potentially violations of the one-definition rule, caused by duplication of the Rust stdlib and any diamond dependencies.
- Create a single Rust `staticlib` containing all our Rust components, then link that into every binary. That monster static library would depend on many C++ symbols, which wouldn't be present in some circumstances.
We can either:
- Create a Rust `staticlib` for each of our output binaries, using `rustc` and an auto-generated `.rs` file containing lots of `extern crate` statements. Or,
- Pass the `rlib` for each Rust component directly into the final C++ linking procedure.
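To make the first option concrete, here is a minimal sketch of how a build step might generate that `.rs` file. The function name `umbrella_source` and the header comment it emits are hypothetical, chosen for illustration only; they are not part of any existing tool.

```rust
/// Build the contents of a hypothetical auto-generated "umbrella" source file
/// that names every Rust component needed by one output binary.
fn umbrella_source(crate_names: &[&str]) -> String {
    let mut out = String::from("// Auto-generated: do not edit.\n");
    for name in crate_names {
        // Rust identifiers cannot contain '-', so normalise it to '_'.
        out.push_str(&format!("extern crate {};\n", name.replace('-', "_")));
    }
    out
}
```

The build system would write this string to a file and hand it to `rustc` when building the per-binary `staticlib`.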
The first approach is officially supported, but is hard because:
- We need to create a Rust `staticlib` as part of our C++ tool invocations. This is awkward in our build system. Our C++ targets don't keep track of Rust compiler flags (`--target`, etc.) and in general it just feels weird to be doing Rust stuff in C++ targets.
- Specifically, we need to invoke a Python wrapper script to consider invoking `rustc` to make a `staticlib` for every single one of our C++ link targets. For most of our targets (especially unit test targets) there will be no `rlib`s in their dependency tree, so it will be a no-op. But the presence of this wrapper script will make Rust adoption appear intrusive, and of course will have some small actual performance cost.
- For those link targets which do include Rust code, we'll delay invocation of the main linker whilst we build a Rust static library.
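The decision that wrapper has to make can be sketched roughly as follows. This is written in Rust rather than Python purely for illustration, and `RustDep` and `staticlib_command` are hypothetical names. It assumes each Rust component is passed to `rustc` with `--extern name=path`; our real build rules may need more flags than shown.

```rust
/// One Rust dependency of a link target (hypothetical type for illustration).
struct RustDep {
    name: String, // crate name, as used in `extern crate`
    rlib: String, // path to the compiled rlib
}

/// Returns the rustc command line for building the per-target staticlib,
/// or None when the target has no Rust dependencies (the no-op case).
fn staticlib_command(
    deps: &[RustDep],
    umbrella_rs: &str,
    target: &str,
    output: &str,
) -> Option<Vec<String>> {
    if deps.is_empty() {
        return None; // nothing to do: most link targets end up here
    }
    let mut cmd = vec![
        "rustc".to_string(),
        "--crate-type=staticlib".to_string(),
        format!("--target={}", target),
    ];
    for dep in deps {
        cmd.push("--extern".to_string());
        cmd.push(format!("{}={}", dep.name, dep.rlib));
    }
    cmd.push("-o".to_string());
    cmd.push(output.to_string());
    cmd.push(umbrella_rs.to_string());
    Some(cmd)
}
```

Even in the `None` case the wrapper still has to run and inspect the dependency list, which is the cost described above.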
The second approach is not officially supported. An `rlib` is an internal implementation format within Rust, and its only client is `rustc`. It is naughty to pass them directly into our own linker command line.
But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.
Because external toolchains are not expected to consume `rlib`s, some magic is required:
- The final C++ linker needs to pull in all the Rust stdlib `rlib`s, which would be easy apart from the fact they contain the symbol metadata hash in their names.
- We need to remap `__rust_alloc` to `__rdl_alloc` etc.
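As an illustration of the first point, here is a rough sketch of how a build script might pick the stdlib rlibs out of the directory reported by `rustc --print target-libdir`. It assumes file names of the form `lib<crate>-<hash>.rlib` with a hexadecimal hash, which is an observation about current toolchains and not a documented guarantee. The function names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Split a file name such as "libstd-0123456789abcdef.rlib" into
/// (crate name, hash). Returns None for anything that does not match.
/// Assumption: the hash part is non-empty and purely hexadecimal.
fn parse_std_rlib(file_name: &str) -> Option<(&str, &str)> {
    let stem = file_name.strip_prefix("lib")?.strip_suffix(".rlib")?;
    let (name, hash) = stem.rsplit_once('-')?;
    if name.is_empty() || hash.is_empty() || !hash.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    Some((name, hash))
}

/// List every file in `libdir` that looks like a hashed stdlib rlib,
/// sorted so the resulting linker command line is deterministic.
fn std_rlibs_in(libdir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(libdir)? {
        let path = entry?.path();
        let is_std_rlib = path
            .file_name()
            .and_then(|n| n.to_str())
            .and_then(parse_std_rlib)
            .is_some();
        if is_std_rlib {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

Having to pattern-match on file names like this is exactly the fragility that makes the current approach uncomfortable.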
But obviously the bigger concern is that this is not a supported model, and Rust is free to break the `rlib` format at any moment.
Is there any appetite for making this a supported model for those with mixed C/C++/Rust codebases?
I'm assuming the answer may be 'no' because it would tie Rust's hands for future `rlib` format changes. But just in case: how about the following steps?
- The Linkage section of the Rust reference is enhanced to list the two current strategies for linking C++ and Rust. Either:
  - Use `rustc` as the final linker; or
  - Build a Rust `staticlib` or `cdylib` then pass that to your existing final linker

  (I think this would be worth explicitly explaining anyway, so unless anyone objects, I may raise a PR)
- A new `rustc --print stdrlibs` (or similar) which will output the names of all the standard library rlibs (not just their directory, which is already possible with `target-libdir`)
- Some kind of new `rustc` option which generates a `rust-dynamic-symbols.o` file (or similar) containing the codegen which is otherwise done by `rustc` at final link-time (e.g. symbols to call `__rdl_alloc` from `__rust_alloc`, etc.)
- The Linkage section of the reference is enhanced to list this as a third supported workflow. (You can use whatever linker you want, but make sure you link to `rust-dynamic-symbols.o` and everything output by `rustc --print stdrlibs`)
- Somehow, we add some tests to ensure this workflow doesn't break.
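Under that proposed third workflow, a build system's final link step might assemble its arguments roughly like this. Note that `rustc --print stdrlibs` and `rust-dynamic-symbols.o` are the proposals above and do not exist today. The function name is hypothetical, and wrapping the rlibs in a linker group is an assumption about GNU-style linkers, not a stated requirement.

```rust
/// Assemble the final linker arguments for the proposed workflow.
/// `std_rlibs` stands in for the output of the proposed `rustc --print stdrlibs`;
/// `symbols_object` stands in for the proposed `rust-dynamic-symbols.o`.
fn final_link_args(
    cxx_objects: &[&str],
    component_rlibs: &[&str],
    std_rlibs: &[&str],
    symbols_object: &str,
) -> Vec<String> {
    let mut args: Vec<String> = cxx_objects.iter().map(|s| s.to_string()).collect();
    // Assumption: group the archives so the linker can resolve references
    // between them regardless of the order in which they are listed.
    args.push("-Wl,--start-group".to_string());
    args.extend(component_rlibs.iter().map(|s| s.to_string()));
    args.extend(std_rlibs.iter().map(|s| s.to_string()));
    args.push("-Wl,--end-group".to_string());
    // The generated object that would define `__rust_alloc` and friends.
    args.push(symbols_object.to_string());
    args
}
```

The point of the sketch is that nothing Rust-specific happens at link time beyond adding extra inputs, so existing C++ link rules would need no wrapper script.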
A few related issues:
- "Add support for splitting linker invocation to a second execution of rustc" (#64191) wants to split the compile and link phases of `rustc`. This discussion has spawned from there.
- @dtolnay's marvellous https://github.com/dtolnay/cxx is not quite as optimal as it could be, because users can't use `-Wl,--start-group`, `-Wl,--end-group` on the linker line. (Per a comment on #64191.)
- The difficulties of using the `staticlib`-per-C++-target model happen to be magnified by "rlibs retain reference to proc-macro dependencies - possibly unnecessary?" (#73047)
@japaric @alexcrichton @retep998 @dtolnay I believe this may be the sort of thing you may wish to comment upon! I'm sure you'll come up with reasons why this is even harder than I already think. Thanks very much in advance.