Make `#[used]` work when linking with ld64 by madsmtm · Pull Request #133832 · rust-lang/rust
[...]
> When I build `foo.c` into `libfoo.a`, the `foo.o` inside `libfoo.a` will never be loaded by any linker. To fix this, we added a `symbols.o` to help the linker load such object files.
That's correct.
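To make the archive behaviour above concrete, here is a toy model of how a traditional linker picks archive members: a member is pulled in only if it defines a symbol that is currently undefined. This is a simplified sketch, not ld64's actual algorithm, and the member names and symbol names (`_bar`, `_PUSH`) are made up for illustration.

```rust
use std::collections::BTreeSet;

/// One object file inside an archive (hypothetical, simplified model).
struct Member {
    name: &'static str,
    defines: &'static [&'static str],
    references: &'static [&'static str],
}

/// Returns the archive members that get loaded, given the symbols that are
/// still undefined after processing the objects passed directly to the linker.
fn loaded_members(archive: &[Member], initial_undefined: &[&'static str]) -> Vec<&'static str> {
    let mut undefined: BTreeSet<&'static str> = initial_undefined.iter().copied().collect();
    let mut defined: BTreeSet<&'static str> = BTreeSet::new();
    let mut loaded: Vec<&'static str> = Vec::new();

    // Keep scanning until no new member is pulled in.
    let mut changed = true;
    while changed {
        changed = false;
        for m in archive {
            if loaded.contains(&m.name) {
                continue;
            }
            // A member is loaded only if it satisfies an undefined symbol.
            if m.defines.iter().any(|s| undefined.contains(s)) {
                loaded.push(m.name);
                for s in m.defines {
                    defined.insert(*s);
                    undefined.remove(s);
                }
                for r in m.references {
                    if !defined.contains(r) {
                        undefined.insert(*r);
                    }
                }
                changed = true;
            }
        }
    }
    loaded
}

/// A made-up `libfoo.a`: `foo.o` only holds a `#[used]`-style static that
/// nothing references, `bar.o` holds a function that `main` calls.
fn demo_archive() -> Vec<Member> {
    vec![
        Member { name: "foo.o", defines: &["_PUSH"], references: &[] },
        Member { name: "bar.o", defines: &["_bar"], references: &[] },
    ]
}

fn main() {
    let archive = demo_archive();
    // Without symbols.o, nothing asks for `_PUSH`, so foo.o is skipped.
    println!("{:?}", loaded_members(&archive, &["_bar"]));
    // With a symbols.o that references `_PUSH`, foo.o is pulled in as well.
    println!("{:?}", loaded_members(&archive, &["_bar", "_PUSH"]));
}
```

In this model, the `symbols.o` trick amounts to adding `_PUSH` to the initial set of undefined symbols.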
> I'm concerned about the current implementation. It seems we're building our logic around ld64's internal implementation details, and ld64 isn't truly open source. (I haven't looked into ld64's implementation details yet. I'm not sure if this approach is fragile or will be difficult to maintain in the future.)
I agree that it is fragile, but I don't think it is as bad as it looks; consider that linkers are mostly backwards compatible, and that it works now (whereas before it didn't work at all, in any version of ld64). I honestly don't think Apple is gonna change how this works, and if they do, we'll have the Xcode betas to fix it.
The only things I can conceivably think of that might break are:
- If symbols were treated differently depending on their type (function vs. static).
- If the linker figures out that the data that the relocation refers to is never actually used.
I guess if we wanted to make it even more robust, we'd do something like:
```rust
// symbols.rs
extern crate crate1;
extern crate crate2;
extern crate crate3;

fn _used_symbols() {
    let _static1 = crate1::STATIC1;
    let _static2 = crate2::STATIC2;
    let _fn3 = crate3::fn3;
}

// rustc symbols.rs --emit=obj
```
That is, emit a label that refers to the block of code that touches all symbols, so that the linker cannot assume the section is unused. Roughly the same could be done with `object-rs` like so:
```rust
file.add_symbol(write::Symbol {
    name: "_used_symbols".into(),
    value: 0,
    size: 0,
    kind: SymbolKind::Text,
    scope: SymbolScope::Dynamic,
    weak: false,
    section: write::SymbolSection::Section(section_id),
    flags: SymbolFlags::None,
});
```
Was that clear? I can try to go into more detail here if you want. Or I could try to conjure up some contrived assembly code and consider how ld64 would have to link it now and in the future?
> IIUC, the key here is to have the linker load object files as intended, just like when we directly use `ld foo.o`. So why don't we just add these object files directly?
I can see two ways to do that:
- Avoid archives altogether, just pass each object file in every crate to the linker (or maybe combine the different object files generated by each codegen unit into a "`libfoo.o`").
- Link just the object files that use `#[used]`. Requires somehow extracting the relevant object files from the archive at link time, or maybe doing it while building the crate itself? E.g. create `libfoo.a` and `foo_used_symbols.o`.
The former is bad for link-time performance (the linker can skip a lot of work for archives that it can't for object files), and the latter is either also bad for perf, or makes the integration between `rustc` and Cargo even more complex than it already is.
> If we don't want to do this, how about adding the right-hand value of the `#[used]` static to `symbols.o`? For `static PUSH: extern "C" fn() = push;`, we add "`push`" to `symbols.o`. That looks workable, as long as `push` is always a global symbol and both `PUSH` and `push` stay in the same object file.
Hmm, not sure I understand?
Note that we'd still want e.g. `#[used] #[link_section = "..."] static FOO: i32 = 2;` to be seen by the linker, and here there's no "inner" symbol to refer to?
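To spell out the distinction, here is a minimal sketch of the two kinds of `#[used]` statics being discussed. The body of `push` and the `CALLS` counter are made up so the example has something observable, and the `#[link_section = "..."]` attribute is left off `FOO` because the section name is platform-specific and elided above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, only here so that calling `push` is observable.
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical callback; in the proposal quoted above, this is the "inner"
// symbol that `symbols.o` would reference instead of `PUSH`.
extern "C" fn push() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

// Case 1: the static's value is a function pointer, so there is another
// symbol (`push`) that could in principle be referenced in its place.
#[used]
static PUSH: extern "C" fn() = push;

// Case 2: the static holds plain data. Its initializer is just the number 2,
// so the only symbol available to reference is `FOO` itself.
#[used]
static FOO: i32 = 2;

fn main() {
    // The statics are touched here only to make the sketch observable; in the
    // situation under discussion, nothing in the program references them.
    let f = PUSH;
    f();
    println!("push was called {} time(s)", CALLS.load(Ordering::SeqCst));
    println!("FOO = {}", FOO);
}
```

Referencing `push` would only cover the first case, and only while `PUSH` and `push` end up in the same object file; the second case still needs a reference to the static itself.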