Generate synthetic object file to ensure all exported and used symbols participate in the linking by nbdd0121 · Pull Request #95604 · rust-lang/rust (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation115 Commits11 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

nbdd0121

@rustbot rustbot added the T-compiler

Relevant to the compiler team, which will review and decide on the PR/issue.

label

Apr 2, 2022

@petrochenkov

One reason why I like this approach more than #95363 is that I have no idea how widely linker script files like forcelink.ld from #95363 are supported.
LLD, for example, had issues with supporting version script files used for exporting symbols on PE/COFF targets.

A "root" object file, on another hand, is as portable as it gets, with the same logic being reusable for very different targets.

@rust-log-analyzer

This comment has been minimized.

@carbotaniuman

Yeah this seems like a cleaner approach. This should fix the linkme style (test untested on this branch) of test cases as well as the linker-script style ones.

This doesn't fix #47384 fully however as it still requires #[used(linker)], but the combination of both works for all available examples.

@nbdd0121

Yeah this seems like a cleaner approach. This should fix the linkme style (test untested on this branch) of test cases as well as the linker-script style ones.

This doesn't fix #47384 fully however as it still requires #[used(linker)], but the combination of both works for all available examples.

What do you mean? This certainly fixes the issue described by the OP in #47384.

bjorn3

@carbotaniuman

@nbdd0121

What do you mean? This certainly fixes the issue described by the OP in #47384.

It doesn't fix the issue in rust-audit. #47384 has like 4 different issues bundled into it, which makes testing quote difficult.

@nbdd0121

Ah, the rust-audit case does require a custom linker script or #[used(linker)], but I think it's rightfully so.

@rust-log-analyzer

This comment has been minimized.

@petrochenkov

#47384 discusses so many different things that I'd like to set some agreements on what specifically this PR is supposed to assume and fix.

  1. Do not change meaning of #[used].
    Whether #[used] defaults to #[used(compiler)] or #[used(linker)] is a matter for a separate PR, like Only compile #[used] as llvm.compiler.used for ELF targets #93718 or other.
    Right now #[used] defaults to #[used(compiler)], aka CodegenFnAttrFlags::USED, aka the symbols must exist in object files.
    This PR changes the meaning of #[used(compiler)] and makes it closer to #[used(linker)] by marking #[used(compiler)] symbols as exported and keeping them alive for longer, until the linker GC.
  2. Visibility of #[used(...)] symbols.
    RFC: #[used] static variables rfcs#2386 specifies that used is orthogonal to symbols visibility (exported vs everything else non-exported).
    This PR marks both #[used(compiler)] and #[used(linker)] symbols as exported, even if they wouldn't be exported without that, which contradicts the RFC.
    I'm not actually sure whether you can guaranteedly keep a non-exported symbol alive until linker GC, but for #[used(compiler)] symbols that's not necessary and they certainly can stay non-exported as stated in the RFC.
  3. Let's assume -C export-executable-symbols already exists.
    RFC: -C export-executable-symbols rfcs#2841 is an accepted feature, and the implementation seems to be in progress, so we just need to not forget to include it into the picture. Executables with -C export-executable-symbols may end up being identical to cdylibs in these regards though.

@petrochenkov

@nbdd0121
Could you actually split this PR into two?

I'll be able to approve the first PR almost immediately, and it will then unblock #95818.
Then we'll be able to discuss used in more detail.

@nbdd0121

#47384 discusses so many different things that I'd like to set some agreements on what specifically this PR is supposed to assume and fix.

  1. Do not change meaning of #[used].
    Whether #[used] defaults to #[used(compiler)] or #[used(linker)] is a matter for a separate PR, like Only compile #[used] as llvm.compiler.used for ELF targets #93718 or other.
    Right now #[used] defaults to #[used(compiler)], aka CodegenFnAttrFlags::USED, aka the symbols must exist in object files.
    This PR changes the meaning of #[used(compiler)] and makes it closer to #[used(linker)] by marking #[used(compiler)] symbols as exported and keeping them alive for longer, until the linker GC.

No. This PR does not change the meaning of #[used(compiler)]. Instead, it fixes the existing bug to ensure that it conforms to the desired behaviour. The PR ensures that #[used(compiler)] reaches the linker, but it doesn't modify the linker behaviour.

  1. Visibility of #[used(...)] symbols.
    RFC: #[used] static variables rfcs#2386 specifies that used is orthogonal to symbols visibility (exported vs everything else non-exported).
    This PR marks both #[used(compiler)] and #[used(linker)] symbols as exported, even if they wouldn't be exported without that, which contradicts the RFC.

No. Neither of these are exported. The export symbol list is not changed at all.

  1. Let's assume -C export-executable-symbols already exists.
    RFC: -C export-executable-symbols rfcs#2841 is an accepted feature, and the implementation seems to be in progress, so we just need to not forget to include it into the picture. Executables with -C export-executable-symbols may end up being identical to cdylibs in these regards though.

See above.

@petrochenkov

Neither of these are exported. The export symbol list is not changed at all.

As I understand the used symbols get into the reachable set (the reachable.rs change), and from there to the exported set, because the exported set is seeded with reachable set (reachable_set -> reachable_non_generics -> exported_symbols ).

@nbdd0121

The #[used] symbols would have export level "Rust" instead of "C" so they won't be exported for cdylib or binary.

@nbdd0121

There is an "exported_symbols" query and a "exported_symbols" function in linker.rs that produces the final exported symbols. They are actually different things; the first contains all reachables, latter does a filter to the former to get an actual list for export.

@petrochenkov

The #[used] symbols would have export level "Rust".

Yes, but they shouldn't have any exported level at all (unless they are exported for other reasons unrelated to used).

the first contains all reachables, latter does a filter to the former to get an actual list for export.

The list of symbols to filter in exported_symbols2 is seeded with the result of exported_symbols1, and the filtering is threshhold-based, so the SymbolExportLevel::Rust symbols will still stay exported in dylibs.

@nbdd0121

The #[used] symbols would have export level "Rust".

Yes, but they shouldn't have any exported level at all (unless they are exported for other reasons unrelated to used).

"Rust" level export will be exported in dylib anyway and we already export an awful lot of symbols in such case.

I don't see why A in

#[used] static A: u32 = 1;

shouldn't be exported while A in

static A: u32 = 1; #[inline] pub fn foo() -> &u32 { &A }

should.

@nbdd0121

In a sense I don't treat "Rust" export level as truly being exports; it's basically just an export level for "everything that might get used" and I view it as an implementation detail to make dylibs work.

@petrochenkov

Ok, point understood.

I agree that in practice it's not too important to prune such used symbols from dylib exports, and we can skip it if it helps to simplify the implementation or fix other more important issues sooner.

I'd still consider it as a rough edge of the implementation, because when spec / mental model say one thing and the implementation do another it often leads to misunderstanding and mess.
Could you add a FIXME comment to reachable.rs telling that after being marked as reachable used symbols will have SymbolExportLevel::Rust and may end up being exported from dylibs, but that's mostly ok?

@petrochenkov

The issue 1 (#[used(compiler)] symbols being made closer to #[used(linker)] due to being over-exposed from dylibs) is a consequence of the issue 2, so the comment from #95604 (comment) will address it as well.

petrochenkov

@petrochenkov

@pcc pcc mentioned this pull request

Oct 13, 2022

@XrXr XrXr mentioned this pull request

Mar 23, 2023

xSetech added a commit to xSetech/rust that referenced this pull request

Jul 9, 2023

@xSetech

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request

Jul 11, 2023

@matthiaskrgr

ivoanjo added a commit to DataDog/dd-trace-rb that referenced this pull request

Feb 6, 2025

@ivoanjo

What does this PR do?

This PR removes the datadog_profiling_loader native extension, as it's no longer needed.

The profiling native loader was added in #2003 .

Quoting from that PR's description:

When Ruby loads a native extension (using require), it uses dlopen(..., RTLD_LAZY | RTLD_GLOBAL) (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362).

This means that every symbol exposed directly or indirectly by that native extension becomes visible to every other extension in the Ruby process. This can cause issues, see rubyjs/mini_racer#179 .

Ruby's extension loading mechanism is not configurable -- there's no way to tell it to use different flags when calling dlopen. To get around this, this commit introduces introduces another extension (profiling loader) which has only a single responsibility: mimic Ruby's extension loading mechanism, but when calling dlopen use a different set of flags.

This idea was shamelessly stolen from @lloeki's work in rubyjs/mini_racer#179, big thanks!

...and importantly ends with:

Note that, that we know of, the profiling native extension only exposes one potentially problematic symbol: rust_eh_personality (coming from libddprof/libdatadog).

Future versions of Rust have been patched not to expose this (see rust-lang/rust#95604 (comment)) so we may want to revisit the need for this loader in the future, and perhaps delete it if we no longer require its services :)

And we have reached the situation predicted in that description:

  1. Nowadays libdatadog no longer exposes rust_eh_personality

  2. We have a test that validates that only expected symbols are exported by the libdatadog library (see DataDog/libdatadog#573 ).

    Any new symbols that show up would break shipping new libdatadog versions to rubygems.org until we review them.

  3. The libdatadog_api extension, which we've been shipping for customers since release 2.3.0 back in July 2024 has always been loaded directly without a loader without issues.

Thus, I think it's the right time to get rid of the loader.

Motivation:

Having the loader around is not zero cost; we've run into/caused a few issues ( #2250 and #3582 come to mind).

It also adds overhead to development: every time we need to rebuild the extensions, it's an extra extension that needs to be prepared, rebuilt, copied, etc.

I've been meaning to get rid of the loader for some time now, and this came up again this week so I decided it was time to do it.

Additional Notes:

In the future, we can always ressurrect this approach if we figure out we need it again.

We've also discussed internally about proposing upstream to the Ruby VM to maybe add an API to do what the loader was doing, so that we wouldn't need the weird workaround.

We haven't yet decided if we're going to do that (help welcome!).

How to test the change?

The loader was responsible for loading the rest of profiling.

Thus, any test that uses profiling was also validating the loader and now will validate that we're doing fine without it.

TL;DR green CI is good.

bors added a commit to rust-lang-ci/rust that referenced this pull request

Feb 18, 2025

@bors

Make #[used] work when linking with ld64

To make #[used] work in static libraries, we use the symbols.o trick introduced in rust-lang#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the symbols.o file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang#133491, see that for investigation.


Another option would be to pass -u _foo to the final linker invocation. This has the problem that -u causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via @`,` e.g. @undefined_symbols.txt, similar to rust-lang#52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:

Failed attempt: Embed -u _foo in the object file with LC_LINKER_OPTION, akin to rust-lang#121293. Doesn't work, both because ld64 doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because ld64 only support the -l, -needed-l, -framework and -needed_framework flags in there.


TODO:

@rustbot label O-apple

bors added a commit to rust-lang-ci/rust that referenced this pull request

Feb 24, 2025

@bors

Make #[used] work when linking with ld64

To make #[used] work in static libraries, we use the symbols.o trick introduced in rust-lang#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the symbols.o file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang#133491, see that for investigation.


Another option would be to pass -u _foo to the final linker invocation. This has the problem that -u causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via @`,` e.g. @undefined_symbols.txt, similar to rust-lang#52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:

Failed attempt: Embed -u _foo in the object file with LC_LINKER_OPTION, akin to rust-lang#121293. Doesn't work, both because ld64 doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because ld64 only support the -l, -needed-l, -framework and -needed_framework flags in there.


TODO:

@rustbot label O-apple

bors added a commit to rust-lang-ci/rust that referenced this pull request

Feb 25, 2025

@bors

Make #[used] work when linking with ld64

To make #[used] work in static libraries, we use the symbols.o trick introduced in rust-lang#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the symbols.o file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang#133491, see that for investigation.


Another option would be to pass -u _foo to the final linker invocation. This has the problem that -u causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via @`,` e.g. @undefined_symbols.txt, similar to rust-lang#52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:

Failed attempt: Embed -u _foo in the object file with LC_LINKER_OPTION, akin to rust-lang#121293. Doesn't work, both because ld64 doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because ld64 only support the -l, -needed-l, -framework and -needed_framework flags in there.


TODO:

@rustbot label O-apple

BoxyUwU pushed a commit to BoxyUwU/rustc-dev-guide that referenced this pull request

Feb 25, 2025

@bors

Make #[used] work when linking with ld64

To make #[used] work in static libraries, we use the symbols.o trick introduced in rust-lang/rust#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the symbols.o file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang/rust#133491, see that for investigation.


Another option would be to pass -u _foo to the final linker invocation. This has the problem that -u causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via @`,` e.g. @undefined_symbols.txt, similar to rust-lang/rust#52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:

Failed attempt: Embed -u _foo in the object file with LC_LINKER_OPTION, akin to rust-lang/rust#121293. Doesn't work, both because ld64 doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because ld64 only support the -l, -needed-l, -framework and -needed_framework flags in there.


TODO:

@rustbot label O-apple

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request

Feb 26, 2025

@bors

Make #[used] work when linking with ld64

To make #[used] work in static libraries, we use the symbols.o trick introduced in rust-lang/rust#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the symbols.o file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang/rust#133491, see that for investigation.


Another option would be to pass -u _foo to the final linker invocation. This has the problem that -u causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via @`,` e.g. @undefined_symbols.txt, similar to rust-lang/rust#52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:

Failed attempt: Embed -u _foo in the object file with LC_LINKER_OPTION, akin to rust-lang/rust#121293. Doesn't work, both because ld64 doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because ld64 only support the -l, -needed-l, -framework and -needed_framework flags in there.


TODO:

@rustbot label O-apple

lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request

Mar 3, 2025

@bors

Make #[used] work when linking with ld64

To make #[used] work in static libraries, we use the symbols.o trick introduced in rust-lang/rust#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, it completely ignores undefined symbols by themselves, and only consider them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the symbols.o file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang/rust#133491, see that for investigation.


Another option would be to pass -u _foo to the final linker invocation. This has the problem that -u causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via @`,` e.g. @undefined_symbols.txt, similar to rust-lang/rust#52699, though that is only supported since Xcode 12, and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:

Failed attempt: Embed -u _foo in the object file with LC_LINKER_OPTION, akin to rust-lang/rust#121293. Doesn't work, both because ld64 doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because ld64 only support the -l, -needed-l, -framework and -needed_framework flags in there.


TODO:

@rustbot label O-apple

Labels

merged-by-bors

This PR was explicitly merged by bors.

perf-regression

Performance regression.

S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

T-compiler

Relevant to the compiler team, which will review and decide on the PR/issue.