Implement a file-path remapping feature in support of debuginfo and reproducible builds by michaelwoerister · Pull Request #41508 · rust-lang/rust (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation55 Commits5 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

michaelwoerister

This PR adds the -Zremap-path-prefix-from/-Zremap-path-prefix-to commandline option pair and is a more general implementation of #41419. As opposed to the previous attempt, this implementation should enable reproducible builds regardless of the working directory of the compiler.

This implementation of the feature is more general in the sense that the re-mapping will affect all paths the compiler emits, including the ones in error messages.

r? @alexcrichton

This was referenced

Apr 24, 2017

nagisa

FilePathMapping::new(
self.debugging_opts.remap_path_prefix_from.iter().zip(
self.debugging_opts.remap_path_prefix_to.iter()
).map(|(src, dst)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a potentially confusing CLUI to me. At this point I would simply make -Z remap-path take two arguments… obviously not something our argument parser can do.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know, it's not perfect. There's been quite some discussion about the CLUI in the initial issue (#38322). This seemed to be an acceptable compromise. It's an unstable feature at the moment. If our parser had some improvements before this is stabilized, we could certainly revisit this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the values are paths, and most file systems do not like colon ':' in path names, can't a single argument be bifurcated on a colon? Example: -Z remap-path="original_path_from:preferred_path_to".

@alexcrichton

Looks great to me! Was it intentional to remove the same-number-arguments validation? (checking that the same number of -to as -from was supplied)

@michaelwoerister

Was it intentional to remove the same-number-arguments validation?

Good catch! I'll add it back in tomorrow.

@arielb1 arielb1 added the relnotes

Marks issues that should be documented in the release notes of the next release.

label

Apr 25, 2017

@michaelwoerister

@rust-lang/docs, is there a place where a not-yet-stable feature like this can be documented?

@nagisa

@frewsxcv

@rust-lang/docs, is there a place where a not-yet-stable feature like this can be documented?

Yep! Just add a new file to this directory and link to it in the 'compiler flags' section of the summary file.

@frewsxcv

@michaelwoerister

OK, I added some documentation to the unstable book and made the codegen test check more things.
r? @alexcrichton

@bors

☔ The latest upstream changes (presumably #41504) made this pull request unmergeable. Please resolve the merge conflicts.

alexcrichton

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few nits, otherwise r=me

@@ -2,6 +2,7 @@
- [Compiler flags](compiler-flags.md)
- [linker_flavor](compiler-flags/linker-flavor.md)
- [linker_flavor](compiler-flags/remap-path-prefix.md)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/linker_flavor/remap_path_prefix/

@@ -0,0 +1,37 @@
# `remap-path-prefix`
The tracking issue for this feature is: None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind opening a tracking issue for this?

@michaelwoerister

@michaelwoerister

@bors r=alexcrichton

Nits fixed, I think.

@bors

📌 Commit ab9691d has been approved by alexcrichton

@aidanhs aidanhs added the S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

label

Apr 26, 2017

@jsgf

Thanks for turning this around @michaelwoerister! I hope to try this out soon.

BTW, assuming this gets stabilized in more or less the current form, what would the stable command-line options look like?

jsgf

```text
rustc -Zremap-path-prefix-from="/home/foo/my-project/src" -Zremap-path-prefix-to="/sources/my-project"
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear from this how the remapping is actually operating. It should also answer:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points. I'll provide some more documentation in a follow-up PR. The implementation as it is does as little as possible: (1) Take each string as it is provided to the compiler, (2) don't do any normalization, (3) do a simple string-level prefix replacement.

jsgf

When the options are given multiple times, the nth `-from` will be matched up
with the nth `-to` and they can appear anywhere on the commandline. Mappings
specified later on the line will take precedence over earlier ones.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does

Mappings specified later on the line will take precedence over earlier ones.

actually mean? Do you mean that if the -Zremap-path-prefix-from is identical it is replaced? Or does it more generally apply, such that a more specific path overrides a more general one?

For example, what's the mapping for:

rustc -Zremap-path-prefix-from=my-project/src -Zremap-path-prefix-to=/sources/my-project \
      -Zremap-path-prefix-from=my-project/src/vendored/otherproject -Zremap-path-prefix-to=/sources/otherproject

when applied to my-project/src/vendored/otherproject/src/lib.rs?

In other words, does the ordering of the -from options matter when multiple mappings could apply? Or is there some other way to disambiguate multiple matching mappings?

Is it guaranteed that at most a single mapping can be applied, or is there some notion of repeated application of mappings until nothing matches?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if defining the matching as:

  1. later remap-path-prefix-from replaces a previous identical one
  2. mappings are matched from longest from prefix to shortest

would be better.

I think this would make it deterministically always select more specific mappings over more general ones, rather than relying purely on ordering.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would differ from the current behaviour in the example a/b -> y, a -> x. I'm not sure which is better, but without a concrete argument that the new one is better I'd suggest to stick with the current behaviour which is simple to describe and implement. The other behaviour would require a trie-like data structure or other more complex thing.

I'm trying to standardise this behaviour across multiple compilers and GCC is already doing something very similar based on the ordering of command-line flags (and I have a patch pending to make it exactly match the behaviour being proposed in this PR). Changing this in the suggested way would make the spec for this standard even more complex. :(

In real-world cases I don't think the issue will crop up, in rustc or anywhere else - it's unlikely that a child process inheriting a prefix-map from a parent process would want to add a less-specific mapping for a higher-level directory.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is very simple: Walk all from/to pairs as provided on the commandline, starting at the last, and just stop at the first match.

In my opinion, relying only on ordering is a good approach, since the rules are clear and without surprises.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behaviour allows you to specify useless mappings (by putting more general before more specific in the reverse ordering), which might be inadvertent or unexpected.

I don't think it would require very fancy structures; I think you could implement it pretty simply using BTreeMap<String, String> instead of Vec<String, String>. It would be nice to have an algorithm that can have more efficient implementations (ie, not linear search), to cope with the case where there are lots of mappings (but that's definitely not a requirement for the first implementation).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelwoerister

In my opinion, relying only on ordering is a good approach, since the rules are clear and without surprises

I think that's fine - I think the wording could do with clarification (ie, I managed to misinterpret what it meant until I looked at the source).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. I'll provide an update in a subsequent PR and make sure to ping the usual suspects for review.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the ordering of flags for the lookup allows easy overriding of previous flags without needing to have a way to remove items from an accumulated list of flags (ex: presuming a variable in some higher level build env is accumulating RUST_FLAGS, using ordering means there is no need to parse all the accumulated flags to figure out which ones to remove when adding a higher priority remapping).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that might be useful to have anyway, since an identity mapping is semantically different from no mapping at all.

@codyps

As this remaps all paths, will this also resolve #40552 ? ie: does it affect the expansion of file!()?

@jsgf

@michaelwoerister

@TimNN Thanks for the tip! But the problem is already caused by the arguments passed to the compiler, I think.

@michaelwoerister

But the problem is already caused by the arguments passed to the compiler, I think.

More specifically, the test case tells the compiler to replace something Unix-specific here:

-Zremap-path-prefix-from={{src-base}}/remap_path_prefix/auxiliary

But on Windows the compiler will see the equivalent of

{{src-base}}\remap_path_prefix\auxiliary
            ^                 ^             

😭

@alexcrichton

@michaelwoerister I'd also consider it ok to ignore the test on windows, it seems like we may not benefit much from the extra coverage there

@alexcrichton alexcrichton added S-waiting-on-author

Status: This is awaiting some action (such as code changes or more information) from the author.

and removed S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

labels

Apr 27, 2017

@michaelwoerister

@michaelwoerister

@bors r=alexcrichton

OK, since we are only doing string replacement there should be nothing platform-specific in there atm, so I
disabled the test on Windows. Thanks, @alexcrichton!

@bors

📌 Commit 8ea050d has been approved by alexcrichton

@bors

⌛ Testing commit 8ea050d with merge 9339127...

@bors

@TimNN

@bors

bors added a commit that referenced this pull request

Apr 28, 2017

@bors

…xcrichton

Implement a file-path remapping feature in support of debuginfo and reproducible builds

This PR adds the -Zremap-path-prefix-from/-Zremap-path-prefix-to commandline option pair and is a more general implementation of #41419. As opposed to the previous attempt, this implementation should enable reproducible builds regardless of the working directory of the compiler.

This implementation of the feature is more general in the sense that the re-mapping will affect all paths the compiler emits, including the ones in error messages.

r? @alexcrichton

@bors

@bors bors mentioned this pull request

Apr 28, 2017

kevinmehall added a commit to kevinmehall/rust-peg that referenced this pull request

Apr 29, 2017

@kevinmehall

@cbeuw cbeuw mentioned this pull request

Apr 3, 2021

bors added a commit to rust-lang-ci/rust that referenced this pull request

May 12, 2021

@bors

Fix --remap-path-prefix not correctly remapping rust-src component paths and unify handling of path mapping with virtualized paths

This PR fixes rust-lang#73167 ("Binaries end up containing path to the rust-src component despite --remap-path-prefix") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

RealFileName::Named introduced in rust-lang#72767 is now renamed as LocalPath, because this variant wraps a (most likely) valid local filesystem path.

RealFileName::Devirtualized is renamed as Remapped to be used for remapped path from a real path via --remap-path-prefix argument, as well as real path inferred from a virtualized (during compiler bootstrapping) /rustc/... path. The local_path field is now an Option<PathBuf>, as it will be set to None before serialisation, so it never reaches any build output. Attempting to serialise a non-None local_path will cause an assertion faliure.

When a path is remapped, a RealFileName::Remapped variant is created. The original path is preserved in local_path field and the remapped path is saved in virtual_name field. Previously, the local_path is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

rustc_span::SourceFile's fields unmapped_path (introduced by rust-lang#44940) and name_was_remapped (introduced by rust-lang#41508 when --remap-path-prefix feature originally added) are removed, as these two pieces of information can be inferred from the name field: if it's anything other than a FileName::Real(_), or if it is a FileName::Real(RealFileName::LocalPath(_)), then clearly name_was_remapped would've been false and unmapped_path would've been None. If it is a FileName::Real(RealFileName::Remapped{local_path, virtual_name}), then name_was_remapped would've been true and unmapped_path would've been Some(local_path).

cc @eddyb who implemented /rustc/... path devirtualisation

flip1995 pushed a commit to flip1995/rust that referenced this pull request

May 20, 2021

@bors

Fix --remap-path-prefix not correctly remapping rust-src component paths and unify handling of path mapping with virtualized paths

This PR fixes rust-lang#73167 ("Binaries end up containing path to the rust-src component despite --remap-path-prefix") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

RealFileName::Named introduced in rust-lang#72767 is now renamed as LocalPath, because this variant wraps a (most likely) valid local filesystem path.

RealFileName::Devirtualized is renamed as Remapped to be used for remapped path from a real path via --remap-path-prefix argument, as well as real path inferred from a virtualized (during compiler bootstrapping) /rustc/... path. The local_path field is now an Option<PathBuf>, as it will be set to None before serialisation, so it never reaches any build output. Attempting to serialise a non-None local_path will cause an assertion faliure.

When a path is remapped, a RealFileName::Remapped variant is created. The original path is preserved in local_path field and the remapped path is saved in virtual_name field. Previously, the local_path is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

rustc_span::SourceFile's fields unmapped_path (introduced by rust-lang#44940) and name_was_remapped (introduced by rust-lang#41508 when --remap-path-prefix feature originally added) are removed, as these two pieces of information can be inferred from the name field: if it's anything other than a FileName::Real(_), or if it is a FileName::Real(RealFileName::LocalPath(_)), then clearly name_was_remapped would've been false and unmapped_path would've been None. If it is a FileName::Real(RealFileName::Remapped{local_path, virtual_name}), then name_was_remapped would've been true and unmapped_path would've been Some(local_path).

cc @eddyb who implemented /rustc/... path devirtualisation

bjorn3 pushed a commit to bjorn3/rust that referenced this pull request

May 27, 2021

@bors

Fix --remap-path-prefix not correctly remapping rust-src component paths and unify handling of path mapping with virtualized paths

This PR fixes rust-lang#73167 ("Binaries end up containing path to the rust-src component despite --remap-path-prefix") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

RealFileName::Named introduced in rust-lang#72767 is now renamed as LocalPath, because this variant wraps a (most likely) valid local filesystem path.

RealFileName::Devirtualized is renamed as Remapped to be used for remapped path from a real path via --remap-path-prefix argument, as well as real path inferred from a virtualized (during compiler bootstrapping) /rustc/... path. The local_path field is now an Option<PathBuf>, as it will be set to None before serialisation, so it never reaches any build output. Attempting to serialise a non-None local_path will cause an assertion faliure.

When a path is remapped, a RealFileName::Remapped variant is created. The original path is preserved in local_path field and the remapped path is saved in virtual_name field. Previously, the local_path is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

rustc_span::SourceFile's fields unmapped_path (introduced by rust-lang#44940) and name_was_remapped (introduced by rust-lang#41508 when --remap-path-prefix feature originally added) are removed, as these two pieces of information can be inferred from the name field: if it's anything other than a FileName::Real(_), or if it is a FileName::Real(RealFileName::LocalPath(_)), then clearly name_was_remapped would've been false and unmapped_path would've been None. If it is a FileName::Real(RealFileName::Remapped{local_path, virtual_name}), then name_was_remapped would've been true and unmapped_path would've been Some(local_path).

cc @eddyb who implemented /rustc/... path devirtualisation

@jsgf jsgf mentioned this pull request

Aug 19, 2021

@ehuss ehuss mentioned this pull request

May 9, 2022

Reviewers

@alexcrichton alexcrichton alexcrichton left review comments

@infinity0 infinity0 infinity0 left review comments

@codyps codyps codyps left review comments

@jsgf jsgf jsgf left review comments

@nagisa nagisa nagisa left review comments

@whoisj whoisj whoisj left review comments

Labels

relnotes

Marks issues that should be documented in the release notes of the next release.

S-waiting-on-author

Status: This is awaiting some action (such as code changes or more information) from the author.