coverage: Emit the filenames section before encoding per-function mappings by Zalathar · Pull Request #117042 · rust-lang/rust (original) (raw)
rustbot added S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
Relevant to the compiler team, which will review and decide on the PR/issue.
labels
…hases
This gives us a clearly-defined place to run code after the instance's MIR has
been traversed by codegen, but before we emit its __llvm_covfun
record.
The combined get_expressions_and_counter_regions
method was an artifact of
having to prepare the expressions and mappings at the same time, to avoid
ownership/lifetime problems with temporary data used by both.
Now that we have an explicit transition from FunctionCoverageCollector
to the
final FunctionCoverage
, we can prepare any shared data during that step and
store it in the final struct.
The main change here is that VirtualFileMapping
now uses an internal hashmap
to de-duplicate incoming global file IDs. That removes the need for
encode_mappings_for_function
to re-sort its mappings by filename in order to
de-duplicate them.
(We still de-duplicate runs of identical filenames to save work, but this is not load-bearing for correctness, so a sort is not necessary.)
bors added S-waiting-on-bors
Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
and removed S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
labels
bors added a commit to rust-lang-ci/rust that referenced this pull request
coverage: Emit the filenames section before encoding per-function mappings
When embedding coverage information in LLVM IR (and ultimately in the resulting binary), there are two main things that each CGU needs to emit:
- A single
__llvm_covmap
record containing a coverage header, which mostly consists of a list of filenames used by the CGU's coverage mappings. - Several
__llvm_covfun
records, one for each instrumented function, each of which contains the hash of the list of filenames in the header.
There is a kind of loose cyclic dependency between the two: we need the hash of the file table before we can emit the covfun records, but we need to traverse all of the instrumented functions in order to build the file table.
The existing code works by processing the individual functions first. It lazily adds filenames to the file table, and stores the mostly-complete function records in a temporary list. After this it hashes the file table, emits the header (containing the file table), and then uses the hash to emit all of the function records.
This PR reverses that order: first we traverse all of the functions (without trying to prepare their function records) to build a complete file table, and then emit it immediately. At this point we have the file table hash, so we can then proceed to build and emit all of the function records, without needing to store them in an intermediate list.
Along the way, this PR makes some necessary changes that are also worthwhile in their own right:
- We split
FunctionCoverage
into distinct collector/finished phases, which neatly avoids some borrow-checker hassles when extracting a function's final expression/mapping data. - We avoid having to re-sort a function's mappings when preparing the list of filenames that it uses.
bors added S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
and removed S-waiting-on-bors
Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
labels
…pings
Most coverage metadata is encoded into two sections in the final executable.
The __llvm_covmap
section mostly just contains a list of filenames, while the
__llvm_covfun
section contains encoded coverage maps for each instrumented
function.
The catch is that each per-function record also needs to contain a hash of the filenames list that it refers to. Historically this was handled by assembling most of the per-function data into a temporary list, then assembling the filenames buffer, then using the filenames hash to emit the per-function data, and then finally emitting the filenames table itself.
However, now that we build the filenames table up-front (via a separate traversal of the per-function data), we can hash and emit that part first, and then emit each of the per-function records immediately after building. This removes the awkwardness of having to temporarily store nearly-complete per-function records.
bors added S-waiting-on-bors
Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
and removed S-waiting-on-review
Status: Awaiting review from the assignee but also interested parties.
labels
bors added a commit to rust-lang-ci/rust that referenced this pull request
…iaskrgr
Rollup of 6 pull requests
Successful merges:
- rust-lang#105666 (rustdoc: align stability badge to baseline instead of bottom)
- rust-lang#117042 (coverage: Emit the filenames section before encoding per-function mappings)
- rust-lang#117044 (Miri subtree update)
- rust-lang#117049 (add a
csky-unknown-linux-gnuabiv2hf
target ) - rust-lang#117051 (fix broken link: update incremental compilation url)
- rust-lang#117069 (x.ps1: remove the check for Python from Windows Store)
r? @ghost
@rustbot
modify labels: rollup
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request
Rollup merge of rust-lang#117042 - Zalathar:file-table, r=cjgillot
coverage: Emit the filenames section before encoding per-function mappings
When embedding coverage information in LLVM IR (and ultimately in the resulting binary), there are two main things that each CGU needs to emit:
- A single
__llvm_covmap
record containing a coverage header, which mostly consists of a list of filenames used by the CGU's coverage mappings. - Several
__llvm_covfun
records, one for each instrumented function, each of which contains the hash of the list of filenames in the header.
There is a kind of loose cyclic dependency between the two: we need the hash of the file table before we can emit the covfun records, but we need to traverse all of the instrumented functions in order to build the file table.
The existing code works by processing the individual functions first. It lazily adds filenames to the file table, and stores the mostly-complete function records in a temporary list. After this it hashes the file table, emits the header (containing the file table), and then uses the hash to emit all of the function records.
This PR reverses that order: first we traverse all of the functions (without trying to prepare their function records) to build a complete file table, and then emit it immediately. At this point we have the file table hash, so we can then proceed to build and emit all of the function records, without needing to store them in an intermediate list.
Along the way, this PR makes some necessary changes that are also worthwhile in their own right:
- We split
FunctionCoverage
into distinct collector/finished phases, which neatly avoids some borrow-checker hassles when extracting a function's final expression/mapping data. - We avoid having to re-sort a function's mappings when preparing the list of filenames that it uses.
bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this pull request
bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this pull request
jieyouxu added a commit to jieyouxu/rust that referenced this pull request
…youxu
coverage: Dismantle map_data.rs
by moving its responsibilities elsewhere
This is a series of incremental changes that combine to let us get rid of coverageinfo/map_data.rs
, by moving all of its responsibilities into more appropriate places.
Some of the notable consequences are:
- We once again build the per-CGU file table on the fly while preparing individual covfun records, instead of building the whole table up-front. The up-front approach was introduced by rust-lang#117042 to work around various other problems in generating the covmap/covfun records, but subsequent cleanups have made that approach no longer necessary.
- Expression conversion and mapping-region conversion are now performed directly in
mapgen::covfun
, which should make future changes easier. - We no longer insert unused function instances into the same map that is also used to track used function instances. This helps to decouple the handling of used vs unused functions.
There should be no meaningful change to compiler output. The file table is no longer sorted, because reordering it would invalidate the file indices stored in individual covfun records, but the table order should still be deterministic (albeit arbitrary).
There are some subsequent cleanups that I intend to investigate, but this is enough change for one PR.
jieyouxu added a commit to jieyouxu/rust that referenced this pull request
…youxu
coverage: Dismantle map_data.rs
by moving its responsibilities elsewhere
This is a series of incremental changes that combine to let us get rid of coverageinfo/map_data.rs
, by moving all of its responsibilities into more appropriate places.
Some of the notable consequences are:
- We once again build the per-CGU file table on the fly while preparing individual covfun records, instead of building the whole table up-front. The up-front approach was introduced by rust-lang#117042 to work around various other problems in generating the covmap/covfun records, but subsequent cleanups have made that approach no longer necessary.
- Expression conversion and mapping-region conversion are now performed directly in
mapgen::covfun
, which should make future changes easier. - We no longer insert unused function instances into the same map that is also used to track used function instances. This helps to decouple the handling of used vs unused functions.
There should be no meaningful change to compiler output. The file table is no longer sorted, because reordering it would invalidate the file indices stored in individual covfun records, but the table order should still be deterministic (albeit arbitrary).
There are some subsequent cleanups that I intend to investigate, but this is enough change for one PR.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request
…youxu
coverage: Dismantle map_data.rs
by moving its responsibilities elsewhere
This is a series of incremental changes that combine to let us get rid of coverageinfo/map_data.rs
, by moving all of its responsibilities into more appropriate places.
Some of the notable consequences are:
- We once again build the per-CGU file table on the fly while preparing individual covfun records, instead of building the whole table up-front. The up-front approach was introduced by rust-lang#117042 to work around various other problems in generating the covmap/covfun records, but subsequent cleanups have made that approach no longer necessary.
- Expression conversion and mapping-region conversion are now performed directly in
mapgen::covfun
, which should make future changes easier. - We no longer insert unused function instances into the same map that is also used to track used function instances. This helps to decouple the handling of used vs unused functions.
There should be no meaningful change to compiler output. The file table is no longer sorted, because reordering it would invalidate the file indices stored in individual covfun records, but the table order should still be deterministic (albeit arbitrary).
There are some subsequent cleanups that I intend to investigate, but this is enough change for one PR.