coverage: Emit the filenames section before encoding per-function mappings by Zalathar · Pull Request #117042 · rust-lang/rust (original) (raw)

Zalathar

@rustbot rustbot added S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

T-compiler

Relevant to the compiler team, which will review and decide on the PR/issue.

labels

Oct 22, 2023

cjgillot

@Zalathar

…hases

This gives us a clearly-defined place to run code after the instance's MIR has been traversed by codegen, but before we emit its __llvm_covfun record.

@Zalathar

The combined get_expressions_and_counter_regions method was an artifact of having to prepare the expressions and mappings at the same time, to avoid ownership/lifetime problems with temporary data used by both.

Now that we have an explicit transition from FunctionCoverageCollector to the final FunctionCoverage, we can prepare any shared data during that step and store it in the final struct.

@Zalathar

@Zalathar

@Zalathar

The main change here is that VirtualFileMapping now uses an internal hashmap to de-duplicate incoming global file IDs. That removes the need for encode_mappings_for_function to re-sort its mappings by filename in order to de-duplicate them.

(We still de-duplicate runs of identical filenames to save work, but this is not load-bearing for correctness, so a sort is not necessary.)

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

labels

Oct 22, 2023

bors added a commit to rust-lang-ci/rust that referenced this pull request

Oct 22, 2023

@bors

coverage: Emit the filenames section before encoding per-function mappings

When embedding coverage information in LLVM IR (and ultimately in the resulting binary), there are two main things that each CGU needs to emit:

There is a kind of loose cyclic dependency between the two: we need the hash of the file table before we can emit the covfun records, but we need to traverse all of the instrumented functions in order to build the file table.

The existing code works by processing the individual functions first. It lazily adds filenames to the file table, and stores the mostly-complete function records in a temporary list. After this it hashes the file table, emits the header (containing the file table), and then uses the hash to emit all of the function records.

This PR reverses that order: first we traverse all of the functions (without trying to prepare their function records) to build a complete file table, and then emit it immediately. At this point we have the file table hash, so we can then proceed to build and emit all of the function records, without needing to store them in an intermediate list.


Along the way, this PR makes some necessary changes that are also worthwhile in their own right:

@bors bors added S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

and removed S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

labels

Oct 22, 2023

@Zalathar

…pings

Most coverage metadata is encoded into two sections in the final executable. The __llvm_covmap section mostly just contains a list of filenames, while the __llvm_covfun section contains encoded coverage maps for each instrumented function.

The catch is that each per-function record also needs to contain a hash of the filenames list that it refers to. Historically this was handled by assembling most of the per-function data into a temporary list, then assembling the filenames buffer, then using the filenames hash to emit the per-function data, and then finally emitting the filenames table itself.

However, now that we build the filenames table up-front (via a separate traversal of the per-function data), we can hash and emit that part first, and then emit each of the per-function records immediately after building. This removes the awkwardness of having to temporarily store nearly-complete per-function records.

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

labels

Oct 22, 2023

bors added a commit to rust-lang-ci/rust that referenced this pull request

Oct 23, 2023

@bors

…iaskrgr

Rollup of 6 pull requests

Successful merges:

r? @ghost @rustbot modify labels: rollup

rust-timer added a commit to rust-lang-ci/rust that referenced this pull request

Oct 23, 2023

@rust-timer

Rollup merge of rust-lang#117042 - Zalathar:file-table, r=cjgillot

coverage: Emit the filenames section before encoding per-function mappings

When embedding coverage information in LLVM IR (and ultimately in the resulting binary), there are two main things that each CGU needs to emit:

There is a kind of loose cyclic dependency between the two: we need the hash of the file table before we can emit the covfun records, but we need to traverse all of the instrumented functions in order to build the file table.

The existing code works by processing the individual functions first. It lazily adds filenames to the file table, and stores the mostly-complete function records in a temporary list. After this it hashes the file table, emits the header (containing the file table), and then uses the hash to emit all of the function records.

This PR reverses that order: first we traverse all of the functions (without trying to prepare their function records) to build a complete file table, and then emit it immediately. At this point we have the file table hash, so we can then proceed to build and emit all of the function records, without needing to store them in an intermediate list.


Along the way, this PR makes some necessary changes that are also worthwhile in their own right:

bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this pull request

Oct 31, 2023

bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this pull request

Oct 31, 2023

jieyouxu added a commit to jieyouxu/rust that referenced this pull request

Dec 17, 2024

@jieyouxu

…youxu

coverage: Dismantle map_data.rs by moving its responsibilities elsewhere

This is a series of incremental changes that combine to let us get rid of coverageinfo/map_data.rs, by moving all of its responsibilities into more appropriate places.

Some of the notable consequences are:


There should be no meaningful change to compiler output. The file table is no longer sorted, because reordering it would invalidate the file indices stored in individual covfun records, but the table order should still be deterministic (albeit arbitrary).

There are some subsequent cleanups that I intend to investigate, but this is enough change for one PR.

jieyouxu added a commit to jieyouxu/rust that referenced this pull request

Dec 17, 2024

@jieyouxu

…youxu

coverage: Dismantle map_data.rs by moving its responsibilities elsewhere

This is a series of incremental changes that combine to let us get rid of coverageinfo/map_data.rs, by moving all of its responsibilities into more appropriate places.

Some of the notable consequences are:


There should be no meaningful change to compiler output. The file table is no longer sorted, because reordering it would invalidate the file indices stored in individual covfun records, but the table order should still be deterministic (albeit arbitrary).

There are some subsequent cleanups that I intend to investigate, but this is enough change for one PR.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request

Dec 17, 2024

@matthiaskrgr

…youxu

coverage: Dismantle map_data.rs by moving its responsibilities elsewhere

This is a series of incremental changes that combine to let us get rid of coverageinfo/map_data.rs, by moving all of its responsibilities into more appropriate places.

Some of the notable consequences are:


There should be no meaningful change to compiler output. The file table is no longer sorted, because reordering it would invalidate the file indices stored in individual covfun records, but the table order should still be deterministic (albeit arbitrary).

There are some subsequent cleanups that I intend to investigate, but this is enough change for one PR.