Add unreachable propagation mir optimization pass by ktrianta · Pull Request #66329 · rust-lang/rust (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation92 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

ktrianta

@oli-obk suggested we create a MIR pass that optimizes away basic blocks that lead only to basic blocks with terminator kind unreachable. This is a first take on this, which we started with @gilescope at RustFest Impl Days.

The test currently fails when the compiled program runs (undefined behaviour). Is there a way to avoid running the compiled program?

@rust-highfive

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@ktrianta

@rust-highfive

The job x86_64-gnu-llvm-6.0 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

2019-11-12T13:57:31.3960974Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-11-12T13:57:31.9687263Z ##[command]git config gc.auto 0
2019-11-12T13:57:31.9693918Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-11-12T13:57:31.9698046Z ##[command]git config --get-all http.proxy
2019-11-12T13:57:31.9702250Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/66329/merge:refs/remotes/pull/66329/merge
---
2019-11-12T14:04:34.1801734Z Done!
2019-11-12T14:04:34.1801990Z some tidy checks failed
2019-11-12T14:04:34.1802028Z 
2019-11-12T14:04:34.1805364Z 
2019-11-12T14:04:34.1806159Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor"
2019-11-12T14:04:34.1806269Z 
2019-11-12T14:04:34.1806290Z 
2019-11-12T14:04:34.1806329Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
2019-11-12T14:04:34.1806401Z Build completed unsuccessfully in 0:01:37
2019-11-12T14:04:34.1806401Z Build completed unsuccessfully in 0:01:37
2019-11-12T14:04:34.1806437Z == clock drift check ==
2019-11-12T14:04:34.1806473Z   local time: Tue Nov 12 14:04:34 UTC 2019
2019-11-12T14:04:34.2611892Z   network time: Tue, 12 Nov 2019 14:04:34 GMT
2019-11-12T14:04:34.2612011Z == end clock drift check ==
2019-11-12T14:04:35.6533061Z 
2019-11-12T14:04:35.6648141Z ##[error]Bash exited with code '1'.
2019-11-12T14:04:35.6702771Z ##[section]Starting: Checkout
2019-11-12T14:04:35.6704585Z ==============================================================================
2019-11-12T14:04:35.6704815Z Task         : Get sources
2019-11-12T14:04:35.6705040Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

tesuji

wesleywiser

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice first contribution! 🎉

for block in unreachable_blocks {
body.basic_blocks_mut()[block]
.terminator.as_mut().unwrap().kind = TerminatorKind::Unreachable;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Centril

@ktrianta

Centril

} else {
let mut bb_successors = terminator.successors().peekable();
if bb_successors.peek().is_some() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, why this condition? Definitionally, a terminator which cannot end up anywhere means that it is unreachable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the else branch and that it is redundant as we have the successors check?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all() is true if it's empty alas, so we added a check that it wasn't empty first. Is there a cuter way to say all_but_not_empty()?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that the unwritten else-branch here could do the same as the if-branch, and so the if bb_successors.peek().is_some() is unnecessary.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gilescope But why do we want to differentiate between the empty and non-empty case? By definition, a branch is only reachable if it has any reachable successors.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind; this is wrong since this would transform e.g. Abort to Unreachable (there is a successor of sorts in this case, it's just not encoded in MIR...). Can you add a comment explaining why the check exists?

@hanna-kruppe

I don't believe this optimization is sound as currently implemented because calls can diverge while having only unreachable successors. As an example, consider this program:

fn loop_forever() { loop {} }

pub enum Empty {}

pub unsafe fn foo(x: bool, bomb: *const Empty) { if x { loop_forever() } match *bomb {} }

Leaving aside landing pads that may exist at the point in the pipeline where the pass currently runs (more about that later), the MIR CFG of foo has this shape:

            +-------+
            |bb0:   |
            | ...   |
            | if x  |
            +-+--+--+
        true  |  |
       +------+  |
       |         |
   +---v---+     |false
   |bb1:   |     |
   | ...   |     |
   | call  |     |
   +---+---+     |
       |         |
       +------+  |
              |  |
              v  v
          +-------------+
          |bb3:         |
          | ...         |
          | unreachable |
          +-------------+

As I understand it, the pass will mark all three basic blocks as unreachable (RPO is bb2, bb1, bb0 and when visited in that order bb1 and bb0 have only unreachable successors). But when x == true, this function just diverges without UB, so bb1 and bb0 should not have their terminators replaced with unreachable.

Now, I see that the pass has been placed before NoLandingPads in the MIR optimization pipeline, so maybe the program above won't get miscompiled as there may still be a landing pads that make bb0 and bb1 have non-unreachable successors. I say "maybe" because I'm fuzzy on when landing pads are inserted and eliminated after analysis vs never inserted in the first place.

But in any case, it's very fragile to rely on pass ordering to such an extent, especially if it's not documented anywhere. In addition, there may be statements that can diverge (inline asm is currently an example, though unstable) and similar issues apply there.

So I think this pass needs some analysis to tell whether a basic block is known to reach its successors for sure, and it should be written to be conservatively correct, i.e., err on the side of falsely concluding that a basic block may diverge before reaching its successor. This may be a candidate for a generally useful MIR analysis, but it can also start as a one-off thing local to this pass.

@Centril

As an example, consider this program:

We should definitely add a mir-opt test ensuring this isn't mis-compiled.

In addition, there may be statements that can diverge (inline asm is currently an example, though unstable) and similar issues apply there.

Ugh; that seems ungreat. From what I can tell, inline asm is the only statement that can affect control flow -- there doesn't seem to be anything else. (Why is InlineAsm not a terminator?)

It seems to me that we would be on the safe side by limiting this optimization to SwitchInt terminators.

@hanna-kruppe

In addition, there may be statements that can diverge (inline asm is currently an example, though unstable) and similar issues apply there.

Ugh; that seems ungreat. From what I can tell, inline asm is the only statement that can affect control flow -- there doesn't seem to be anything else. (Why is InlineAsm it not a terminator?)

Note that inline asm can't affect control flow of the surrounding program in the typical sense: if any MIR instruction (statement or terminator) is executed after an inline asm, it is the one that comes next in the same basic block. Inline asm can only diverge by "never finishing" (e.g. by looping internally or terminating the program), not e.g. by returning or unwinding or jumping to a different MIR instructions. It also isn't a natural stopping point for basic blocks the way e.g. Abort or Unreachable are. So it should not be a basic block terminator in any classical sense and wouldn't be in many other compilers.

Inline asm isn't special in this respect, either. There are various operations that might "never finish" but could plausibly be statements (e.g. calls that can't unwind, a variant of Assert that aborts rather than panicking, or checked arithmetic that terminates the program rather than panicking). We don't currently have any such statements, but it seems entirely plausible that we might want to add some of them in the future (e.g. to have fewer and larger basic blocks, or to implement overflow checking with lower overhead).

The only way to justify making these kinds of statements into terminator is if we decide that MIR basic blocks should not only have the classical properties of basic blocks, but furthermore guarantee that execution of the terminator is guaranteed (absent external termination of the program, of course) when the basic block is entered. This seems like a strange property, as it's annoying to maintain (you have to convert things which could otherwise be statements into terminators and eat the costs of the extra basic blocks) and it's only of limited use to a few analyses which can also be written very reasonably in other ways.

It seems to me that we would be on the safe side by limiting this optimization to SwitchInt terminators.

Goto is also an easy one. Personally, I'd write out an exhaustive match terminator, that also provides a great opportunity for comments explaining why other terminators aren't eligible. Of course, none of this solves the problem with diverging statements.

@davidhewitt

Also at RustFest @oli-obk mentored me through a related mir optimisation pass which replaced std::intrinsics::unreachable() with the Unreachable terminator. I didn't open it as a PR because we agreed in the end it might just be better to just handle the unreachable intrinsic as part of the pass proposed here.

The only reason I can potentially think of for keeping the passes separate is if we always want to replace the unreachable() intrinsic with the terminator, but we decide we don't always want to do unreachable propagation? (or vice versa)

The WIP branch for that pass can be found here.

To integrate it into this PR one could do something like the following instead of the existing if let Unreachable = terminator.kind { ... }:

fn is_unreachable_intrinsic<'tcx>(kind: &TerminatorKind<'tcx>) -> bool {
    if let TerminatorKind::Call { func: Operand::Constant(c), .. } = kind {
        if let ty::FnDef(def_id, ..) = c.literal.ty.kind {
            if let Abi::RustIntrinsic = self.tcx.fn_sig(def_id).abi() {
                if let sym::unreachable = self.tcx.item_name(def_id) {
                    return true
                }
            }
        }
    }

    return false
}

// and later on, in the current implementation of this optimisation pass
match terminator.kind {
    TerminatorKind::Call if is_unreachable_intrinsic(&terminator.kind) |
    TerminatorKind::Unreachable => {
        // ... mark block as unreachable
    },
    _ => {
        // ... remove block if all successors unreachable
    }
}

The removal of the codegen for the unreachable intrinsic (in src/librustc_codegen_llvm/intrinsic.rs) and the definition of the unreachable symbol (in src/libsyntax_pos/symbol.rs) from that branch may also need to be copied here.

@gilescope

We could be conservative if we found a block containing inline asm. But diverging statements sounds suspiciously like trying to solve the haulting problem... can we enumerate the problem statement types or would that be too risky?

@hanna-kruppe

It's not hard at all to make a useful classification of statements and terminators into "can possibly diverge" vs "definitely can't diverge" (in contrast, all the classical results of computability theory concern decision problems like "definitely diverges" vs "definitely halts").

@gilescope

Great - I think one of the other optimisations @oli-obk was discussing was classifying statements that were definitely pure r-values so that we could then remove those statements (I assume following an unreachable?). Anything that might diverge would not be a pure r-value?

@JohnCSimon

Ping from triage
@ktrianta can you please address the build failures?
Thank you!

@rust-highfive

The job x86_64-gnu-llvm-6.0 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

2019-11-24T10:53:52.9939827Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-11-24T10:53:52.9955448Z ##[command]git config gc.auto 0
2019-11-24T10:53:52.9959602Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-11-24T10:53:52.9963924Z ##[command]git config --get-all http.proxy
2019-11-24T10:53:52.9968077Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/66329/merge:refs/remotes/pull/66329/merge
---
2019-11-24T10:59:27.9730045Z Found 0 error codes with no tests
2019-11-24T10:59:27.9730285Z Done!
2019-11-24T10:59:27.9730480Z 
2019-11-24T10:59:27.9730961Z 
2019-11-24T10:59:27.9733229Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor"
2019-11-24T10:59:27.9733803Z 
2019-11-24T10:59:27.9733926Z 
2019-11-24T10:59:27.9734928Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
2019-11-24T10:59:27.9735129Z Build completed unsuccessfully in 0:01:23
2019-11-24T10:59:27.9735129Z Build completed unsuccessfully in 0:01:23
2019-11-24T10:59:27.9782867Z == clock drift check ==
2019-11-24T10:59:27.9809005Z   local time: Sun Nov 24 10:59:27 UTC 2019
2019-11-24T10:59:28.2531808Z   network time: Sun, 24 Nov 2019 10:59:28 GMT
2019-11-24T10:59:28.2535305Z == end clock drift check ==
2019-11-24T10:59:29.6092456Z 
2019-11-24T10:59:29.6186459Z ##[error]Bash exited with code '1'.
2019-11-24T10:59:29.6239117Z ##[section]Starting: Checkout
2019-11-24T10:59:29.6240528Z ==============================================================================
2019-11-24T10:59:29.6240569Z Task         : Get sources
2019-11-24T10:59:29.6240620Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@ktrianta

@JohnCSimon working on it. Last build was started by accident.

@hdhoang

ping from triage @ktrianta any updates on this? no hurry though since it's the holiday season.

@gilescope

@ktrianta let me know if you want a hand. Wasn't quite sure how one helps in a PR - does one fork your fork and propose a PR (that you could choose to accept into your PR)?

@bjorn3

Yes, that is the way to help with a PR. Just make sure that you create a new branch from the branch of this PR and propose to merge your branch into the branch of this PR, and not master.

@bors

@bors bors added S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

and removed S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

labels

Jan 14, 2020

@lqd

looks like spurious network errors and crates.io 500/503 errors

@bors retry

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

labels

Jan 14, 2020

@bors

⌛ Testing commit 72710d6 with merge 3fe5b3728515aa7232c1198503606092977dad60...

@rust-highfive

The job dist-powerpc64le-linux of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bors

@bors bors added S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

and removed S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

labels

Jan 14, 2020

@mati865

2020-01-14T15:22:49.2560995Z warning: spurious network error (2 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/string_cache_codegen/0.4.2/download`, got 503
2020-01-14T15:22:49.2670705Z warning: spurious network error (2 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/rls-vfs/0.8.0/download`, got 503
2020-01-14T15:22:49.3234193Z warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/rls-vfs/0.8.0/download`, got 503
2020-01-14T15:22:49.3247165Z warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/string_cache_codegen/0.4.2/download`, got 503 

@oli-obk

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

labels

Jan 14, 2020

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request

Jan 14, 2020

@Dylan-DPC

…ation, r=oli-obk

Add unreachable propagation mir optimization pass

@oli-obk suggested we create a MIR pass that optimizes away basic blocks that lead only to basic blocks with terminator kind unreachable. This is a first take on this, which we started with @gilescope at RustFest Impl Days.

The test currently fails when the compiled program runs (undefined behaviour). Is there a way to avoid running the compiled program?

@bors

bors added a commit that referenced this pull request

Jan 15, 2020

@bors

…i-obk

Add unreachable propagation mir optimization pass

@oli-obk suggested we create a MIR pass that optimizes away basic blocks that lead only to basic blocks with terminator kind unreachable. This is a first take on this, which we started with @gilescope at RustFest Impl Days.

The test currently fails when the compiled program runs (undefined behaviour). Is there a way to avoid running the compiled program?

@bors

@bors bors mentioned this pull request

Jan 15, 2020

Reviewers

@oli-obk oli-obk oli-obk requested changes

@gilescope gilescope gilescope left review comments

@wesleywiser wesleywiser wesleywiser left review comments

@Centril Centril Centril left review comments

@tesuji tesuji tesuji left review comments

@bjorn3 bjorn3 bjorn3 left review comments

Labels

merged-by-bors

This PR was explicitly merged by bors.

S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.