Premature return statement inside if statement body

I am currently using LLVM to create a compiler for a C-like, imperative language. The code generation is pretty straightforward: I simply visit the Abstract Syntax Tree to emit IR.

if statements are implemented this way:

br i1 %something, label %if.then, label %if.else

if.then:
  ...
  br label %if.end

if.else:
  ...
  br label %if.end

if.end:
  ...

And return statements just emit ret i64 %something.

But this leads to invalid IR. When I have a return statement inside an if statement, I end up with ret i64 %something followed by br label %if.end. However, a basic block can only contain one terminator instruction, and it must be the last instruction in the block.

I know I could just remove the br label %if.end that follows the ret, but in my current implementation of the compiler, the if statement and the statements inside the if body are not “aware” of each other, so they cannot conditionally emit IR based on one another (the if statement naively emits the statements inside its body).

One other thing I find strange is that clang compiles the invalid IR without even complaining, and the resulting machine code seems to be perfectly fine. However, when I run the verify pass on the resulting IR module, I do get the error about multiple terminators inside basic blocks.

How do I produce valid IR in this case while keeping my code generation relatively simple?

kparzysz June 14, 2024, 12:47pm 2

After you insert the return, you can simply split the basic block at the position immediately following it. The remaining part will be unreachable, and it will eventually be removed.
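A minimal sketch of this idea, assuming a toy emitter that builds textual IR as a list of lines. The Emitter class, its methods, and the after.ret label prefix are hypothetical names for illustration, not LLVM APIs. With the LLVM C++ API, the equivalent step would roughly be creating a new block with BasicBlock::Create and moving the builder there with IRBuilder::SetInsertPoint.

```python
class Emitter:
    """Toy textual-IR emitter (hypothetical, not an LLVM API)."""

    def __init__(self):
        self.lines = []
        self.counter = 0

    def fresh(self, prefix):
        # Generate a unique label name such as "after.ret1".
        self.counter += 1
        return f"{prefix}{self.counter}"

    def label(self, name):
        self.lines.append(f"{name}:")

    def inst(self, text):
        self.lines.append(f"  {text}")

    def emit_return(self, value):
        self.inst(f"ret i64 {value}")
        # Open a fresh block right after the ret, so whatever the caller
        # emits next (e.g. "br label %if.end") lands in its own block.
        # That block has no predecessors, i.e. it is unreachable.
        self.label(self.fresh("after.ret"))


def emit_if(em, cond, then_body, else_body):
    # The naive if-statement codegen from the question, unchanged:
    # it always emits the trailing branches to if.end.
    em.inst(f"br i1 {cond}, label %if.then, label %if.else")
    em.label("if.then")
    then_body(em)
    em.inst("br label %if.end")
    em.label("if.else")
    else_body(em)
    em.inst("br label %if.end")
    em.label("if.end")


def terminators_per_block(lines):
    """Count terminators (ret/br) in each block, keyed by label."""
    counts, current = {"entry": 0}, "entry"
    for line in lines:
        if line.endswith(":"):
            current = line[:-1]
            counts[current] = 0
        elif line.strip().startswith(("ret ", "br ")):
            counts[current] += 1
    return counts


em = Emitter()
emit_if(em, "%c",
        lambda e: e.emit_return("%x"),
        lambda e: e.inst("call void @foo()"))
em.inst("ret i64 0")
print("\n".join(em.lines))
```

In this sketch the if-statement code stays unaware of the return: the stray br label %if.end ends up alone in the after.ret1 block, so every block has exactly one terminator.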

pogo59 June 14, 2024, 12:59pm 3

But the if-generation would have to be aware of the split, meaning the codegen for the then-part would have to communicate something back to the caller. Might as well just return an “I’ve already terminated the block” flag.

This communication presumably needs to happen anyway to handle other nested control flow (like nested ifs, or a loop inside of the then block).
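A rough sketch of the flag approach, assuming a toy AST of tuples and textual IR collected in a list. All function names and the tuple layout here are made up for illustration. Each emit function returns True when it has already terminated the current block, and the if codegen uses that to decide whether to emit the branch to the end block.

```python
import itertools


def emit_stmt(out, stmt, ids):
    """Emit one statement; return True if it terminated the current block."""
    kind = stmt[0]
    if kind == "return":
        out.append(f"  ret i64 {stmt[1]}")
        return True
    if kind == "call":
        out.append(f"  call void @{stmt[1]}()")
        return False
    if kind == "if":
        return emit_if(out, stmt[1], stmt[2], stmt[3], ids)
    raise ValueError(f"unknown statement kind: {kind}")


def emit_body(out, stmts, ids):
    """Emit a statement list; stop at the first terminating statement."""
    for stmt in stmts:
        if emit_stmt(out, stmt, ids):
            return True  # anything after this point is dead code
    return False


def emit_if(out, cond, then_body, else_body, ids):
    n = next(ids)  # unique suffix so nested ifs get distinct labels
    then_l, else_l, end_l = f"if.then{n}", f"if.else{n}", f"if.end{n}"
    out.append(f"  br i1 {cond}, label %{then_l}, label %{else_l}")

    out.append(f"{then_l}:")
    then_done = emit_body(out, then_body, ids)
    if not then_done:
        out.append(f"  br label %{end_l}")

    out.append(f"{else_l}:")
    else_done = emit_body(out, else_body, ids)
    if not else_done:
        out.append(f"  br label %{end_l}")

    if then_done and else_done:
        # No path reaches the end block, so do not open it; the
        # enclosing statement list is terminated as well.
        return True
    out.append(f"{end_l}:")
    return False


def compile_body(stmts):
    """Return (IR lines, whether the body ended with a terminator)."""
    out = []
    terminated = emit_body(out, stmts, itertools.count(1))
    return out, terminated


program = [
    ("if", "%c", [("return", "%x")], [("call", "foo")]),
    ("return", "0"),
]
lines, terminated = compile_body(program)
print("\n".join(lines))
```

Because emit_if itself returns the flag, the same mechanism covers nested ifs: an inner if whose branches both return reports True to the outer one, which then skips its own branch to the end block.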

Thanks everybody for the help. I ended up communicating an “I’ve terminated the block” flag to the caller, and it seems to work fine.