[LLVMdev] RFC: Proposal for Poison Semantics (original) (raw)

Mehdi Amini mehdi.amini at apple.com
Tue Jan 27 22:42:57 PST 2015


On Jan 27, 2015, at 10:29 PM, David Majnemer <david.majnemer at gmail.com> wrote:

On Tue, Jan 27, 2015 at 10:18 PM, Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote: > On Jan 27, 2015, at 6:50 PM, David Majnemer <david.majnemer at gmail.com <mailto:david.majnemer at gmail.com>> wrote: > > Hello, > > What follows is my attempt to describe how poison works. Let me know what you think. > > -- > David > > > # LLVM Poison Semantics > > Poison is an LLVM concept which exists solely to enable further optimization of LLVM IR. The exact behavior of poison has been, to say the least, confusing for users, researchers and engineers working with LLVM. > > This document hopes to clear up some of the confusion of poison and hopefully explain why it has its semantics. > > ## A Quick Introduction to Poison > > Let's start with a concrete motivating example in C: > _ _> int isSumGreater(int a, int b) {_ _> return a + b > a;_ _> }_ _> > > The C specification permits us to optimize the comparison in isSumGreater to b > 0 because signed overflow results in undefined behavior. A reasonable translation of isSumGreater to LLVM IR could be: Typo: to b > 0 should be to a + b > 0 IIUC. I mention in case you intend to publish this in the doc. I intended what I wrote. We perform this optimization in InstCombine today: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?revision=226783&view=markup#l3193 <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?revision=226783&view=markup#l3193> Yeah sorry I was completely wrong.

Mehdi

Mehdi

> > _ _> define i32 @isSumGreater(i32 %a, i32 %b) {_ _> entry:_ _> %add = add i32 %a, %b_ _> %cmp = icmp sgt i32 %add, %a_ _> %conv = zext i1 %cmp to i32_ _> ret i32 %conv_ _> }_ _> > > However, LLVM cannot determine that %cmp should not consider cases where %add resulted in signed overflow. We need a way to communicate this information to LLVM. > > This is where the nsw and nuw flags come into play. nsw is short for "no signed wrap", nuw is short for "no unsigned wrap". > > With these, we can come up with a new formulation of %add: add i32 nsw %a, %b. > LLVM can take this into account when it is optimizing the %cmp and replace it with: icmp sgt i32 %b, 0. > > ## Differences Between LLVM and C/C++ > > There are some interesting differences between what C++ and C specify and how LLVM behaves with respect to performing an operationg which is not permitted to overflow. > > Perhaps chief among them is that evaluating an expression in C++ or C which results performs an overflow is undefined behavior. In LLVM, executing an instruction which is marked nsw but which violates signed overflow results in poison. Values which have no relationship with poisoned values are not effected by them. > > Let us take the following C program into consideration: > _ _> int calculateImportantResult(int a, int b) {_ _> int result = 0;_ _> if (a) {_ _> result = a + b;_ _> }_ _> return result;_ _> }_ _> > > A straightforward lowering to LLVM IR could be: > _ _> define i32 @calculateImportantResult(i32 %a, i32 %b) {_ _> entry:_ _> %tobool = icmp ne i32 %a, 0_ _> br i1 %tobool, label %if.then, label %if.end_ _>_ _> if.then:_ _> %add = add nsw i32 %a, %b_ _> br label %if.end_ _>_ _> if.end:_ _> %result = phi i32 [ %add, %if.then ], [ 0, %entry ]_ _> ret i32 %result_ _> }_ _> > > Moving %add to the %entry block would be preferable and would allow further optimizations: > _ _> define i32 @calculateImportantResult(i32 %a, i32 %b) {_ _> entry:_ _> %tobool = icmp ne i32 %a, 0_ _> %add = add nsw i32 %a, %b_ _> %result = select i1 %tobool, i32 0, i32 %add_ _> ret i32 %result_ _> }_ _> > > In the original code, the calculation of %add was control dependent. > Now, %add might result in signed overflow in violation of the nsw flag. > Despite this, the program should behave as it did before because the poisoned value is masked-out by the select. The next section will dive into this in greater detail. > > # Computation Involving Poison Values > Poison in a computation results in poison if the result cannot be constrained by its non-poison operands. > > Examples of this rule which will result in poison: > _ _> %add = add i32 %x, %alwayspoison_ _> %sub = sub i32 %x, %alwayspoison_ _> %xor = xor i32 %x, %alwayspoison_ _> %mul = mul i32 %alwayspoison, 1_ _> > > Examples of this rule which do not result in poison: > _ _> %or = or i32 %alwayspoison, 2_ _> %and = and i32 %alwayspoison, 2_ _> %mul = mul i32 %alwayspoison, 0_ _> > > In fact, it would be reasonable to optimize %or to 2 and %and to 0. In this respect, poison is not different from undef. > > The following example is only poison if %cond is false: > _ _> %sel = select i1 %cond, i32 2, %alwayspoison_ _> > > ### Is it safe to have poison as a call argument? > > A call instruction may or may not result in poison depending on exactly how the callee uses the supplied arguments, it is not necessarily the case that call i32 @someFunction(i32 %alwayspoison) results in poison. > > LLVM cannot forbid poison from entering call arguments without prohibiting an optimization pass from outlining code. > > ### Is it safe to store poison to memory? > > store i32 %alwayspoison, i32* %mem does not result in undefined behavior. A subsequent load instruction like %load = load i32* %mem will result in %load being a poison value. > > ### Is it safe to load or store a poison memory location? > > No. Poison works just like undef in this respect. > > ### Does comparing a poison value result in poison? > > It depends. If the comparison couldn't solely be determined by looking at the other operand, the result is poison. > > For example, icmp i32 ule %alwayspoison, 4294967295 is true, not poison. > However, icmp i32 ne %alwayspoison, 7 is poison. > > ### What if the condition operand in a select is poison? > > In the example %sel = select i1 %alwayspoison, i1 true, false, %sel is either true or false. Because, %sel depends on %alwayspoison it too is poison. _> ________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu> http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>

-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150127/c69dc6f8/attachment.html>



More information about the llvm-dev mailing list