[LLVMdev] Adding masked vector load and store intrinsics

Smith, Kevin B kevin.b.smith at intel.com
Fri Oct 24 12:40:52 PDT 2014


How would one express such semantics in LLVM IR with this intrinsic? By definition, %data and %passthrough are different IR virtual registers and there are no copy instructions in LLVM IR.

You never need to express this semantic in LLVM IR, because in SSA form the result of an operation is always a different SSA def from the inputs to the operation. Someplace late in the CG needs to handle this, in exactly the same way it already has to handle the mapping to regular X86 two-address code.

For example, this LLVM IR

%add = add nsw i32 %b, %a

gets converted into

*** IR Dump After Expand ISel Pseudo-instructions ***:

Machine code for function foo: SSA

Function Live Ins: %EDI in %vreg0, %ESI in %vreg1

BB#0: derived from LLVM BB %entry
    Live Ins: %EDI %ESI
        %vreg1 = COPY %ESI; GR32:%vreg1
        %vreg0 = COPY %EDI; GR32:%vreg0
        %vreg2<def,tied1> = ADD32rr %vreg1, %vreg0, %EFLAGS<imp-def,dead>; GR32:%vreg2,%vreg1,%vreg0

in ISEL. So the necessary instruction semantic needn't be represented in LLVM IR. It is created once you have to map to "real" machine instructions using virtual registers, where copies, and the ability to mark a destination and a source as "tied" together, are representable.
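The copy-plus-tied-operand step described here can be sketched outside of LLVM as a small C++ model. This is not LLVM's two-address pass: the struct `ThreeAddr` and the function `to_two_address` are hypothetical names made up for illustration, and the sketch assumes SSA input, so the destination is a fresh def that never aliases either source.

```cpp
#include <string>
#include <vector>

// Hypothetical three-address instruction in SSA form: dst = op src1, src2.
struct ThreeAddr {
    std::string op, dst, src1, src2;
};

// Rewrite into two-address form, where the destination register must be the
// same register as the first source (the "tied" operand).
// Assumption: dst differs from src1 and src2, as it always does in SSA.
std::vector<std::string> to_two_address(const ThreeAddr &inst) {
    std::vector<std::string> out;
    // Copy the tied source into the destination register first.
    out.push_back(inst.dst + " = COPY " + inst.src1);
    // The operation now reads and writes the same register.
    out.push_back(inst.dst + " = " + inst.op + " " + inst.dst + ", " + inst.src2);
    return out;
}
```

Applied to the ADD32rr from the dump, with %vreg2 tied to %vreg1, this yields a COPY of %vreg1 into %vreg2 followed by an ADD32rr that uses %vreg2 as both destination and first source. A masked load with a %passthrough operand could be handled the same way, with the passthrough playing the role of the tied source.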

Kevin

-----Original Message-----
From: dag at cray.com [mailto:dag at cray.com]
Sent: Friday, October 24, 2014 12:23 PM
To: Smith, Kevin B
Cc: Demikhovsky, Elena; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics

"Smith, Kevin B" <kevin.b.smith at intel.com> writes:

So %passthrough can only be undef or zeroinitializer?

No, that wasn't the intent. %passthrough can be any other definition that is needed. Zero and undef were simply two possible values that illustrated some interesting behavior.

Mapping of the %passthrough to the actual semantics of many vector instruction sets where the masked instructions leave the masked-off elements of the destination unchanged is done in a similar manner as three-address instructions are turned into two address instructions, by placing a copy as necessary so that dest and passthrough are in the same register.

How would one express such semantics in LLVM IR with this intrinsic? By definition, %data and %passthrough are different IR virtual registers and there are no copy instructions in LLVM IR.

In the more general case:

%b = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32> %a, i32 4, <8 x i1> %mask)

where %a and %b have no relation to each other, I presume the backend would be responsible for doing a select/merge after the load if the ISA didn't directly support the merge as part of the load operation. Right?

                             -David
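For readers following the thread, the per-lane behaviour under discussion can be written down as a scalar C++ reference model. This is only a sketch of the semantics as described in these messages, not LLVM code: `masked_load_model` is a hypothetical name, the alignment operand of the intrinsic is omitted, and the mask is modelled as an array of `bool`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a masked load with a passthrough operand (hypothetical helper).
// Enabled lanes come from memory; disabled lanes keep the passthrough value.
template <std::size_t N>
std::array<std::int32_t, N> masked_load_model(
    const std::int32_t *addr,
    const std::array<std::int32_t, N> &passthrough,
    const std::array<bool, N> &mask) {
    // Start from the passthrough value. On an ISA whose masked load leaves
    // masked-off destination lanes unchanged, this is the copy that puts
    // dest and passthrough in the same register.
    std::array<std::int32_t, N> result = passthrough;
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            result[i] = addr[i];  // memory is read only for enabled lanes
        }
    }
    return result;
}
```

On a target without a merging masked load, the same result would have to come from a load that is safe for the masked-off lanes followed by a select against the passthrough value, which is the select/merge raised in the question above.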

