
Hello,

I'm a newbie here, working on a project to enforce Control Flow Integrity (CFI) on programs compiled with LLVM. We're using LLVM 3.3 so we can leverage poolalloc's DSA analysis. Ideally this will be as target-independent as possible, but our primary target is ARM. One of our passes requires inserting different i32 IDs at various points into the code we're compiling. As far as I can tell, this is impossible with just LLVM IR, so we're looking into ways of getting these IDs through the CodeGen.

One thing that looked promising is the function "prefix" value in LLVM 3.4, which is able to emit a global value into the asm. This is the right idea, except we need it at arbitrary points in the code. We then looked at defining a custom intrinsic function (@llvm.cfiid) that we can insert into the IR and then lower to assembly. That didn't seem to be exactly what we wanted either, because the asm that is generated has to be target-dependent. We've checked out the poolalloc/SAFECode projects and there are some helpful analysis tools, but we didn't find anything relevant to ID lowering.

Our current thrust is to define a custom target intrinsic function (@llvm.arm.cfiid) that we can insert into the IR and lower using a definition in the ARMInstrInfo.td file. Right now, I'm trying to define the pattern and instruction in that file. At first, I just inserted a pattern to lower our intrinsic into a "trap" instruction, which worked fine:

/* Code in IR/IntrinsicsARM.td */
/* Note, I'm not positive that IntrNoReturn is correct here, but the IntrNoMem type wouldn't lower to an SDNode because of the lack of "results" */
def int_arm_cfiid : Intrinsic<[], [llvm_i32_ty], [IntrNoReturn]>;

/* Code in Target/ARM/ARMInstrInfo.td */
def : Pat<(int_arm_cfiid (i32 imm)),
          (TRAP)>;
...

Next, I'm trying to create my own "AXI" definition based on the TRAP definition, and then put that into the pattern. I admit that I don't fully grok the TableGen syntax, so a lot of what I've been doing is trial and error, based on examples in other *.td files.

Here's what I think I'm shooting for...

/* Code in Target/ARM/ARMInstrInfo.td */
def ARMCFIID : AXI<(outs), (ins i32imm:$opt), MiscFrm, NoItinerary,
                   "cfiid", "\t$opt", [(int_arm_cfiid i32imm:$opt)]>,
               Requires<[IsARM]> {
  bits<32> opt;
  let Inst{31-0} = opt;
}
...

I realize this is very wrong, but it should give you an idea of what I'm trying to do: basically, take the i32 param of the intrinsic and encode it as raw bytes. Obviously, this is broken.
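
To make "raw bytes" concrete, here is a minimal C++ sketch of the byte sequence I would expect to end up in the output for a given ID. The helper name encodeCfiId is hypothetical (it is not part of LLVM), and the byte order assumes a little-endian ARM target.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper, not an LLVM API: split a 32-bit CFI ID into the four
// bytes that should appear in the instruction stream.
// Assumption: the target is little-endian, so the low byte comes first.
std::array<std::uint8_t, 4> encodeCfiId(std::uint32_t id) {
    return {
        static_cast<std::uint8_t>(id & 0xFF),          // bits 7..0
        static_cast<std::uint8_t>((id >> 8) & 0xFF),   // bits 15..8
        static_cast<std::uint8_t>((id >> 16) & 0xFF),  // bits 23..16
        static_cast<std::uint8_t>((id >> 24) & 0xFF),  // bits 31..24
    };
}
```

For example, the ID 0x12345678 should show up in memory as the bytes 78 56 34 12, the same as if the assembler had been given ".word 0x12345678".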

TL;DR:
  • What's the best way to lower an IR i32 into code as raw bytes?
  • If an Intrinsic is the answer, can it be done entirely in the TableGen files, or do I need to do some SDNode stuff as well?
  • If a TargetIntrinsic is the answer, what's the proper syntax to define an ARM Instruction and match it with my intrinsic pattern?

Sorry if this is pretty basic stuff. I've been looking at the archives and couldn't find any other threads that worked for me.

Also, I noticed that there is an llvm-devs Google group as well. Is it a faux pas to cross-post to that list too, or are these lists disjoint enough that it wouldn't be spammy?


Thanks,
Joe

--
Joseph Battaglia
M.S. Information Security '14
Information Networking Institute
Carnegie Mellon University