ORCv2 (LLJIT): reusing the llvm::Module

Overview: I used the ORCv2 API to build a small binary rewriting/emulation proof-of-concept similar to vmill (or instrew), lifting binary code to LLVM-IR via Remill (or Rellume), but relying on a non-deprecated JIT engine.

Context: I want to use a single llvm::Module, adding it to an LLJIT instance at the beginning and then incrementally adding new functions to it, which the JIT should compile and execute when needed. The simplified execution loop is:

1. A new LLJIT instance and a new llvm::Module (referred to as MOD) are created. MOD is then added to the LLJIT instance via addIRModule.
2. A new basic block is lifted from binary to LLVM-IR, creating a new function in MOD.
3. LLJIT processes MOD when the lookup function is called; the basic block lifted in step 2 is recompiled and executed, yielding the address(es) of the next basic block(s) to lift.
4. Go to step 2 until all the basic blocks we care about have been processed.

My understanding is that changes made to MOD (e.g. the creation of a new function) after it has been added to the LLJIT instance via addIRModule won’t be seen, and therefore won’t be compiled, when the lookup function is invoked.

This leads me to think that I’d need to lift each new basic block into a dedicated llvm::Module, call addIRModule to let the LLJIT instance know about it, and finally invoke lookup to trigger the compilation and execution of the newly recompiled basic block.

Questions:

1. Is there a way to incrementally update an llvm::Module that has already been added to an LLJIT instance (or to lower-level ORCv2 classes) such that the new functions are visible when lookup is invoked?
2. If not, is the use case I described completely out of scope for the ORCv2 API? It looks like others have run into the same question in the past, in the threads "incremental code generation" and "reusing llvm::Module for JIT-ting".