[llvm-dev] [RFC] Adding CPS call support (original) (raw)

Friedman, Eli via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 17 11:47:42 PDT 2017


On 4/17/2017 8:30 AM, Kavon Farvardin via llvm-dev wrote:

Summary =======

There is a need for dedicated continuation-passing style (CPS) calls in LLVM to support functional languages. Herein I describe the problem and propose a solution. Feedback and/or tips are greatly appreciated, as our goal is to implement these changes so they can be merged into LLVM trunk.

Problem ======= Implementations of functional languages like Haskell and ML (e.g., GHC and Manticore) use a continuation-passing style (CPS) transformation in order to manage the call stack explicitly. This is done prior to generating LLVM IR, so the implicit call stack within LLVM is not used for call and return. When making a non-tail call while in CPS, we initialize a stack frame for the return through our own stack pointer, and then pass that stack pointer to the callee when we jump to it. It is here when we run into a problem in LLVM. Consider the following CPS call to @bar and how it will return: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; define void @foo (i8** %sp, ...) { someBlk: ; ... ; finish stack frame by writing return address %retAddr = blockaddress(@foo, %retpt) store i8* %retAddr, i8** %sp ; jump to @bar tail call void @bar(i8** %sp, ...) retpt: ; <- how can @bar "call" %retpt? %sp2 = ??? %val = ??? ; ... } define void @bar (i8** %sp, ...) { ; perform a return %retAddr0 = load i8*, i8** %sp %retAddr1 = bitcast i8* %retAddr0 to void (i8**, i64)* %val = bitcast i64 1 to i64 ; jump back to %retpt in @foo, passing %sp and %val tail call void %retAddr1(i8** %sp, i64 %val) } ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; There is currently no way to jump back to %retpt from another function, as block addresses have restricted usage in LLVM [1]. Our main difficulty is that we cannot jump to a block address without knowing its calling convention, i.e., the particular machine registers (or memory locations) that the block expects incoming values to be passed in. The workaround we have been using in GHC for LLVM is to break apart every function, placing the code for the continuation of each call into a new function. We do this only so that we can store a function pointer instead of a block address to our stack. This particularly gross transformation inhibits optimizations in both GHC and LLVM, and we would like to remove the need for it. Proposal ======== I believe the lowest-impact method of fixing this problem with LLVM is the following: First, we add a special 'cps' call instruction marker to be used on non-tail calls. Then, we use a specialized calling convention for these non-tail calls, which fix the returned values to specific locations in the machine code [2]. To help illustrate what's going on, let's rewrite the above example using the proposed 'cps' call: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; define { ... } @foo (i8** %sp, ...) { someBlk: ; ... ; finish stack frame by writing return address %retAddr = blockaddress(@foo, %retpt) store i8* %retAddr, i8** %sp ; jump to @bar %retVals = cps call ghccc {i8**, i64} @bar (i8** %sp, ...) br label %retpt

I'm not following how explicitly representing the return address of a call in the IR before isel actually solves any relevant issue. We already pass the return address implicitly as an argument to every call; you can retrieve it with llvm.returnaddress if you need it.

You can't branch across functions like you're proposing because the stack and callee-save registers won't be in the right state. LLVM will inevitably save values to the stack for calls which are not tail calls.
Therefore, this proposal simply doesn't work.

-Eli

-- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project



More information about the llvm-dev mailing list