[llvm-dev] [RFC] Introducing an explicit calling convention

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 15 15:02:40 PST 2019


I generally support the goal - we have the same problem described in a bit of detail just below - but I'm really not sure about the framing of the solution here.

Our variation on the problem is that we have many distinct calling conventions which are slight variations of each other. We're adapting a legacy collection of hand-written assembly stubs, each of which had a slightly different calling convention.  The typical difference is that one stub might kill a register that another preserves.  At the moment, we've solved this through a mixture of a bunch of custom calling conventions declared downstream, and normalizing stubs where possible.  The former is tedious; the latter is rather error prone. The key thing for us is that the variation between stubs is small and mostly in the callee-saved lists.

On the framing piece, the ABI is really a property of the callee, not of the arguments.  As such, I think this really deserves to be either a first-class syntax for spelling a calling convention, or an attribute.  I'd suggest framing the description of the calling convention as "like this existing calling convention XXX, but w/o this callee-saved register" or "like this existing calling convention XYZ, but with one argument register defined".  Possible spellings might include:

    declare "C" {i64, i64} @example(i64 %a, i64 %b) ccoverride(noclobber={rax, rbx}, arg_reg_order={rcx, rdx}, ret_arg_regs={rdx, rcx})

    declare "C" {i64, i64} @example(i64 %a, i64 %b) ccoverride="(noclobber={rax, rbx}, arg_reg_order={rcx, rdx}, ret_arg_regs={rdx, rcx})"

(The second one abuses String attributes horribly, but would be the easiest to implement.)
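To make the "like XXX, but ..." framing concrete, here is a minimal Python sketch that models it as data. It is not LLVM code: `CallingConv`, `SYSV_C`, and the `ccoverride` helper are hypothetical names used only for illustration, and the semantics chosen for each override (noclobber adds to the callee-saved set, the two register lists replace the base lists) are an assumption about how the spelling above might be interpreted. The base register lists are the integer registers of the x86-64 System V C convention.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class CallingConv:
    """Hypothetical description of a calling convention (not an LLVM type)."""
    name: str
    arg_regs: Tuple[str, ...]
    ret_regs: Tuple[str, ...]
    callee_saved: FrozenSet[str]


# x86-64 System V "C" convention, integer registers only.
SYSV_C = CallingConv(
    name="C",
    arg_regs=("rdi", "rsi", "rdx", "rcx", "r8", "r9"),
    ret_regs=("rax", "rdx"),
    callee_saved=frozenset({"rbx", "rbp", "r12", "r13", "r14", "r15"}),
)


def ccoverride(base: CallingConv,
               noclobber: Iterable[str] = (),
               arg_reg_order: Optional[Iterable[str]] = None,
               ret_arg_regs: Optional[Iterable[str]] = None) -> CallingConv:
    """Derive a new convention from `base`, changing only what is overridden."""
    arg_regs = tuple(arg_reg_order) if arg_reg_order is not None else base.arg_regs
    ret_regs = tuple(ret_arg_regs) if ret_arg_regs is not None else base.ret_regs
    # Assumed semantics: noclobber extends the base callee-saved set.
    callee_saved = base.callee_saved | frozenset(noclobber)
    # A register cannot both carry a return value and be preserved.
    clash = callee_saved & set(ret_regs)
    if clash:
        raise ValueError(f"preserved registers used for return: {sorted(clash)}")
    return replace(base, name=base.name + "+override", arg_regs=arg_regs,
                   ret_regs=ret_regs, callee_saved=callee_saved)


# The declaration above, expressed against the base "C" convention.
example_cc = ccoverride(SYSV_C,
                        noclobber={"rax", "rbx"},
                        arg_reg_order=("rcx", "rdx"),
                        ret_arg_regs=("rdx", "rcx"))
```

The point of the sketch is that each stub's convention is a small delta against a shared base, so a stub that merely kills or preserves one extra register needs one override rather than a whole new convention.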

An alternate way of framing this would be to provide a clean interface for plugging in an externally defined calling convention.  This would require a custom driver, but would avoid the need to rebuild LLVM.  This would solve our problem cleanly - and is probably what I'd get around to implementing someday - but I'm not sure how it matches your original use case.

Philip

On 1/15/19 12:20 AM, Frej Drejhammar via llvm-dev wrote:

Hi All,

TLDR: Allow calling conventions to be defined on-the-fly for functions in LLVM-IR; comments are requested on the mechanism and syntax.

Summary
=======

This is a proposal for adding a mechanism by which LLVM can be used to generate code fragments adhering to an arbitrary calling convention. Intended use cases are: generating code intended to be called from the shadow of a stackmap or patchpoint; generating the target function of a statepoint; or simply generating a piece of shell-code during reverse engineering or binary patching.

Motivation
==========

The LLVM assembly language provides stackmaps, patchpoints, and statepoints, which all provide the user with the value or storage location of operands given to the respective intrinsic. All three intrinsics emit the information in a special stackmap section [3] of the produced object file.

These three intrinsics are useful to the implementer of a JIT-compiler for an interpreted language [2] (the author's use case), as stackmaps can be used to incrementally extend blocks of native code and a statepoint can be used both as a mechanism to call native code and as a landing-pad for reentry from native code. Other uses, such as inserting a stackmap and later overwriting its shadow with a call to a logging function, are also possible.

The information in the stackmap section can be seen as a custom calling convention which is unique to this particular location. Unfortunately there is currently no way to define the details of an LLVM calling convention dynamically, as LLVM only allows the user to choose among a fixed set of predefined conventions.

Approach
========

This proposal adds a new calling convention called 'explicitcc', which can be applied to void functions. A function using the explicit calling convention requires that each element of the argument list has a parameter attribute 'hwreg(metadata)' specifying the register from which the argument gets its value.

An 'explicit' function can have an optional 'noclobber(metadata)' function attribute to tell the compiler which registers are to be treated as callee-saved.

Additionally, a new '@llvm.experimental.retwr(...)' (standing for return with registers) intrinsic is introduced. By giving each parameter to retwr a hwreg attribute, it allows the 'explicit' function to return to its caller with a defined register state.

Only parameters passed in registers are considered, as the llvm.addressofreturnaddress intrinsic can be used to calculate the location of values on the caller's stack.

Example
=======

The following is a function which exchanges the values of rcx and rdx without clobbering rax and rbx.

    define explicitcc void @example(i64 hwreg(metadata !1) %a,
                                    i64 hwreg(metadata !2) %b)
        noclobber(metadata !0) {
      call void (...) @llvm.experimental.retwr(i64 hwreg(metadata !2) %a,
                                               i64 hwreg(metadata !1) %b)
      ret void
    }

    !0 = !{!"rax", !"rbx"}
    !1 = !{!"rcx"}
    !2 = !{!"rdx"}

Open Questions
==============

Are parameter attributes the best way to encode the register information? The metadata reference requires adding a pointer to the ISD::ArgFlagsTy struct, thus growing it by 50%; is this acceptable?

An alternative could be to instead have a function attribute which points to a metadata tuple with the explicit registers.

References
==========

[1] https://llvm.org/docs/StackMaps.html#stack-map-format
[2] https://dl.acm.org/citation.cfm?id=2633450
[3] https://llvm.org/docs/StackMaps.html#stack-map-section

Regards,

--Frej Drejhammar
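The example in the quoted proposal can be read as a transformation of register state. The following Python sketch simulates that reading; it is only a model of the intended behaviour as described in the text, not something LLVM provides. The function `example`, the helper `check_noclobber`, and the dictionary representation of registers are all hypothetical.

```python
# Register names taken from the metadata nodes in the proposal's example.
NOCLOBBER = ("rax", "rbx")  # !0: registers treated as callee-saved
HWREG_A = "rcx"             # !1: register bound to %a on entry
HWREG_B = "rdx"             # !2: register bound to %b on entry


def example(regs):
    """Model of @example: read %a and %b from their hwreg registers,
    then 'return with registers' so that the two values are exchanged."""
    a = regs[HWREG_A]
    b = regs[HWREG_B]
    out = dict(regs)
    # retwr(i64 hwreg(!2) %a, i64 hwreg(!1) %b): %a goes to rdx, %b to rcx.
    out[HWREG_B] = a
    out[HWREG_A] = b
    return out


def check_noclobber(before, after, preserved=NOCLOBBER):
    """True if every register in `preserved` has the same value afterwards."""
    return all(before[r] == after[r] for r in preserved)


before = {"rax": 1, "rbx": 2, "rcx": 3, "rdx": 4}
after = example(before)
```

Under this model the caller observes rcx and rdx exchanged while rax and rbx keep their values, which is the contract the hwreg and noclobber attributes are meant to express.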

