LoongArch Function Attributes (Using the GNU Compiler Collection (GCC))
target (option,...)
The following target-specific function attributes are available for the LoongArch target. These options mirror the behavior of similar command-line options (see LoongArch Options), but on a per-function basis.
strict-align
no-strict-align
strict-align indicates that the compiler should not assume that unaligned memory references are handled by the system. To allow the compiler to assume that unaligned memory references are handled by the system, the inverse attribute no-strict-align can be specified. The behavior is the same as for the command-line options -mstrict-align and -mno-strict-align.
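As an illustration, the sketch below applies both attributes to two otherwise identical functions that read a 32-bit value from a possibly unaligned address. The LA_TARGET macro and the function names are hypothetical helpers introduced here; the macro expands to nothing on non-LoongArch targets so the code still builds there. With strict-align the compiler is expected to avoid a single unaligned word load, while no-strict-align leaves it free to use one.

```c
#include <stdint.h>
#include <string.h>

/* LA_TARGET is a hypothetical helper, not part of GCC: it applies the
   target attribute only when compiling for LoongArch.  */
#if defined(__loongarch__)
# define LA_TARGET(opts) __attribute__((target(opts)))
#else
# define LA_TARGET(opts)
#endif

/* The compiler must not assume unaligned accesses work here.  */
LA_TARGET("strict-align")
uint32_t read_u32_strict(const unsigned char *p)
{
  uint32_t v;
  memcpy(&v, p, sizeof v);   /* well-defined for any alignment of p */
  return v;
}

/* The compiler may assume unaligned accesses are handled.  */
LA_TARGET("no-strict-align")
uint32_t read_u32_relaxed(const unsigned char *p)
{
  uint32_t v;
  memcpy(&v, p, sizeof v);
  return v;
}
```

Both functions return the same value; only the instruction sequence chosen for the copy may differ.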
cmodel=
Indicates that code should be generated for a particular code model for this function. The behavior and permissible arguments are the same as for the command-line option -mcmodel=.
arch=
Specifies the architecture version and architectural extensions to use for this function. The behavior and permissible arguments are the same as for the -march= command-line option.
tune=
Specifies the core for which to tune the performance of this function. The behavior and permissible arguments are the same as for the -mtune= command-line option.
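A minimal sketch of the three value-taking attributes follows. The argument values (la64v1.1, la464, medium) are assumed to be accepted by the corresponding -march=, -mtune= and -mcmodel= options of your GCC version; LA_TARGET and the function names are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* LA_TARGET is a hypothetical helper, not part of GCC: it applies the
   target attribute only when compiling for LoongArch.  */
#if defined(__loongarch__)
# define LA_TARGET(opts) __attribute__((target(opts)))
#else
# define LA_TARGET(opts)
#endif

/* Build this loop for LA64 v1.1 and tune it for the LA464 core
   (both values are assumptions, see the text above).  */
LA_TARGET("arch=la64v1.1,tune=la464")
long sum_array(const long *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Request the medium code model for this one function.  */
LA_TARGET("cmodel=medium")
long sum_first_two(const long *v)
{
  return sum_array(v, 2);
}
```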
lsx
no-lsx
lsx (no-lsx) indicates that LSX vector instruction generation is allowed (not allowed) when compiling the function. The behavior is the same as for the command-line options -mlsx and -mno-lsx.
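For instance, a single function can be allowed to use LSX even if the rest of the translation unit is built without it. The sketch below uses GCC's generic vector extension with the same 128-bit vector type as the lasx example; LA_TARGET and add_v4i32 are hypothetical names, and the operands are passed by pointer to keep the calling convention independent of the vector ABI.

```c
/* 128-bit vector of four ints, using GCC's generic vector extension.  */
typedef int v4i32 __attribute__((vector_size(16), aligned(16)));

/* LA_TARGET is a hypothetical helper, not part of GCC: it applies the
   target attribute only when compiling for LoongArch.  */
#if defined(__loongarch__)
# define LA_TARGET(opts) __attribute__((target(opts)))
#else
# define LA_TARGET(opts)
#endif

/* LSX instructions may be generated inside this function only.  */
LA_TARGET("lsx")
void add_v4i32(v4i32 *dst, const v4i32 *x, const v4i32 *y)
{
  *dst = *x + *y;   /* element-wise addition */
}
```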
lasx
no-lasx
lasx (no-lasx) indicates that LASX instruction generation is allowed (not allowed) when compiling the function. The behavior differs slightly from that of the command-line option -mno-lasx. Example:
test.c:

    typedef int v4i32 __attribute__ ((vector_size(16), aligned(16)));
    v4i32 a, b, c;

    #ifdef WITH_ATTR
    __attribute__ ((target("no-lasx"))) void
    #else
    void
    #endif
    test ()
    {
      c = a + b;
    }

    $ gcc test.c -S -o test.s -O2 -mlasx -DWITH_ATTR

Compiled as above, 128-bit vectorization is still possible. With the following command line, however, 128-bit vectorization cannot be performed:

    $ gcc test.c -S -o test.s -O2 -mlasx -mno-lasx
frecipe
no-frecipe
frecipe (no-frecipe) indicates that frecipe.{s/d} and frsqrte.{s/d} instruction generation is allowed (not allowed) when compiling the function. The behavior is the same as for the command-line options -mfrecipe and -mno-frecipe.
div32
no-div32
div32 determines whether div.w[u] and mod.w[u] instructions on 64-bit machines are evaluated based only on the lower 32 bits of the input registers. The behavior is the same as for the command-line options -mdiv32 and -mno-div32.
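A short sketch of where this matters: 32-bit division and remainder in a function compiled for a 64-bit target. LA_TARGET and the function names are hypothetical. The attribute only changes what the compiler may assume about the hardware, so it should be applied only when the function is known to run on a core that provides this behavior.

```c
/* LA_TARGET is a hypothetical helper, not part of GCC: it applies the
   target attribute only when compiling for LoongArch.  */
#if defined(__loongarch__)
# define LA_TARGET(opts) __attribute__((target(opts)))
#else
# define LA_TARGET(opts)
#endif

/* Signed 32-bit division; a candidate for div.w.  */
LA_TARGET("div32")
int div_i32(int a, int b)
{
  return a / b;
}

/* Unsigned 32-bit remainder; a candidate for mod.wu.  */
LA_TARGET("div32")
unsigned mod_u32(unsigned a, unsigned b)
{
  return a % b;
}
```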
lam-bh
no-lam-bh
lam-bh (no-lam-bh) indicates that am{swap/add}[_db].{b/h} instruction generation is allowed (not allowed) when compiling the function. The behavior is the same as for the command-line options -mlam-bh and -mno-lam-bh.
lamcas
no-lamcas
lamcas (no-lamcas) indicates that amcas[_db].{b/h/w/d} instruction generation is allowed (not allowed) when compiling the function. The behavior is the same as for the command-line options -mlamcas and -mno-lamcas.
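The lam-bh and lamcas attributes are most relevant to functions that perform sub-word atomic operations. The sketch below uses GCC's __atomic built-ins; whether the compiler actually selects amadd or amcas instructions for them depends on the GCC version and options, so treat this as an illustration of where the attributes apply rather than a guarantee of the generated code. LA_TARGET and the function names are hypothetical.

```c
/* LA_TARGET is a hypothetical helper, not part of GCC: it applies the
   target attribute only when compiling for LoongArch.  */
#if defined(__loongarch__)
# define LA_TARGET(opts) __attribute__((target(opts)))
#else
# define LA_TARGET(opts)
#endif

/* Byte-sized atomic add; returns the previous value.  */
LA_TARGET("lam-bh")
unsigned char fetch_add_u8(unsigned char *p, unsigned char v)
{
  return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* Halfword compare-and-swap; returns nonzero on success.  */
LA_TARGET("lamcas")
int cas_u16(unsigned short *p, unsigned short expected,
            unsigned short desired)
{
  return __atomic_compare_exchange_n(p, &expected, desired, 0,
                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```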
scq
no-scq
scq (no-scq) indicates that sc.q instruction generation is allowed (not allowed) when compiling the function. The behavior is the same as for the command-line options -mscq and -mno-scq.
ld-seq-sa
no-ld-seq-sa
ld-seq-sa indicates whether load-load barriers (dbar 0x700) are needed. The behavior is the same as for the command-line options -mld-seq-sa and -mno-ld-seq-sa.
Multiple target function attributes can be specified by separating them with a comma. For example:

    __attribute__((target("arch=la64v1.1,lasx")))
    int foo (int a)
    {
      return a + 5;
    }

is valid and compiles function foo for LA64V1.1 with lasx.
Inlining rules
Specifying target attributes on individual functions or performing link-time optimization across translation units compiled with different target options can affect function inlining rules:
In particular, a caller function can inline a callee function only if the architectural features available to the callee are a subset of the features available to the caller.
Note that when the callee function does not have the always_inline attribute, it will not be inlined if the code model of the caller function is different from the code model of the callee function.
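The subset rule can be pictured with two caller/callee pairs. This sketch assumes that enabling lasx also enables lsx, as with the command-line options, so a callee compiled with lsx uses a subset of the features of a caller compiled with lasx. LA_TARGET and all function names are hypothetical. Both calls are valid C either way; the rule only decides whether the compiler may inline the callee.

```c
/* LA_TARGET is a hypothetical helper, not part of GCC: it applies the
   target attribute only when compiling for LoongArch.  */
#if defined(__loongarch__)
# define LA_TARGET(opts) __attribute__((target(opts)))
#else
# define LA_TARGET(opts)
#endif

/* Callee features (lsx) are a subset of the caller's (lasx):
   inlining into caller_lasx is permitted.  */
LA_TARGET("lsx")
static inline int callee_lsx(int x)
{
  return x * 2;
}

LA_TARGET("lasx")
int caller_lasx(int x)
{
  return callee_lsx(x) + 1;
}

/* Callee features (lasx) exceed the caller's (lsx): the call stays
   an ordinary out-of-line call.  */
LA_TARGET("lasx")
static inline int callee_lasx(int x)
{
  return x * 2;
}

LA_TARGET("lsx")
int caller_lsx(int x)
{
  return callee_lasx(x) + 1;
}
```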
target_clones (string,...)
As with the target attribute, these options mirror the behavior of the corresponding command-line options.
string can take the following values:
- default
- strict-align
- arch=
- lsx
- lasx
- frecipe
- div32
- lam-bh
- lamcas
- scq
- ld-seq-sa
You can set the priority of attributes in target_clones (except default). For example:

    __attribute__((target_clones ("default","arch=la64v1.1","lsx;priority=1")))
    int foo (int a)
    {
      return a + 5;
    }

The default priorities, from lowest to highest, are:
- default
- arch=loongarch64
- strict-align
- frecipe = div32 = lam-bh = lamcas = scq = ld-seq-sa
- lsx
- arch=la64v1.0
- arch=la64v1.1
- lasx
Note that the option values on the gcc command line are not considered when calculating the priority.
If a priority is set for a feature in target_clones, then the priority of this feature will be higher than lasx.
For example:

    __attribute__((target_clones ("default","arch=la64v1.1","lsx;priority=1")))
    int foo (int a)
    {
      return a + 5;
    }

In this test case, the priority of lsx is higher than that of arch=la64v1.1.
If the same priority is explicitly set for two features, the priority is still calculated according to the priority list above.
For example:

    __attribute__((target_clones ("default","arch=la64v1.1;priority=1","lsx;priority=1")))
    int foo (int a)
    {
      return a + 5;
    }

In this test case, the priority of arch=la64v1.1;priority=1 is higher than that of lsx;priority=1.
target_version (string)
The supported attributes and priorities are the same as for target_clones. Note that this attribute requires glibc 2.38 or newer with HWCAP support.
For example:
test1.C:

    __attribute__((target_clones ("default","arch=la64v1.1","lsx;priority=1")))
    int foo (int a)
    {
      return a + 5;
    }

test2.C:

    __attribute__((target_version ("default")))
    int foo (int a)
    {
      return a + 5;
    }

    __attribute__((target_version ("arch=la64v1.1")))
    int foo (int a)
    {
      return a + 5;
    }

    __attribute__((target_version ("lsx;priority=1")))
    int foo (int a)
    {
      return a + 5;
    }

The implementations of test1.C and test2.C are equivalent.
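From the caller's side, a multiversioned function is called like any other function; the choice among the versions is expected to be made at run time rather than at the call site. A minimal sketch in C, reusing the target_clones declaration from test1.C: the __loongarch__ guard and the call_foo_twice function are additions for illustration, so that the code also builds on other targets, where foo is an ordinary function.

```c
/* Multiversioned on LoongArch; a plain function elsewhere.  */
#if defined(__loongarch__)
__attribute__((target_clones("default", "arch=la64v1.1", "lsx;priority=1")))
#endif
int foo(int a)
{
  return a + 5;
}

/* Hypothetical caller: no attribute is needed at the call site.  */
int call_foo_twice(int a)
{
  return foo(foo(a));
}
```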