Code generation - The Rust Reference

The Rust Reference

Code generation attributes

The following attributes are used for controlling code generation.

Optimization hints

The cold and inline attributes give suggestions to generate code in a way that may be faster than what the compiler would generate without the hint. The attributes are only hints, and may be ignored.

Both attributes can be used on functions. When applied to a function in a trait, they apply only to that function when used as a default function for a trait implementation and not to all trait implementations. The attributes have no effect on a trait function without a body.

The inline attribute

The inline attribute suggests that a copy of the attributed function should be placed in the caller, rather than generating code to call the function where it is defined.

Note

The rustc compiler automatically inlines functions based on internal heuristics. Incorrectly inlining functions can make the program slower, so this attribute should be used with care.

There are three ways to use the inline attribute:

#[inline] suggests performing an inline expansion.
#[inline(always)] suggests that an inline expansion should always be performed.
#[inline(never)] suggests that an inline expansion should never be performed.

Note

#[inline] in every form is a hint, with no requirements on the language to place a copy of the attributed function in the caller.
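
As an illustration, the three forms can be written as follows. This is a minimal sketch: the function names (`add_one`, `square`, `describe`, `demo`) are hypothetical, and whether any particular call is actually inlined remains up to the compiler.

```rust
/// Small helper; `#[inline]` suggests expanding it at each call site.
#[inline]
pub fn add_one(x: u32) -> u32 {
    x.wrapping_add(1)
}

/// `#[inline(always)]` is a stronger suggestion, but still only a hint.
#[inline(always)]
pub fn square(x: u32) -> u32 {
    x.wrapping_mul(x)
}

/// `#[inline(never)]` suggests keeping this function as an out-of-line call.
#[inline(never)]
pub fn describe(x: u32) -> String {
    format!("value = {x}")
}

/// Uses all three helpers. The result is the same whether or not the
/// compiler follows any of the hints.
pub fn demo(x: u32) -> String {
    describe(square(add_one(x)))
}
```

The attributes change only how the code may be generated, never what it computes, so `demo(2)` evaluates to `"value = 9"` regardless of the inlining decisions.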

The cold attribute

The cold attribute suggests that the attributed function is unlikely to be called.
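
A typical use is to mark a rarely taken failure path. The sketch below assumes a hypothetical `checked_sum` function whose overflow case is expected to be uncommon; the names are illustrative only.

```rust
/// Hypothetical failure path, marked `#[cold]` because overflow is
/// expected to be rare in this sketch.
#[cold]
fn overflow_error(a: u32, b: u32) -> String {
    format!("{a} + {b} overflows u32")
}

/// Adds two numbers, reporting overflow through the cold helper.
pub fn checked_sum(a: u32, b: u32) -> Result<u32, String> {
    match a.checked_add(b) {
        Some(total) => Ok(total),
        None => Err(overflow_error(a, b)),
    }
}
```

As with `inline`, the attribute is only a hint and does not change the observable result of either branch.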

The naked attribute

The naked attribute prevents the compiler from emitting a function prologue and epilogue for the attributed function.

The function body must consist of exactly one naked_asm! macro invocation.

No function prologue or epilogue is generated for the attributed function. The assembly code in the naked_asm! block constitutes the full body of a naked function.

The naked attribute is an unsafe attribute. Annotating a function with #[unsafe(naked)] comes with the safety obligation that the body must respect the function’s calling convention, uphold its signature, and either return or diverge (i.e., not fall through past the end of the assembly code).

The assembly code may assume that the call stack and register state are valid on entry as per the signature and calling convention of the function.

The assembly code may not be duplicated by the compiler except when monomorphizing polymorphic functions.

Note

Guaranteeing when the assembly code may or may not be duplicated is important for naked functions that define symbols.

The unused_variables lint is suppressed within naked functions.

The inline attribute cannot be applied to a naked function.

The track_caller attribute cannot be applied to a naked function.

The testing attributes cannot be applied to a naked function.

The no_builtins attribute

The no_builtins attribute may be applied at the crate level to disable optimizing certain code patterns to invocations of library functions that are assumed to exist.

The target_feature attribute

The target_feature attribute may be applied to a function to enable code generation of that function for specific platform architecture features. It uses the MetaListNameValueStr syntax with a single key of enable whose value is a string of comma-separated feature names to enable.

#![allow(unused)]
fn main() {
#[cfg(target_feature = "avx2")]
#[target_feature(enable = "avx2")]
fn foo_avx2() {}
}

Each target architecture has a set of features that may be enabled. It is an error to specify a feature for a target architecture that the crate is not being compiled for.

Closures defined within a target_feature-annotated function inherit the attribute from the enclosing function.

It is undefined behavior to call a function that is compiled with a feature that is not supported on the platform the code is running on, except if the platform explicitly documents this to be safe.

The following restrictions apply unless otherwise specified by the platform rules below:

Safe #[target_feature] functions (and closures that inherit the attribute) can only be safely called within a caller that enables all the target_features that the callee enables. This restriction does not apply in an unsafe context.
Safe #[target_feature] functions (and closures that inherit the attribute) can only be coerced to safe function pointers in contexts that enable all the target_features that the coercee enables. This restriction does not apply to unsafe function pointers.

Implicitly enabled features are included in this rule. For example, an sse2 function can call functions marked with sse.

#![allow(unused)]
fn main() {
#[cfg(target_feature = "sse2")] {
#[target_feature(enable = "sse")]
fn foo_sse() {}

fn bar() {
    // Calling `foo_sse` here is unsafe, as we must ensure that SSE is
    // available first, even if `sse` is enabled by default on the target
    // platform or manually enabled as compiler flags.
    unsafe {
        foo_sse();
    }
}

#[target_feature(enable = "sse")]
fn bar_sse() {
    // Calling `foo_sse` here is safe.
    foo_sse();
    || foo_sse();
}

#[target_feature(enable = "sse2")]
fn bar_sse2() {
    // Calling `foo_sse` here is safe because `sse2` implies `sse`.
    foo_sse();
}
}
}

A function with a #[target_feature] attribute never implements the Fn family of traits, although closures inheriting features from the enclosing function do.

The #[target_feature] attribute is not allowed in the following places:

the main function
a panic_handler function
safe trait methods
safe default functions in traits

Functions marked with target_feature are not inlined into a context that does not support the given features. The #[inline(always)] attribute may not be used with a target_feature attribute.

Available features

The following is a list of the available feature names.

x86 or x86_64

Executing code with unsupported features is undefined behavior on this platform. Hence on this platform usage of #[target_feature] functions follows the above restrictions.

Feature	Implicitly Enables	Description
adx	ADX — Multi-Precision Add-Carry Instruction Extensions
aes	sse2	AES — Advanced Encryption Standard
avx	sse4.2	AVX — Advanced Vector Extensions
avx2	avx	AVX2 — Advanced Vector Extensions 2
avx512bf16	avx512bw	AVX512-BF16 — Advanced Vector Extensions 512-bit - Bfloat16 Extensions
avx512bitalg	avx512bw	AVX512-BITALG — Advanced Vector Extensions 512-bit - Bit Algorithms
avx512bw	avx512f	AVX512-BW — Advanced Vector Extensions 512-bit - Byte and Word Instructions
avx512cd	avx512f	AVX512-CD — Advanced Vector Extensions 512-bit - Conflict Detection Instructions
avx512dq	avx512f	AVX512-DQ — Advanced Vector Extensions 512-bit - Doubleword and Quadword Instructions
avx512f	avx2, fma, f16c	AVX512-F — Advanced Vector Extensions 512-bit - Foundation
avx512fp16	avx512bw	AVX512-FP16 — Advanced Vector Extensions 512-bit - Float16 Extensions
avx512ifma	avx512f	AVX512-IFMA — Advanced Vector Extensions 512-bit - Integer Fused Multiply Add
avx512vbmi	avx512bw	AVX512-VBMI — Advanced Vector Extensions 512-bit - Vector Byte Manipulation Instructions
avx512vbmi2	avx512bw	AVX512-VBMI2 — Advanced Vector Extensions 512-bit - Vector Byte Manipulation Instructions 2
avx512vl	avx512f	AVX512-VL — Advanced Vector Extensions 512-bit - Vector Length Extensions
avx512vnni	avx512f	AVX512-VNNI — Advanced Vector Extensions 512-bit - Vector Neural Network Instructions
avx512vp2intersect	avx512f	AVX512-VP2INTERSECT — Advanced Vector Extensions 512-bit - Vector Pair Intersection to a Pair of Mask Registers
avx512vpopcntdq	avx512f	AVX512-VPOPCNTDQ — Advanced Vector Extensions 512-bit - Vector Population Count Instruction
avxifma	avx2	AVX-IFMA — Advanced Vector Extensions - Integer Fused Multiply Add
avxneconvert	avx2	AVX-NE-CONVERT — Advanced Vector Extensions - No-Exception Floating-Point conversion Instructions
avxvnni	avx2	AVX-VNNI — Advanced Vector Extensions - Vector Neural Network Instructions
avxvnniint16	avx2	AVX-VNNI-INT16 — Advanced Vector Extensions - Vector Neural Network Instructions with 16-bit Integers
avxvnniint8	avx2	AVX-VNNI-INT8 — Advanced Vector Extensions - Vector Neural Network Instructions with 8-bit Integers
bmi1	BMI1 — Bit Manipulation Instruction Sets
bmi2	BMI2 — Bit Manipulation Instruction Sets 2
cmpxchg16b	cmpxchg16b — Compares and exchange 16 bytes (128 bits) of data atomically
f16c	avx	F16C — 16-bit floating point conversion instructions
fma	avx	FMA3 — Three-operand fused multiply-add
fxsr	fxsave and fxrstor — Save and restore x87 FPU, MMX Technology, and SSE State
gfni	sse2	GFNI — Galois Field New Instructions
kl	sse2	KEYLOCKER — Intel Key Locker Instructions
lzcnt	lzcnt — Leading zeros count
movbe	movbe — Move data after swapping bytes
pclmulqdq	sse2	pclmulqdq — Packed carry-less multiplication quadword
popcnt	popcnt — Count of bits set to 1
rdrand	rdrand — Read random number
rdseed	rdseed — Read random seed
sha	sse2	SHA — Secure Hash Algorithm
sha512	avx2	SHA512 — Secure Hash Algorithm with 512-bit digest
sm3	avx	SM3 — ShangMi 3 Hash Algorithm
sm4	avx2	SM4 — ShangMi 4 Cipher Algorithm
sse	SSE — Streaming SIMD Extensions
sse2	sse	SSE2 — Streaming SIMD Extensions 2
sse3	sse2	SSE3 — Streaming SIMD Extensions 3
sse4.1	ssse3	SSE4.1 — Streaming SIMD Extensions 4.1
sse4.2	sse4.1	SSE4.2 — Streaming SIMD Extensions 4.2
ssse3	sse3	SSSE3 — Supplemental Streaming SIMD Extensions 3
vaes	avx2, aes	VAES — Vector AES Instructions
vpclmulqdq	avx, pclmulqdq	VPCLMULQDQ — Vector Carry-less multiplication of Quadwords
widekl	kl	KEYLOCKER_WIDE — Intel Wide Keylocker Instructions
xsave	xsave — Save processor extended states
xsavec	xsavec — Save processor extended states with compaction
xsaveopt	xsaveopt — Save processor extended states optimized
xsaves	xsaves — Save processor extended states supervisor

aarch64

On this platform the usage of #[target_feature] functions follows the above restrictions.

Further documentation on these features can be found in the ARM Architecture Reference Manual, or elsewhere on developer.arm.com.

Note

The following pairs of features should both be marked as enabled or disabled together if used:

paca and pacg, which LLVM currently implements as one feature.

Feature	Implicitly Enables	Feature Name
aes	neon	FEAT_AES & FEAT_PMULL — Advanced SIMD AES & PMULL instructions
bf16	FEAT_BF16 — BFloat16 instructions
bti	FEAT_BTI — Branch Target Identification
crc	FEAT_CRC — CRC32 checksum instructions
dit	FEAT_DIT — Data Independent Timing instructions
dotprod	neon	FEAT_DotProd — Advanced SIMD Int8 dot product instructions
dpb	FEAT_DPB — Data cache clean to point of persistence
dpb2	dpb	FEAT_DPB2 — Data cache clean to point of deep persistence
f32mm	sve	FEAT_F32MM — SVE single-precision FP matrix multiply instruction
f64mm	sve	FEAT_F64MM — SVE double-precision FP matrix multiply instruction
fcma	neon	FEAT_FCMA — Floating point complex number support
fhm	fp16	FEAT_FHM — Half-precision FP FMLAL instructions
flagm	FEAT_FLAGM — Conditional flag manipulation
fp16	neon	FEAT_FP16 — Half-precision FP data processing
frintts	FEAT_FRINTTS — Floating-point to int helper instructions
i8mm	FEAT_I8MM — Int8 Matrix Multiplication
jsconv	neon	FEAT_JSCVT — JavaScript conversion instruction
lor	FEAT_LOR — Limited Ordering Regions extension
lse	FEAT_LSE — Large System Extensions
mte	FEAT_MTE & FEAT_MTE2 — Memory Tagging Extension
neon	FEAT_AdvSimd & FEAT_FP — Floating Point and Advanced SIMD extension
paca	FEAT_PAUTH — Pointer Authentication (address authentication)
pacg	FEAT_PAUTH — Pointer Authentication (generic authentication)
pan	FEAT_PAN — Privileged Access-Never extension
pmuv3	FEAT_PMUv3 — Performance Monitors extension (v3)
rand	FEAT_RNG — Random Number Generator
ras	FEAT_RAS & FEAT_RASv1p1 — Reliability, Availability and Serviceability extension
rcpc	FEAT_LRCPC — Release consistent Processor Consistent
rcpc2	rcpc	FEAT_LRCPC2 — RcPc with immediate offsets
rdm	neon	FEAT_RDM — Rounding Double Multiply accumulate
sb	FEAT_SB — Speculation Barrier
sha2	neon	FEAT_SHA1 & FEAT_SHA256 — Advanced SIMD SHA instructions
sha3	sha2	FEAT_SHA512 & FEAT_SHA3 — Advanced SIMD SHA instructions
sm4	neon	FEAT_SM3 & FEAT_SM4 — Advanced SIMD SM3/4 instructions
spe	FEAT_SPE — Statistical Profiling Extension
ssbs	FEAT_SSBS & FEAT_SSBS2 — Speculative Store Bypass Safe
sve	neon	FEAT_SVE — Scalable Vector Extension
sve2	sve	FEAT_SVE2 — Scalable Vector Extension 2
sve2-aes	sve2, aes	FEAT_SVE_AES & FEAT_SVE_PMULL128 — SVE AES instructions
sve2-bitperm	sve2	FEAT_SVE2_BitPerm — SVE Bit Permute
sve2-sha3	sve2, sha3	FEAT_SVE2_SHA3 — SVE SHA3 instructions
sve2-sm4	sve2, sm4	FEAT_SVE2_SM4 — SVE SM4 instructions
tme	FEAT_TME — Transactional Memory Extension
vh	FEAT_VHE — Virtualization Host Extensions

loongarch

On this platform the usage of #[target_feature] functions follows the above restrictions.

Feature	Implicitly Enables	Description
f	F — Single-precision float-point instructions
d	f	D — Double-precision float-point instructions
frecipe	FRECIPE — Reciprocal approximation instructions
lasx	lsx	LASX — 256-bit vector instructions
lbt	LBT — Binary translation instructions
lsx	d	LSX — 128-bit vector instructions
lvz	LVZ — Virtualization instructions

riscv32 or riscv64

On this platform the usage of #[target_feature] functions follows the above restrictions.

Further documentation on these features can be found in their respective specification. Many specifications are described in the RISC-V ISA Manual or in another manual hosted on the RISC-V GitHub Account.

Feature	Implicitly Enables	Description
a	A — Atomic instructions
c	C — Compressed instructions
m	M — Integer Multiplication and Division instructions
zb	zba, zbc, zbs	Zb — Bit Manipulation instructions
zba	Zba — Address Generation instructions
zbb	Zbb — Basic bit-manipulation
zbc	Zbc — Carry-less multiplication
zbkb	Zbkb — Bit Manipulation Instructions for Cryptography
zbkc	Zbkc — Carry-less multiplication for Cryptography
zbkx	Zbkx — Crossbar permutations
zbs	Zbs — Single-bit instructions
zk	zkn, zkr, zks, zkt, zbkb, zbkc, zbkx	Zk — Scalar Cryptography
zkn	zknd, zkne, zknh, zbkb, zbkc, zbkx	Zkn — NIST Algorithm suite extension
zknd	Zknd — NIST Suite: AES Decryption
zkne	Zkne — NIST Suite: AES Encryption
zknh	Zknh — NIST Suite: Hash Function Instructions
zkr	Zkr — Entropy Source Extension
zks	zksed, zksh, zbkb, zbkc, zbkx	Zks — ShangMi Algorithm Suite
zksed	Zksed — ShangMi Suite: SM4 Block Cipher Instructions
zksh	Zksh — ShangMi Suite: SM3 Hash Function Instructions
zkt	Zkt — Data Independent Execution Latency Subset

wasm32 or wasm64

Safe #[target_feature] functions may always be used in safe contexts on Wasm platforms. It is impossible to cause undefined behavior via the #[target_feature] attribute because attempting to use instructions unsupported by the Wasm engine will fail at load time without the risk of being interpreted in a way different from what the compiler expected.

Additional information

See the target_feature conditional compilation option for selectively enabling or disabling compilation of code based on compile-time settings. Note that this option is not affected by the target_feature attribute, and is only driven by the features enabled for the entire crate.
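
A minimal sketch of compile-time selection follows. The function name `selected_backend` and its return strings are hypothetical. Which definition is compiled depends only on the features enabled for the whole crate (for example through the -C target-feature flag), not on any #[target_feature] attribute elsewhere in the code.

```rust
/// Compiled only when AVX2 is enabled for the entire crate.
#[cfg(target_feature = "avx2")]
pub fn selected_backend() -> &'static str {
    "avx2"
}

/// Compiled in every other configuration.
#[cfg(not(target_feature = "avx2"))]
pub fn selected_backend() -> &'static str {
    "portable"
}
```

Exactly one of the two definitions exists in any given build, so the choice costs nothing at run time.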

See the is_x86_feature_detected or is_aarch64_feature_detected macros in the standard library for runtime feature detection on these platforms.
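
One common way to combine run-time detection with the target_feature attribute is sketched below. The function names (`sum`, `sum_avx2`, `sum_portable`) are hypothetical, and the AVX2 variant is declared `unsafe` so that the detection check is the caller's explicit obligation.

```rust
/// Hypothetical kernel compiled with AVX2 enabled. It is `unsafe` to call
/// because the caller must first confirm that the CPU supports AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(values: &[u32]) -> u32 {
    values.iter().fold(0, |acc, &v| acc.wrapping_add(v))
}

/// Fallback that works on every CPU and every architecture.
fn sum_portable(values: &[u32]) -> u32 {
    values.iter().fold(0, |acc, &v| acc.wrapping_add(v))
}

/// Dispatches to the AVX2 variant only after a run-time check succeeds.
pub fn sum(values: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed on the line above.
            return unsafe { sum_avx2(values) };
        }
    }
    sum_portable(values)
}
```

Both variants compute the same wrapping sum, so `sum(&[1, 2, 3, 4])` is `10` on any machine; only the generated instructions may differ.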

Note

rustc has a default set of features enabled for each target and CPU. The CPU may be chosen with the -C target-cpu flag. Individual features may be enabled or disabled for an entire crate with the -C target-feature flag.

The track_caller attribute

The track_caller attribute may be applied to any function with "Rust" ABI with the exception of the entry point fn main.

When applied to functions and methods in trait declarations, the attribute applies to all implementations. If the trait provides a default implementation with the attribute, then the attribute also applies to override implementations.
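
The following sketch illustrates the trait case. The trait `Probe`, the type `Sensor`, and the helper `probe_line_matches` are hypothetical names. The implementation carries no attribute of its own, yet with rustc it is expected to observe its caller because the trait declaration is attributed.

```rust
use std::panic::Location;

trait Probe {
    // The attribute on the declaration applies to every implementation.
    #[track_caller]
    fn call_line(&self) -> u32;
}

struct Sensor;

impl Probe for Sensor {
    // No attribute here; it is taken from the trait declaration.
    fn call_line(&self) -> u32 {
        Location::caller().line()
    }
}

/// Compares the reported line with the line of the call itself.
/// Both values are produced on the same source line.
fn probe_line_matches() -> bool {
    let (reported, here) = (Sensor.call_line(), line!());
    reported == here
}
```

Note that this uses static dispatch; calls through a trait object are subject to the caveats described under Limitations.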

When applied to a function in an extern block the attribute must also be applied to any linked implementations, otherwise undefined behavior results. When applied to a function which is made available to an extern block, the declaration in the extern block must also have the attribute, otherwise undefined behavior results.

Behavior

Applying the attribute to a function f allows code within f to get a hint of the Location of the “topmost” tracked call that led to f’s invocation. At the point of observation, an implementation behaves as if it walks up the stack from f’s frame to find the nearest frame of an unattributed function outer, and it returns the Location of the tracked call in outer.

#![allow(unused)]
fn main() {
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
}

Note

Because the resulting Location is a hint, an implementation may halt its walk up the stack early. See Limitations for important caveats.

Examples

When f is called directly by calls_f, code in f observes its callsite within calls_f:

#![allow(unused)]
fn main() {
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
fn calls_f() {
    f(); // <-- f() prints this location
}
}

When f is called by another attributed function g which is in turn called by calls_g, code in both f and g observes g’s callsite within calls_g:

#![allow(unused)]
fn main() {
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
#[track_caller]
fn g() {
    println!("{}", std::panic::Location::caller());
    f();
}

fn calls_g() {
    g(); // <-- g() prints this location twice, once itself and once from f()
}
}

When g is called by another attributed function h which is in turn called by calls_h, all code in f, g, and h observes h’s callsite within calls_h:

#![allow(unused)]
fn main() {
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
#[track_caller]
fn g() {
    println!("{}", std::panic::Location::caller());
    f();
}
#[track_caller]
fn h() {
    println!("{}", std::panic::Location::caller());
    g();
}

fn calls_h() {
    h(); // <-- prints this location three times, once itself, once from g(), once from f()
}
}

And so on.
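
The same behavior can be checked programmatically rather than by reading printed output. In this sketch the helper names (`observed`, `observed_via_wrapper`, `locations_match`) are hypothetical; each call is placed on the same source line as a `line!()` invocation so the two values can be compared.

```rust
use std::panic::Location;

/// Returns the location observed by a tracked function.
#[track_caller]
fn observed() -> &'static Location<'static> {
    Location::caller()
}

/// A tracked wrapper; `observed` sees through it to the wrapper's caller.
#[track_caller]
fn observed_via_wrapper() -> &'static Location<'static> {
    observed()
}

/// Returns true when both helpers report the line of their call site
/// inside this unattributed function.
fn locations_match() -> bool {
    let (direct, direct_line) = (observed(), line!());
    let (wrapped, wrapped_line) = (observed_via_wrapper(), line!());
    direct.line() == direct_line && wrapped.line() == wrapped_line
}
```

Because the location is specified as a hint, a check like this describes what rustc is expected to do rather than a language guarantee.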

Limitations

This information is a hint and implementations are not required to preserve it.

In particular, coercing a function with #[track_caller] to a function pointer creates a shim which appears to observers to have been called at the attributed function’s definition site, losing actual caller information across virtual calls. A common example of this coercion is the creation of a trait object whose methods are attributed.

Note

The aforementioned shim for function pointers is necessary because rustc implements track_caller in a codegen context by appending an implicit parameter to the function ABI, but this would be unsound for an indirect call because the parameter is not a part of the function’s type and a given function pointer type may or may not refer to a function with the attribute. The creation of a shim hides the implicit parameter from callers of the function pointer, preserving soundness.
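
The effect of the shim can be observed with a small experiment. This is a sketch of rustc's current behavior, not a guarantee; the function names are hypothetical.

```rust
use std::panic::Location;

/// Reports the line number of the tracked call.
#[track_caller]
fn caller_line() -> u32 {
    Location::caller().line()
}

/// A direct call observes its own call site.
fn direct_call_matches() -> bool {
    let (reported, here) = (caller_line(), line!());
    reported == here
}

/// A call through a function pointer goes through the shim, so the
/// reported line is not the line of the indirect call.
fn pointer_call_matches() -> bool {
    let ptr: fn() -> u32 = caller_line;
    let (reported, here) = (ptr(), line!());
    reported == here
}
```

With rustc, `direct_call_matches()` is expected to return `true` and `pointer_call_matches()` to return `false`, since the indirect call reports the definition site of `caller_line` instead.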

The instruction_set attribute

The instruction_set attribute may be applied to a function to control which instruction set the function will be generated for.

This allows mixing more than one instruction set in a single program on CPU architectures that support it.

It uses the MetaListPath syntax, and a path consisting of the architecture family name and instruction set name.

It is a compilation error to use the instruction_set attribute on a target that does not support it.

On ARM

For the ARMv4T and ARMv5te architectures, the following are supported:

arm::a32 — Generate the function as A32 “ARM” code.
arm::t32 — Generate the function as T32 “Thumb” code.

#[instruction_set(arm::a32)]
fn foo_arm_code() {}

#[instruction_set(arm::t32)]
fn bar_thumb_code() {}

Using the instruction_set attribute has the following effects:

If the address of the function is taken as a function pointer, the low bit of the address will be set to 0 (arm) or 1 (thumb) depending on the instruction set.
Any inline assembly in the function must use the specified instruction set instead of the target default.
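
Because using the attribute on an unsupported target is a compilation error, portable code may want to apply it conditionally. The sketch below uses cfg_attr with a hypothetical custom cfg flag named `arm_interworking`, assumed to be set by the build (for example from a build script) only for targets that support the attribute. The function `byte_sum` is likewise illustrative.

```rust
/// Requests A32 code generation only when the hypothetical
/// `arm_interworking` cfg flag is set; on all other builds the
/// attribute is not applied and the function compiles normally.
#[cfg_attr(arm_interworking, instruction_set(arm::a32))]
pub fn byte_sum(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}
```

The function computes the same result either way; the attribute only selects the instruction set used for its machine code.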