std::ptr - Rust

Manually manage memory through raw pointers.

See also the pointer primitive types.

§Safety

Many functions in this module take raw pointers as arguments and read from or write to them. For this to be safe, these pointers must be valid for the given access. Whether a pointer is valid depends on the operation it is used for (read or write), and the extent of the memory that is accessed (i.e., how many bytes are read/written) – it makes no sense to ask “is this pointer valid”; one has to ask “is this pointer valid for a given access”. Most functions use *mut T and *const T to access only a single value, in which case the documentation omits the size and implicitly assumes it to be size_of::<T>() bytes.
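As a rough illustration of "valid for a given access", the sketch below reads a single `u32` (an access of `size_of::<u32>()` bytes) and then copies a prefix of an array (an access of `count * size_of::<u32>()` bytes). The helper names `read_one` and `copy_prefix` are made up for this example and are not part of the module.

```rust
use std::ptr;

/// Reads one `u32` through a raw pointer.
/// The access covers exactly `size_of::<u32>()` bytes.
fn read_one(values: &[u32; 4], index: usize) -> u32 {
    assert!(index < values.len());
    let base: *const u32 = values.as_ptr();
    // SAFETY: `index` is in bounds, so `base.add(index)` stays inside the
    // array and is valid for an aligned read of one initialized `u32`.
    unsafe { ptr::read(base.add(index)) }
}

/// Copies the first `count` elements of `src` into a fresh array.
/// Both pointers must be valid for `count * size_of::<u32>()` bytes.
fn copy_prefix(src: &[u32; 4], count: usize) -> [u32; 4] {
    assert!(count <= src.len());
    let mut dst = [0u32; 4];
    // SAFETY: `count <= 4`, so both regions lie inside their arrays, and
    // `src` and `dst` are distinct objects, so they do not overlap.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), count) };
    dst
}

fn main() {
    let values = [10, 20, 30, 40];
    assert_eq!(read_one(&values, 2), 30);
    assert_eq!(copy_prefix(&values, 2), [10, 20, 0, 0]);
}
```

The same pointer `values.as_ptr()` is valid for a 16-byte read here but would not be valid for a 20-byte one, which is why the bounds checks come before the unsafe blocks.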

The precise rules for validity are not determined yet. The guarantees that are provided at this point are very minimal:

These axioms, along with careful use of offset for pointer arithmetic, are enough to correctly implement many useful things in unsafe code. Stronger guarantees will be provided eventually, as the aliasing rules are being determined. For more information, see the book as well as the section in the reference devoted to undefined behavior.

We say that a pointer is “dangling” if it is not valid for any non-zero-sized accesses. This means out-of-bounds pointers, pointers to freed memory, null pointers, and pointers created with NonNull::dangling are all dangling.

§Alignment

Valid raw pointers as defined above are not necessarily properly aligned (where “proper” alignment is defined by the pointee type, i.e., *const T must be aligned to mem::align_of::<T>()). However, most functions require their arguments to be properly aligned, and will explicitly state this requirement in their documentation. Notable exceptions to this are read_unaligned and write_unaligned.

When a function requires proper alignment, it does so even if the access has size 0, i.e., even if memory is not actually touched. Consider using NonNull::dangling in such cases.
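The following sketch shows both points under simple assumptions: `read_unaligned` reading a `u32` from an arbitrary byte offset, and `NonNull::dangling` supplying an aligned, non-null pointer for a zero-length slice. The function names `read_u32_at` and `empty_slice` are illustrative only.

```rust
use std::ptr::{self, NonNull};
use std::slice;

/// Reads a native-endian `u32` starting at `offset`, which need not be a
/// multiple of `align_of::<u32>()`.
fn read_u32_at(bytes: &[u8], offset: usize) -> u32 {
    let end = offset.checked_add(4).expect("offset overflow");
    assert!(end <= bytes.len());
    // SAFETY: the 4 bytes at `offset..end` are in bounds and initialized;
    // `read_unaligned` does not require the pointer to be aligned.
    unsafe { ptr::read_unaligned(bytes.as_ptr().add(offset).cast::<u32>()) }
}

/// Builds an empty slice without touching memory. The pointer must still be
/// non-null and aligned, which `NonNull::dangling` provides.
fn empty_slice() -> &'static [u64] {
    let p = NonNull::<u64>::dangling().as_ptr();
    // SAFETY: a length of 0 means no memory is accessed; `p` is aligned and
    // non-null as `from_raw_parts` requires.
    unsafe { slice::from_raw_parts(p, 0) }
}

fn main() {
    let bytes = [0u8, 1, 2, 3, 4];
    // Offset 1 is misaligned for `u32` on typical targets.
    assert_eq!(read_u32_at(&bytes, 1), u32::from_ne_bytes([1, 2, 3, 4]));
    assert!(empty_slice().is_empty());
}
```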

§Pointer to reference conversion

When converting a pointer to a reference (e.g. via &*ptr or &mut *ptr), there are several rules that must be followed:

If a pointer follows all of these rules, it is said to be convertible to a (mutable or shared) reference.

These rules apply even if the result is unused! (The part about being initialized is not yet fully decided, but until it is, the only safe approach is to ensure that they are indeed initialized.)

An example of the implications of the above rules is that an expression such as unsafe { &*(0 as *const u8) } is Immediate Undefined Behavior.
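One way to avoid the null case is the `as_ref` method on raw pointers, which returns `None` for a null pointer instead of producing a reference. The sketch below is an assumption-laden example (the helper `value_or` is hypothetical); the caller remains responsible for every other conversion rule.

```rust
use std::ptr;

/// Returns the pointee, or `fallback` if `p` is null.
///
/// # Safety
/// If `p` is non-null, it must be convertible to a shared reference:
/// aligned, pointing to an initialized `u8`, and not mutated elsewhere
/// while this function runs.
unsafe fn value_or(p: *const u8, fallback: u8) -> u8 {
    // `as_ref` checks for null before creating the reference, so a null
    // pointer never becomes a `&u8`.
    match unsafe { p.as_ref() } {
        Some(r) => *r,
        None => fallback,
    }
}

fn main() {
    let x = 7u8;
    // SAFETY: `&x` is a live, aligned, initialized `u8`.
    assert_eq!(unsafe { value_or(&x, 0) }, 7);
    // SAFETY: null is explicitly handled by `value_or`.
    assert_eq!(unsafe { value_or(ptr::null(), 9) }, 9);
}
```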

§Allocated object

An allocated object is a subset of program memory which is addressable from Rust, and within which pointer arithmetic is possible. Examples of allocated objects include heap allocations, stack-allocated variables, statics, and consts. The safety preconditions of some Rust operations - such as offset and field projections (expr.field) - are defined in terms of the allocated objects on which they operate.

An allocated object has a base address, a size, and a set of memory addresses. It is possible for an allocated object to have zero size, but such an allocated object will still have a base address. The base address of an allocated object is not necessarily unique. While it is currently the case that an allocated object always has a set of memory addresses which is fully contiguous (i.e., has no “holes”), there is no guarantee that this will not change in the future.
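As a small sketch of pointer arithmetic that stays within one allocated object, the function below walks a slice from its start pointer to its one-past-the-end pointer. The name `sum_via_pointers` is invented for this example; the end pointer is only compared, never dereferenced.

```rust
/// Sums a slice by stepping a raw pointer across it.
fn sum_via_pointers(values: &[i32]) -> i32 {
    // `start` points at the first element, `end` one past the last.
    // For an empty slice the two are equal and the loop does not run.
    let range = values.as_ptr_range();
    let mut p = range.start;
    let mut total = 0;
    while p != range.end {
        // SAFETY: `p` lies in `start..end`, so it points to a live element,
        // and `p.add(1)` is at most one past the end of the same object.
        unsafe {
            total += *p;
            p = p.add(1);
        }
    }
    total
}

fn main() {
    assert_eq!(sum_via_pointers(&[1, 2, 3]), 6);
    assert_eq!(sum_via_pointers(&[]), 0);
}
```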

For any allocated object with base address, size, and a set of addresses, the following are guaranteed:

As a consequence of these guarantees, given any address a within the set of addresses of an allocated object:

§Provenance

Pointers are not simply an “integer” or “address”. For instance, it’s uncontroversial to say that a Use After Free is clearly Undefined Behavior, even if you “get lucky” and the freed memory gets reallocated before your read/write (in fact this is the worst-case scenario, UAFs would be much less concerning if this didn’t happen!). As another example, consider that wrapping_offset is documented to “remember” the allocated object that the original pointer points to, even if it is offset far outside the memory range occupied by that allocated object. To rationalize claims like this, pointers need to somehow be more than just their addresses: they must have provenance.
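The `wrapping_offset` behavior mentioned above can be sketched with `wrapping_add` and `wrapping_sub`, which are documented in the same way. In this illustrative example (the helper `out_and_back` is hypothetical), the intermediate pointer is far outside the array and is never dereferenced; only the pointer that has been moved back in bounds is read.

```rust
/// Moves a pointer far out of bounds and back, then reads through it.
fn out_and_back(values: &[u8; 4]) -> u8 {
    let p: *const u8 = values.as_ptr();
    // Far outside the array. Creating this pointer is fine; dereferencing
    // it would be undefined behavior.
    let far = p.wrapping_add(1_000);
    // Back inside: 1_000 - 999 = 1, so this is the address of `values[1]`.
    let back = far.wrapping_sub(999);
    // SAFETY: `back` is in bounds again and still carries the provenance of
    // the array that `p` was derived from.
    unsafe { *back }
}

fn main() {
    assert_eq!(out_and_back(&[5, 6, 7, 8]), 6);
}
```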

A pointer value in Rust semantically contains the following information:

The exact structure of provenance is not yet specified, but the permissions defined by a pointer’s provenance have a spatial component, a temporal component, and a mutability component:

When an allocated object is created, it has a unique Original Pointer. For alloc APIs this is literally the pointer the call returns, and for local variables and statics, this is the name of the variable/static. (This is mildly overloading the term “pointer” for the sake of brevity/exposition.)

The Original Pointer for an allocated object has provenance that constrains the spatial permissions of this pointer to the memory range of the allocation, and the temporal permissions to the lifetime of the allocation. Provenance is implicitly inherited by all pointers transitively derived from the Original Pointer through operations like offset, borrowing, and pointer casts. Some operations may shrink the permissions of the derived provenance, limiting how much memory it can access or how long it’s valid for (e.g. borrowing a subfield and subslicing can shrink the spatial component of provenance, and all borrowing can shrink the temporal component of provenance). However, no operation can ever grow the permissions of the derived provenance: even if you “know” there is a larger allocation, you can’t derive a pointer with a larger provenance. Similarly, you cannot “recombine” two contiguous provenances back into one (e.g. with a fn merge(&[T], &[T]) -> &[T]).

A reference to a place always has provenance over at least the memory that place occupies. A reference to a slice always has provenance over at least the range that slice describes. Whether and when exactly the provenance of a reference gets “shrunk” to exactly fit the memory it points to is not yet determined.

A shared reference only ever has provenance that permits reading from memory, and never permits writes, except inside UnsafeCell.
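The `UnsafeCell` exception is what makes interior mutability possible. The following is a minimal sketch, assuming a single-threaded counter type named `Counter` that exists only for this example:

```rust
use std::cell::UnsafeCell;

/// A counter that can be bumped through a shared reference.
struct Counter {
    hits: UnsafeCell<u32>,
}

impl Counter {
    fn new() -> Self {
        Counter { hits: UnsafeCell::new(0) }
    }

    /// Increments the counter and returns the new value.
    fn bump(&self) -> u32 {
        // `get` yields a `*mut u32` that is allowed to write, even though
        // it was obtained from `&self`.
        let p: *mut u32 = self.hits.get();
        // SAFETY: `Counter` is not `Sync` (it contains an `UnsafeCell`), and
        // no reference to the inner value ever escapes, so nothing else can
        // access the value during this write.
        unsafe {
            *p += 1;
            *p
        }
    }
}

fn main() {
    let c = Counter::new();
    assert_eq!(c.bump(), 1);
    assert_eq!(c.bump(), 2);
}
```

Without the `UnsafeCell` wrapper, writing through a pointer derived from `&self` would be undefined behavior.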

Provenance can affect whether a program has undefined behavior:

But it is still sound to:

Note that the full definition of provenance in Rust is not decided yet, as this interacts with the as-yet undecided aliasing rules.

§Pointers Vs Integers

From this discussion, it becomes very clear that a usize cannot accurately represent a pointer, and converting from a pointer to a usize is generally an operation which only extracts the address. Converting this address back into pointer requires somehow answering the question: which provenance should the resulting pointer have?

Rust provides two ways of dealing with this situation: Strict Provenance and Exposed Provenance.

Note that a pointer can represent a usize (via without_provenance), so the right type to use in situations where a value is “sometimes a pointer and sometimes a bare usize” is a pointer type.
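A sketch of such a "sometimes a pointer, sometimes an integer" value is shown below. The encoding (low bit set means integer) and the helper names `from_int`, `from_value`, and `small_int` are assumptions made for illustration, and the code assumes a toolchain on which `without_provenance` and `addr` are stable.

```rust
use std::ptr;

/// Low bit set means "this slot holds a small integer, not a pointer".
const INT_TAG: usize = 0b1;

/// Stores an integer in a pointer-typed slot.
/// Assumes `n` fits in `usize::BITS - 1` bits, since one bit is the tag.
fn from_int(n: usize) -> *const u64 {
    // The result has no provenance and must never be dereferenced.
    ptr::without_provenance((n << 1) | INT_TAG)
}

/// Stores a real pointer in the slot. `u64` has alignment greater than 1,
/// so the low address bit of a valid `&u64` is always zero.
fn from_value(v: &u64) -> *const u64 {
    let p: *const u64 = v;
    debug_assert_eq!(p.addr() & INT_TAG, 0);
    p
}

/// Returns the stored integer, or `None` if the slot holds a real pointer.
fn small_int(slot: *const u64) -> Option<usize> {
    if slot.addr() & INT_TAG != 0 {
        Some(slot.addr() >> 1)
    } else {
        None
    }
}

fn main() {
    let v = 5u64;
    let a = from_int(21);
    let b = from_value(&v);
    assert_eq!(small_int(a), Some(21));
    assert_eq!(small_int(b), None);
    // SAFETY: `b` was derived from `&v`, which is still live.
    assert_eq!(unsafe { *b }, 5);
}
```

Because the slot is a pointer type throughout, the real pointer keeps its provenance and no integer-to-pointer cast is needed.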

§Strict Provenance

“Strict Provenance” refers to a set of APIs designed to make working with provenance more explicit. They are intended as substitutes for casting a pointer to an integer and back.

Entirely avoiding integer-to-pointer casts successfully side-steps the inherent ambiguity of that operation. This benefits compiler optimizations, and it is pretty much a requirement for using tools like Miri and architectures like CHERI that aim to detect and diagnose pointer misuse.

The key insight to making programming without integer-to-pointer casts at all viable is the with_addr method:

    /// Creates a new pointer with the given address.
    ///
    /// This performs the same operation as an `addr as ptr` cast, but copies
    /// the *provenance* of `self` to the new pointer.
    /// This allows us to dynamically preserve and propagate this important
    /// information in a way that is otherwise impossible with a unary cast.
    ///
    /// This is equivalent to using `wrapping_offset` to offset `self` to the
    /// given address, and therefore has all the same capabilities and restrictions.
    pub fn with_addr(self, addr: usize) -> Self;

So you’re still able to drop down to the address representation and do whatever clever bit tricks you want as long as you’re able to keep around a pointer into the allocation you care about that can “reconstitute” the provenance. Usually this is very easy, because you are only taking a pointer, messing with the address, and then immediately converting back to a pointer. To make this use case more ergonomic, we provide the map_addr method.
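To make the relationship between `with_addr` and `map_addr` concrete, here is a small sketch that rounds a byte pointer down to a 4-byte boundary in both ways. The function `second_word` is hypothetical, and the example assumes a target where `u32` has an alignment of 4 (it asserts this rather than relying on it silently).

```rust
use std::mem::align_of;

/// Finds the `u32` that contains byte 5 of `words` and reads it.
fn second_word(words: &[u32; 2]) -> u32 {
    assert_eq!(align_of::<u32>(), 4);
    let mask = !(align_of::<u32>() - 1);

    // A byte pointer to offset 5, derived from the whole array so that it
    // has provenance over all 8 bytes.
    let bytes = words.as_ptr().cast::<u8>();
    // SAFETY: offset 5 is inside the 8-byte array.
    let inside = unsafe { bytes.add(5) };

    // Round the address down; provenance is carried over from `inside`.
    let aligned = inside.map_addr(|addr| addr & mask);
    // The same operation spelled with `addr` and `with_addr`.
    let same = inside.with_addr(inside.addr() & mask);
    assert_eq!(aligned, same);

    // SAFETY: the array is 4-aligned, so rounding offset 5 down gives
    // offset 4, the start of `words[1]`, which is in bounds and aligned.
    unsafe { aligned.cast::<u32>().read() }
}

fn main() {
    assert_eq!(second_word(&[11, 22]), 22);
}
```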

To help make it clear that code is “following” Strict Provenance semantics, we also provide an addr method which promises that the returned address is not part of a pointer-integer-pointer roundtrip. In the future we may provide a lint for pointer<->integer casts to help you audit if your code conforms to strict provenance.

§Using Strict Provenance

Most code needs no changes to conform to strict provenance, as the only really concerning operation is casts from usize to a pointer. For code which does cast a usize to a pointer, the scope of the change depends on exactly what you’re doing.

In general, if you want to convert a usize address to a pointer and then use that pointer to read/write memory, you need to keep around a pointer that has sufficient provenance to perform that read/write itself. In this way all of your casts from an address to a pointer are essentially just applying offsets/indexing.

This is generally trivial to do for simple cases like tagged pointers as long as you represent the tagged pointer as an actual pointer and not a usize. For instance:

unsafe {
    // A flag we want to pack into our pointer
    static HAS_DATA: usize = 0x1;
    static FLAG_MASK: usize = !HAS_DATA;

    // Our value, which must have enough alignment to have spare least-significant-bits.
    let my_precious_data: u32 = 17;
    assert!(core::mem::align_of::<u32>() > 1);

    // Create a tagged pointer
    let ptr = &my_precious_data as *const u32;
    let tagged = ptr.map_addr(|addr| addr | HAS_DATA);

    // Check the flag:
    if tagged.addr() & HAS_DATA != 0 {
        // Untag and read the pointer
        let data = *tagged.map_addr(|addr| addr & FLAG_MASK);
        assert_eq!(data, 17);
    } else {
        unreachable!()
    }
}

(Yes, if you’ve been using AtomicUsize for pointers in concurrent data structures, you should be using AtomicPtr instead. If that messes up the way you atomically manipulate pointers, we would like to know why, and what needs to be done to fix it.)
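As one possible shape for this, the sketch below keeps a tagged pointer in an `AtomicPtr` and sets the tag with a compare-exchange loop plus `map_addr`, so the pointer never passes through a `usize`. The `TaggedSlot` type and its methods are hypothetical and only meant to show the idea; the example assumes `u32` has an alignment greater than 1.

```rust
use std::marker::PhantomData;
use std::mem::align_of;
use std::sync::atomic::{AtomicPtr, Ordering};

const FLAG: usize = 0x1;

/// An atomically updatable pointer to a borrowed `u32`, with a flag packed
/// into the low address bit.
struct TaggedSlot<'a> {
    ptr: AtomicPtr<u32>,
    _borrow: PhantomData<&'a u32>,
}

impl<'a> TaggedSlot<'a> {
    fn new(value: &'a u32) -> Self {
        assert!(align_of::<u32>() > 1);
        // Only reads ever go through this pointer, so deriving it from a
        // shared reference is fine.
        let p = value as *const u32 as *mut u32;
        TaggedSlot { ptr: AtomicPtr::new(p), _borrow: PhantomData }
    }

    /// Sets the flag bit, retrying if another thread changed the slot.
    fn set_flag(&self) {
        let mut cur = self.ptr.load(Ordering::Relaxed);
        loop {
            let tagged = cur.map_addr(|addr| addr | FLAG);
            match self.ptr.compare_exchange_weak(
                cur, tagged, Ordering::AcqRel, Ordering::Relaxed,
            ) {
                Ok(_) => return,
                Err(actual) => cur = actual,
            }
        }
    }

    fn has_flag(&self) -> bool {
        self.ptr.load(Ordering::Acquire).addr() & FLAG != 0
    }

    fn value(&self) -> u32 {
        let untagged = self.ptr.load(Ordering::Acquire).map_addr(|addr| addr & !FLAG);
        // SAFETY: `untagged` has the address and provenance of the `&'a u32`
        // given to `new`, and that borrow is still live.
        unsafe { *untagged }
    }
}

fn main() {
    let v = 17u32;
    let slot = TaggedSlot::new(&v);
    assert!(!slot.has_flag());
    slot.set_flag();
    assert!(slot.has_flag());
    assert_eq!(slot.value(), 17);
}
```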

Situations where a valid pointer must be created from just an address, such as baremetal code accessing a memory-mapped interface at a fixed address, cannot currently be handled with strict provenance APIs and should use exposed provenance.

§Exposed Provenance

As discussed above, integer-to-pointer casts are not possible with Strict Provenance APIs. This is by design: the goal of Strict Provenance is to provide a clear specification that we are confident can be formalized unambiguously and can be subject to precise formal reasoning. Integer-to-pointer casts do not (currently) have such a clear specification.

However, there exist situations where integer-to-pointer casts cannot be avoided, or where avoiding them would require major refactoring. Legacy platform APIs also regularly assume that usize can capture all the information that makes up a pointer. Bare-metal platforms can also require the synthesis of a pointer “out of thin air” without anywhere to obtain proper provenance from.

Rust’s model for dealing with integer-to-pointer casts is called Exposed Provenance. However, the semantics of Exposed Provenance are on much less solid footing than Strict Provenance, and at this point it is not yet clear whether a satisfying unambiguous semantics can be defined for Exposed Provenance. (If that sounds bad, be reassured that other popular languages that provide integer-to-pointer casts are not faring any better.) Furthermore, Exposed Provenance will not work (well) with tools like Miri and CHERI.

Exposed Provenance is provided by the expose_provenance and with_exposed_provenance methods, which are equivalent to as casts between pointers and integers.
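A minimal round trip through these two operations might look like the sketch below. The function `round_trip` is made up for this example, and the code assumes a toolchain on which `expose_provenance` and `with_exposed_provenance_mut` are stable.

```rust
use std::ptr;

/// Sends a pointer through a `usize` and back, then writes through it.
fn round_trip(value: &mut i32) -> i32 {
    let p: *mut i32 = value;
    // Extract the address and mark the provenance of `p` as exposed.
    let addr: usize = p.expose_provenance();
    // Rebuild a pointer from the bare address; it picks up the provenance
    // that was exposed above.
    let q: *mut i32 = ptr::with_exposed_provenance_mut(addr);
    // SAFETY: `q` has the address of `*value` and previously exposed
    // provenance for it, and `value` is not used again until `q` is done.
    unsafe {
        *q += 1;
        *q
    }
}

fn main() {
    let mut x = 41;
    assert_eq!(round_trip(&mut x), 42);
    assert_eq!(x, 42);
}
```

Where a pointer with suitable provenance is already at hand, `with_addr` or `map_addr` would achieve the same result without exposing anything.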

If at all possible, we encourage code to be ported to Strict Provenance APIs, thus avoiding the need for Exposed Provenance. Maximizing the amount of such code is a major win for avoiding specification complexity and to facilitate adoption of tools like CHERI and Miri that can be a big help in increasing the confidence in (unsafe) Rust code. However, we acknowledge that this is not always possible, and offer Exposed Provenance as a way to explicitly “opt out” of the well-defined semantics of Strict Provenance, and “opt in” to the unclear semantics of integer-to-pointer casts.

addr_of

Creates a const raw pointer to a place, without creating an intermediate reference.

addr_of_mut

Creates a mut raw pointer to a place, without creating an intermediate reference.

NonNull

*mut T but non-zero and covariant.

Alignment (Experimental)

A type storing a usize which is a power of two, and thus represents a possible alignment in the Rust abstract machine.

DynMetadata (Experimental)

The metadata for a Dyn = dyn SomeTrait trait object type.

Pointee (Experimental)

Provides the pointer metadata type of any pointed-to type.

addr_eq

Compares the addresses of the two pointers for equality, ignoring any metadata in fat pointers.

copy

Copies count * size_of::<T>() bytes from src to dst. The source and destination may overlap.

copy_nonoverlapping

Copies count * size_of::<T>() bytes from src to dst. The source and destination must not overlap.

dangling

Creates a new pointer that is dangling, but non-null and well-aligned.

dangling_mut

Creates a new pointer that is dangling, but non-null and well-aligned.

drop_in_place

Executes the destructor (if any) of the pointed-to value.

eq

Compares raw pointers for equality.

fn_addr_eq

Compares the addresses of the two function pointers for equality.

from_mut

Converts a mutable reference to a raw pointer.

from_ref

Converts a reference to a raw pointer.

hash

Hash a raw pointer.

null

Creates a null raw pointer.

null_mut

Creates a null mutable raw pointer.

read

Reads the value from src without moving it. This leaves the memory in src unchanged.

read_unaligned

Reads the value from src without moving it. This leaves the memory in src unchanged.

read_volatile

Performs a volatile read of the value from src without moving it. This leaves the memory in src unchanged.

replace

Moves src into the pointed dst, returning the previous dst value.

slice_from_raw_parts

Forms a raw slice from a pointer and a length.

slice_from_raw_parts_mut

Forms a raw mutable slice from a pointer and a length.

swap

Swaps the values at two mutable locations of the same type, without deinitializing either.

swap_nonoverlapping

Swaps count * size_of::<T>() bytes between the two regions of memory beginning at x and y. The two regions must not overlap.

with_exposed_provenance

Converts an address back to a pointer, picking up some previously ‘exposed’ provenance.

with_exposed_provenance_mut

Converts an address back to a mutable pointer, picking up some previously ‘exposed’ provenance.

without_provenance

Creates a pointer with the given address and no provenance.

without_provenance_mut

Creates a pointer with the given address and no provenance.

write

Overwrites a memory location with the given value without reading or dropping the old value.

write_bytes

Sets count * size_of::<T>() bytes of memory starting at dst to val.

write_unaligned

Overwrites a memory location with the given value without reading or dropping the old value.

write_volatile

Performs a volatile write of a memory location with the given value without reading or dropping the old value.

from_raw_parts (Experimental)

Forms a (possibly-wide) raw pointer from a data pointer and metadata.

from_raw_parts_mut (Experimental)

Performs the same functionality as from_raw_parts, except that a raw *mut pointer is returned, as opposed to a raw *const pointer.

metadata (Experimental)

Extracts the metadata component of a pointer.
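To give a feel for a few of the stable functions listed above, the sketch below exercises `write`, `read`, `swap`, and `replace` on `String` values. The helper names are invented for this example, and it is only one of many ways these functions can be combined.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Initializes a slot with `write`, then moves the value out with `read`.
fn write_then_read() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    // SAFETY: `slot` is valid and aligned for one `String`. `write` does not
    // drop the (uninitialized) old contents. `read` then moves the value out;
    // `MaybeUninit` never drops its contents, so there is no double drop.
    unsafe {
        ptr::write(slot.as_mut_ptr(), String::from("hello"));
        ptr::read(slot.as_ptr())
    }
}

/// Swaps two strings, then replaces the first with a new value.
fn swap_and_replace() -> (String, String, String) {
    let mut a = String::from("a");
    let mut b = String::from("b");
    // SAFETY: both pointers come from live, distinct `&mut String`s.
    unsafe { ptr::swap(&mut a, &mut b) };
    // Now `a == "b"` and `b == "a"`.
    // SAFETY: `&mut a` is valid for reads and writes of a `String`.
    let old = unsafe { ptr::replace(&mut a, String::from("c")) };
    // `old` holds the previous `a`, and `a` now holds "c".
    (a, b, old)
}

fn main() {
    assert_eq!(write_then_read(), "hello");
    let (a, b, old) = swap_and_replace();
    assert_eq!((a.as_str(), b.as_str(), old.as_str()), ("c", "a", "b"));
}
```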