const-eval interning: get rid of type-driven traversal by RalfJung · Pull Request #119044 · rust-lang/rust
To give some context, the only non-trivial job the interner has is to decide with which mutability the allocations found in the just-evaluated constant should be interned. Allocations generally start out mutable (after all, they are uninitialized at first and then have their initial value written into them during const-eval, which requires them to be mutable), but we want to put them in read-only memory in the final binary if possible.
Usually there is only a single allocation to be interned: the one that holds the final value of the constant. However, there is some code we accept where there is more than one allocation to intern. For instance:
const CONST_VEC: &Vec<i32> = &Vec::new();
There are two allocations to intern here: the one that stores the Vec and the one that stores the reference to the Vec. We sometimes call these secondary allocations that also need interning "inner" allocations.
(This is not promotion! Vec does not get promoted since it has a destructor. Instead this is the "outer scope rule" that gives certain expressions the lifetime of the "outer" scope. For const/static, the "outer" scope is 'static, i.e., this implicitly creates another global. This seems to be somewhat accidental, or at least it was never properly specified or discussed to my knowledge. It also causes some subtle bugs since we do not have a system in place to manage the identity of these globals.)
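A minimal sketch of what the outer scope rule means in practice. The constant is the one from the example above; the helper `as_static` is a hypothetical name added only to make the `'static` lifetime visible.

```rust
// Two allocations end up interned here: the one holding the `Vec` and the
// one holding the reference to it (the value of the constant).
const CONST_VEC: &Vec<i32> = &Vec::new();

/// Hypothetical helper: the temporary behind `CONST_VEC` lives in an
/// implicitly created global, so the reference is valid for `'static`.
fn as_static() -> &'static Vec<i32> {
    CONST_VEC
}
```

Calling `as_static().is_empty()` returns `true`: the empty `Vec` is never dropped because it is stored in the implicit global.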
Old interner
The old interner works as follows: we do a type-based traversal of the final value of the const. That means we recursively go through fields of structs, tuples, enums, ... and when we encounter a reference we also descend into that.
During this descent, we keep track of the current mutability. We start out immutable (except for static mut, where we start mutable). When we go into an UnsafeCell we switch to mutable. When we encounter a shared reference we switch to immutable. When we hit a reference to a not-yet-interned allocation, we use the current mutability plus a Freeze check to determine whether to intern that as mutable or immutable.
We also keep track of any other pointer values we find that are not references (raw pointers, things in union fields, and in theory there could also be pointers stored in padding between fields and similar nasty things). These are the "leftover" pointers. In static, we intern everything from that list as mutable; in const, we raise an error if there are any leftover pointers.
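The traversal described above can be sketched as a toy model. This is not rustc's code: `Value`, `Kind`, `OldInterner`, `is_freeze` and `intern_old` are hypothetical names, values carry their "type" as an enum variant, and the handling of `&mut` (keep the current mutability) and of the root allocation is a simplifying assumption.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Mutability { Not, Mut }

/// Hypothetical typed value: the variant plays the role of the type.
/// Pointer variants carry the id of the allocation they point to.
enum Value {
    Int(i64),
    Struct(Vec<Value>),
    UnsafeCell(Box<Value>),
    SharedRef(usize),
    MutRef(usize),
    RawPtr(usize),
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum Kind { Const, Static, StaticMut }

/// Toy `Freeze` check: no `UnsafeCell` outside of pointer indirection.
fn is_freeze(v: &Value) -> bool {
    match v {
        Value::UnsafeCell(_) => false,
        Value::Struct(fields) => fields.iter().all(is_freeze),
        _ => true,
    }
}

struct OldInterner<'a> {
    allocs: &'a HashMap<usize, Value>,
    interned: HashMap<usize, Mutability>,
    leftover: Vec<usize>,
}

impl<'a> OldInterner<'a> {
    /// Type-driven walk over a value, tracking the current mutability.
    fn visit(&mut self, v: &Value, mode: Mutability) {
        match v {
            Value::Int(_) => {}
            Value::Struct(fields) => {
                for f in fields {
                    self.visit(f, mode);
                }
            }
            // Entering an `UnsafeCell` switches to mutable.
            Value::UnsafeCell(inner) => self.visit(inner, Mutability::Mut),
            // A shared reference switches to immutable.
            Value::SharedRef(id) => self.visit_ref(*id, Mutability::Not),
            // Assumption of this sketch: `&mut` keeps the current mode.
            Value::MutRef(id) => self.visit_ref(*id, mode),
            // Anything that is a pointer but not a reference is "leftover".
            Value::RawPtr(id) => self.leftover.push(*id),
        }
    }

    /// Intern the pointee using the current mode plus the `Freeze` check,
    /// then descend into it.
    fn visit_ref(&mut self, id: usize, mode: Mutability) {
        if self.interned.contains_key(&id) {
            return;
        }
        let allocs: &'a HashMap<usize, Value> = self.allocs;
        let pointee = &allocs[&id];
        let m = if mode == Mutability::Not && is_freeze(pointee) {
            Mutability::Not
        } else {
            Mutability::Mut
        };
        self.interned.insert(id, m);
        self.visit(pointee, mode);
    }
}

fn intern_old(
    allocs: &HashMap<usize, Value>,
    root: usize,
    kind: Kind,
) -> Result<HashMap<usize, Mutability>, String> {
    let start = if kind == Kind::StaticMut { Mutability::Mut } else { Mutability::Not };
    let mut cx = OldInterner { allocs, interned: HashMap::new(), leftover: Vec::new() };
    cx.visit_ref(root, start);
    let OldInterner { mut interned, leftover, .. } = cx;
    if kind == Kind::Const && !leftover.is_empty() {
        return Err(format!("leftover pointers in const: {:?}", leftover));
    }
    // In a static, leftover pointers are interned as mutable.
    // (The toy does not descend further into them.)
    for id in leftover {
        interned.entry(id).or_insert(Mutability::Mut);
    }
    Ok(interned)
}
```

In this model, a root allocation holding only `Value::RawPtr(1)` is an error for `Kind::Const`, while for `Kind::Static` allocation 1 is interned as `Mutability::Mut`.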
New interner
The new interner is a lot simpler. Our goal with static and const is to intern everything as immutable, except possibly the original allocation storing the value itself: a static A: AtomicUsize must be put in mutable memory, of course. We just have to verify that this is sound. To this end we recursively traverse all pointers that we find (this doesn't require type-based traversal; pointers are special magic values and we can easily find them all, we just don't know their type). All pointers to newly interned "inner" allocations must be immutable, i.e., they must be derived from a shared reference without interior mutability.
This relies on #118324, where we now track at const-eval time, using provenance, whether a pointer is mutable or not. When evaluating an &<place> expression, if the pointee is Freeze, we set that pointer as immutable, and all pointers derived from this inherit that bit.
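As a rough illustration of the two paragraphs above, here is a toy model of an untyped, provenance-driven interner. All names (`Prov`, `Allocation`, `intern_new`) are hypothetical and do not match rustc's data structures; an allocation is reduced to the list of pointers found in it, and pointers back to the root or to allocations outside the map are simply skipped, which is an assumption of this sketch.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Mutability { Not, Mut }

/// Hypothetical provenance: which allocation a pointer targets, plus the
/// "immutable" bit tracked during const-eval.
#[derive(Clone, Copy, Debug)]
struct Prov {
    alloc: usize,
    immutable: bool,
}

impl Prov {
    /// `&place`: the pointer is immutable if the pointee is `Freeze`.
    fn shared_ref(alloc: usize, pointee_is_freeze: bool) -> Prov {
        Prov { alloc, immutable: pointee_is_freeze }
    }
    /// `&mut place`: never immutable.
    fn mut_ref(alloc: usize) -> Prov {
        Prov { alloc, immutable: false }
    }
    /// Casts and offsets derive a new pointer that inherits the bit.
    fn derive(self) -> Prov {
        self
    }
}

/// Untyped view of an allocation: only the pointers stored in it matter.
struct Allocation {
    pointers: Vec<Prov>,
}

fn intern_new(
    allocs: &HashMap<usize, Allocation>,
    root: usize,
    root_mutability: Mutability,
) -> Result<HashMap<usize, Mutability>, String> {
    let mut interned = HashMap::new();
    // Only the root may end up mutable (e.g. a static with interior mutability).
    interned.insert(root, root_mutability);
    let mut todo = vec![root];
    while let Some(id) = todo.pop() {
        for p in &allocs[&id].pointers {
            // Sketch assumption: pointers to the root or to already-existing
            // globals (not in `allocs`) are not checked here.
            if p.alloc == root || !allocs.contains_key(&p.alloc) {
                continue;
            }
            // Every pointer to an inner allocation must carry the immutable bit.
            if !p.immutable {
                return Err(format!("mutable pointer to inner allocation {}", p.alloc));
            }
            if !interned.contains_key(&p.alloc) {
                interned.insert(p.alloc, Mutability::Not);
                todo.push(p.alloc);
            }
        }
    }
    Ok(interned)
}
```

In this model, a root holding `Prov::mut_ref(1).derive()` is rejected no matter how the pointer was cast afterwards, while `Prov::shared_ref(1, true).derive()` is accepted and allocation 1 is interned as `Mutability::Not`.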
This allows us to reject, for instance:
const NOT_ACTUALLY_CONST_VEC: *const Vec<i32> = &mut Vec::new() as *mut _ as *const _;
We have to reject that code since if we accepted it, arguably we should say that NOT_ACTUALLY_CONST_VEC.cast_mut().write(...) is allowed: there's nothing that would cause UB here, since the pointer we are writing to was created via &mut after all. But we don't want const to implicitly create global mutable state.
Changes in behavior
This example used to be rejected due to leftover pointers:
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
The concept of a leftover pointer no longer exists, so we no longer reject this.
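A short sketch of the newly accepted pattern, assuming a toolchain that already includes this change. The helper `raw_is_empty` is a hypothetical name; it only reads through the pointer.

```rust
// Accepted once leftover pointers no longer exist: the raw pointer is
// derived from a shared reference to a `Freeze` value, so the inner
// allocation can be interned as immutable.
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;

/// Hypothetical helper that reads through the raw pointer.
fn raw_is_empty() -> bool {
    // SAFETY (assumption of this sketch): the pointer refers to the
    // implicitly created global, which lives for the whole program and
    // is never written to.
    unsafe { (&*CONST_RAW).is_empty() }
}
```

Only reads are shown here; writing through `CONST_RAW.cast_mut()` would target memory that was interned as immutable.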
Note that this is still rejected:
const CONST_RAW: *const Vec<i32> = ptr::addr_of!(Vec::new());
Raw pointers to temporaries are not allowed; only references. (This is not even a const-check I think, but a more general rule.)
I am not aware of any code that we currently accept that would get rejected after this change. We already refuse all code that would cause inner allocations with interior mutability to be created. However, the checks for that are a bit scattered and indirect; the new interner serves as a second line of defense to ensure that we really do not ever accept such code. We have "miri unleashed" tests to ensure this. (These tests disable many of the usual const checks, accepting basically anything in a const fn that the interpreter is able to execute.)
There might be things you can do with const_mut_refs or const_refs_to_cell that would be rejected now, but those are unstable features and we want to restrict them to ensure that nothing mutable leaks into the final value. Anything we currently accept there would be a bug. Closing any holes that we might currently have in these unstable features is the main motivation for this PR.