New scoping rules for safe destruction by pnkfelix · Pull Request #21022 · rust-lang/rust (original) (raw)
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Namely, BlockS
used to carry an fcx: &'blk FunctionContext
, while
the FunctionContext
carries a reference to the arena that holds the
blocks.
This creates a chicken/egg problem where we are attempting to assign
the lifetime 'blk
to both the FunctionContext
and to the arena's
blocks, which does not work under the new lifetime rules for Drop
,
since there has to be some order in which the two are torn down.
To resolve this, I removed the fcx
from the BlockS
, and instead
turn the Block
type (which was formerly &'blk BlockS
) into a struct
carrying both the pointer to the BlockS
as well as the fcx
.
(I often run compiletest
by hand by cut-and-pasting from what make
runs, but then I need to tweak it (cut out options) and its useful to
be told when I have removed an option that is actually required, such
as --android-cross-path=path
.)
note that some of these cases may actually be fixes to latent bugs, in some sense (depending on what our guarantees are about e.g. what a hashmap should be allowed to access in its own destructor).
Added DestructionScope variant to CodeExtent, representing the area immediately surrounding a node that is a terminating_scope (e.g. statements, looping forms) during which the destructors run (the destructors for temporaries from the execution of that node, that is).
insert DestructionScope and block Remainder into enclosing CodeExtents hierarchy.
Switch to constructing DestructionScope rather than Misc in a number
of places, mostly related to ty::ReFree
creation, and use
destruction-scopes of node-ids at various calls to
liberate_late_bound_regions.
middle::resolve_lifetime: Map BlockScope to DestructionScope in fn resolve_free_lifetime
.
add the InnermostDeclaringBlock and InnermostEnclosingExpr enums that are my attempt to clarify the region::Context structure, and that later commmts build upon.
Expanded region-inference graph with enclosing relationship as well as the constraint edges.
improve the debug output for CodeExtent
attached to ty::Region::ReScope
.
loosened an assertion in rustc_trans::trans::cleanup
to account for
DestructionScope
. (Perhaps this should just be switched entirely
over to DestructionScope
, rather than allowing for either Misc
or
DestructionScope
.)
The more fine-grained scopes introduced by this change break some code. A simple
example can be seen in libregex/vm.rs, where two references clist
and nlist
need to have the same lifetime lead to this code breaking:
let mut clist = &mut Threads::new(self.which, ninsts, ncaps);
let mut nlist = &mut Threads::new(self.which, ninsts, ncaps);
...
mem::swap(&mut clist, &mut nlist);
The reason for the breakage is that the thread-value associated with
nlist
has a strictly shorter lifetime than the &mut
-reference for
clist
, but the code in question needs both clist
and list
to be
assigned compatible lifetimes. The usual fix is to revise the code as
follows, moving both of the thread values up above the creation of
the &mut
-references, like so:
let mut cthread = Threads::new(self.which, ninsts, ncaps);
let mut nthread = Threads::new(self.which, ninsts, ncaps);
let mut clist = &mut cthread;
let mut nlist = &mut nthread;
...
mem::swap(&mut clist, &mut nlist);
Likewise, there are other cases where the value needs to have a strictly greater lifetime than the reference taken to that value, and the typical solution is to create the value and bind it to a name in statement preceding the creation of the reference.
Due to these (relatively rare) instances, this is a (wait for it)
[breaking-change]
fix exponential time blowup on compile-fail/huge-struct.rs by keeping the breadcrumbs until end of traversal.
factored drop-checking code out into dropck module.
Adds SafeDestructor
to enum SubregionOrigin
(for error reporting).
"fix" pcwalton changes to avoid premature return from regionck::visit_expr.
includes still more debug instrumentation, as well as a span_note in the case that used to return prematurely.
Since this imposes restrictions on the lifetimes used in types with destructors, this is a (wait for it)
[breaking-change]
more regression tests extracted while bootstrapping.
Since the earlier commits impose rules on lifetimes that make
destructors safe, we no longer need the #[unsafe_destructor]
attribute nor its associated check.
Since this is removing a (somewhat common albeit unsafe) attribute, this is a (wait for it)
[breaking-change]
This was referenced
Jan 15, 2015
This was referenced
Jan 26, 2015
bors added a commit that referenced this pull request
Add CodeExtent::Remainder
variant; pre-req for new scoping/drop rules.
This new enum variant introduces finer-grain code extents, i.e. we now track that a binding lives only for a suffix of a block, and (importantly) will be dropped when it goes out of scope before the bindings that occurred earlier in the block.
Both of these notions are neatly captured by marking the block (and each suffix) as an enclosing scope of the next suffix beneath it.
This is work that is part of the foundation for issue #8861.
(It actually has been seen in earlier posted pull requests, in particular #21022; I have just factored it out into its own PR to ease my own near-future rebasing, and also get people used to the new rules.)
These finer grained scopes do mean that some code is newly rejected by rustc
; for example:
let mut map : HashMap<u8, &u8> = HashMap::new();
let tmp = Box::new(2);
map.insert(43, &*tmp);
This will now fail to compile with a message that *tmp
does not live long enough, because the scope of tmp
is now strictly smaller than
that of map
, and the use of &u8
in map's type requires that the borrowed references are all to data that live at least as long as the map.
The usual fix for a case like this is to move the binding for tmp
up above that of map
; note that you can still leave the initialization in the original spot, like so:
let tmp;
let mut map : HashMap<u8, &u8> = HashMap::new();
tmp = box 2;
map.insert(43, &*tmp);
Similarly, one can encounter an analogous situation with Vec
: one would need to rewrite:
let mut vec = Vec::new();
let tmp = 'c';
vec.push(&tmp);
as:
let tmp;
let mut vec = Vec::new();
tmp = 'c';
vec.push(&tmp);
In some corner cases, it does not suffice to reorder the bindings; in particular, when the types for both bindings need to reflect exactly the same code extent, and a parent/child relationship between them does not work.
In pnkfelix's experience this has arisen most often when mixing uses of cyclic data structures while also allowing a lifetime parameter 'a
to flow into a type parameter context where the type is invariant with respect to the type parameter. An important instance of this is arena::TypedArena<T>
, which is invariant with respect to T
.
(The reason that variance is relevant is this: if TypedArena
were covariant with respect to its type parameter, then we could assign it
the longer lifetime when it is initialized, and then convert it to a subtype (via covariance) with a shorter lifetime when necessary. But TypedArena
is invariant with respect to its type parameter, and thus if S
is a subtype of T
(in particular, if S
has a lifetime parameter that is shorter than that of T
), then a TypedArena<S>
is unrelated to TypedArena<T>
.)
Concretely, consider code like this:
struct Node<'a> { sibling: Option<&'a Node<'a>> }
struct Context<'a> {
// because of this field, `Context<'a>` is invariant with respect to `'a`.
arena: &'a TypedArena<Node<'a>>,
...
}
fn new_ctxt<'a>(arena: &'a TypedArena<Node<'a>>) -> Context<'a> { ... }
fn use_ctxt<'a>(fcx: &'a Context<'a>) { ... }
let arena = TypedArena::new();
let ctxt = new_ctxt(&arena);
use_ctxt(&ctxt);
In these situations, if you try to introduce two bindings via two distinct let
statements, each is (with this commit) assigned a distinct extent, and the region inference system cannot find a single region to assign to the lifetime 'a
that works for both of the bindings. So you get an error that ctxt
does not live long enough; but moving its binding up above that of arena
just shifts the error so now the compiler complains that arena
does not live long enough.
- SO: What to do? The easiest fix in this case is to ensure that the two bindings do get assigned the same static extent, by stuffing both bindings into the same let statement, like so:
let (arena, ctxt): (TypedArena, Context);
arena = TypedArena::new();
ctxt = new_ctxt(&arena);
use_ctxt(&ctxt);
Due to the new code restrictions outlined above, this is a ...
[breaking-change]