Closed
Description
e.g. this does not give a MIR borrowck error:
#![feature(thread_local)]
#[thread_local]
static FOO: u8 = 3;
fn assert_static(_t: &'static u8) {}
fn main() {
assert_static(&FOO);
}
This is a problem because, as #17954 shows, thread locals end when the current thread ends. AST borrowck currently makes it an error when thread locals are borrowed for more than the current function.
Metadata
Metadata
Assignees
Labels
Type
Projects
Relationships
Development
No branches or pull requests
Activity
[-]MIR-borrowck: thread locals are allowed to be borrowed for 'static[/-][+]MIR-borrowck: thread locals are unsoundly allowed to be borrowed for 'static[/+]arielb1 commentedon Nov 1, 2017
The same is true for function arguments (see
regions-addr-of-arg
), and for locals in the case of a panic where no destructors run:I'm having all 3 in a single issue because the fix (preventing outstanding borrows to locals at function exit) should catch all 3.
[-]MIR-borrowck: thread locals are unsoundly allowed to be borrowed for 'static[/-][+]MIR-borrowck: locals (and thread locals) can remain borrowed after the function ends[/+]nikomatsakis commentedon Nov 10, 2017
@arielb1 hmm, what exact fix do you have in mind?
With respect to local variables, I think the problem there is that we don't have a proper storage-dead on the unwinding path, no? That is, would it error if you removed the
panic!
? My expectation was that it would report an error when we hit theStorageDead
(either on unwind or not).With respect to thread-locals, we don't have a
StorageDead
, so I guess we need to have a kind of custom check for "when you exit the function, make sure there are no outstanding borrows that have a shallow prefix to a thread-local static".I could imagine trying not to add storage-dead onto the unwind path, but it seems like that might go wrong, since you might have nested scopes that could cause trouble? (i.e., inner variables getting freed before outer ones?)
arielb1 commentedon Nov 13, 2017
@nikomatsakis
We know that all locals have their storage killed at the end of the function, so we don't need to represent that explicitly in MIR. With lexical lifetimes, I suppose you could observe the ordering of
endregion
/storagedead
, but with NLL there wouldn't be a difference.Unless we are planning to deploy MIR borrowck without NLL, that shouldn't be a problem.
nikomatsakis commentedon Nov 13, 2017
@arielb1 so, to be clear, is your expectation that we will not add
StorageDead
onto the unwind path at all, but instead we will rely on some kind of "mass dead" check at function exit?I'm trying to put a finger on what I think that might permit that we might not want to. Given that the checks on StorageDead are a subset of those that occur on DROP, it seems like very little, unless we are talking about structures without destructors -- but those are permitted to have cycles and data that outlives their destruction anyway. So it may be nothing.
arielb1 commentedon Nov 13, 2017
@nikomatsakis
I think it will allow you to have "NLL-style" cycles if your function doesn't have a destructor. aka:
This currently creates a storagedead -> endregion -> storagedead sequence, which causes a "borrow does not live long enough" error, while either with NLL, or if we have a "mass storagedead at function exit", it should compile. However, with the "None terminator", if we add a destructor in scope, the code will start issuing a borrowck error and therefore be inconsistent.
I think one good way to deal with it consistently, if we want a stable MIR borrowck without NLL, would be to introduce a bit of CFG-based analysis to MIR borrowck: if we find a sequence of storagedead and endregion instructions (possibly connected by gotos across basic blocks) end all regions first and then kill all the storage.
nikomatsakis commentedon Nov 13, 2017
I am more and more thinking we will not wind up wanting this -- the timing just doesn't seem to be working out that way.
nikomatsakis commentedon Nov 13, 2017
Mentoring instructions
Well, whatever we do, I think we all agree that it makes sense as a minimal starting point to have some code in borrow-check such that when you return (or unwind) from a function, we report an error if there are any outstanding loans. In other words, any of the "returning terminators" (listed below) are treated as if they kill the storage for all locals.
The "returning terminators" are those that have no successor, with the exception of
Unreachable
:Return
Resume
GeneratorDrop
From the POV of borrow-check, we want to treat those basically the same as if they had done a
StorageDead
for every local, I believe.StorageDead
is the MIR statement that frees the stack space for a local variable. Borrow check currently handles aStorageDead
like so:rust/src/librustc_mir/borrow_check.rs
Lines 291 to 296 in 8efbf7a
As you can see, it invokes
access_lvalue
with the context of a 'shallow write'. The "write" means that it invalidates all the data in the variable -- the "shallow" means that it only invalidates the data stored contiguously in memory, and not the stuff that is reached through a pointer indirection (notably: not borrowed data referenced by the variable).When the borrow check encounters one of our returning terminators, it currently does nothing:
rust/src/librustc_mir/borrow_check.rs
Lines 373 to 380 in 8efbf7a
I think we want to separate out the returning terminators from that match, and then have them iterate over through all the locals in the MIR. The local variables in the MIR are stored in the
local_decls
field. You can iterate through them with something like this:Once that's done, you'll have to edit the tests.
Also, we'll want to handle thread-locals, but we'll come to that later. (Note to future self: thread-locals will need 'deep' style access checks.)
arielb1 commentedon Nov 13, 2017
@nikomatsakis
You also want to end the borrows for all the local (aka
ReScope
) regions when performing the check, otherwise you would get errors in code as obvious asnikomatsakis commentedon Nov 13, 2017
Hmm, I left out something important from those instructions. We also want to modify the bit of code that figures out which borrows are in effect. We need to consider the "return" to cancel all the borrows whose duration are local to the function (an
ReScope
region). This would be modifying the data flow.But I am now wondering if maybe it's better to iterate over the live borrows and examine them, rather than iterating over all the variables and trying to check if accessing them would be legal.
nikomatsakis commentedon Nov 13, 2017
Well, be forewarned, this may be one of those things where we try a few different stabs at it before we are happy. =)
KiChjang commentedon Nov 18, 2017
I'm taking a look at this.