Skip to content

MIR-borrowck: locals (and thread locals) can remain borrowed after the function ends #45704

Closed
@arielb1

Description

@arielb1
Contributor

e.g. this does not give a MIR borrowck error:

#![feature(thread_local)]

#[thread_local]
static FOO: u8 = 3;

fn assert_static(_t: &'static u8) {}
fn main() {
     assert_static(&FOO);
}

This is a problem because, as #17954 shows, thread locals end when the current thread ends. AST borrowck currently makes it an error when thread locals are borrowed for more than the current function.

Activity

added
I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness
on Nov 1, 2017
changed the title [-]MIR-borrowck: thread locals are allowed to be borrowed for 'static[/-] [+]MIR-borrowck: thread locals are unsoundly allowed to be borrowed for 'static[/+] on Nov 1, 2017
arielb1

arielb1 commented on Nov 1, 2017

@arielb1
ContributorAuthor

The same is true for function arguments (see regions-addr-of-arg), and for locals in the case of a panic where no destructors run:

fn cplusplus_mode(x: isize) -> &'static isize { &x }

fn cplusplus_mode_exceptionally_unsafe(x: &mut Option<&'static mut isize>) {
    let z = 0;
    *x = Some(&mut z);
    panic!("catch me for a dangling pointer!")
}

I'm having all 3 in a single issue because the fix (preventing outstanding borrows to locals at function exit) should catch all 3.

changed the title [-]MIR-borrowck: thread locals are unsoundly allowed to be borrowed for 'static[/-] [+]MIR-borrowck: locals (and thread locals) can remain borrowed after the function ends[/+] on Nov 1, 2017
nikomatsakis

nikomatsakis commented on Nov 10, 2017

@nikomatsakis
Contributor

@arielb1 hmm, what exact fix do you have in mind?

With respect to local variables, I think the problem there is that we don't have a proper storage-dead on the unwinding path, no? That is, would it error if you removed the panic!? My expectation was that it would report an error when we hit the StorageDead (either on unwind or not).

With respect to thread-locals, we don't have a StorageDead, so I guess we need to have a kind of custom check for "when you exit the function, make sure there are no outstanding borrows that have a shallow prefix to a thread-local static".

I could imagine trying not to add storage-dead onto the unwind path, but it seems like that might go wrong, since you might have nested scopes that could cause trouble? (i.e., inner variables getting freed before outer ones?)

arielb1

arielb1 commented on Nov 13, 2017

@arielb1
ContributorAuthor

@nikomatsakis

We know that all locals have their storage killed at the end of the function, so we don't need to represent that explicitly in MIR. With lexical lifetimes, I suppose you could observe the ordering of endregion/storagedead, but with NLL there wouldn't be a difference.

Unless we are planning to deploy MIR borrowck without NLL, that shouldn't be a problem.

nikomatsakis

nikomatsakis commented on Nov 13, 2017

@nikomatsakis
Contributor

@arielb1 so, to be clear, is your expectation that we will not add StorageDead onto the unwind path at all, but instead we will rely on some kind of "mass dead" check at function exit?

I'm trying to put a finger on what I think that might permit that we might not want to. Given that the checks on StorageDead are a subset of those that occur on DROP, it seems like very little, unless we are talking about structures without destructors -- but those are permitted to have cycles and data that outlives their destruction anyway. So it may be nothing.

arielb1

arielb1 commented on Nov 13, 2017

@arielb1
ContributorAuthor

@nikomatsakis

I think it will allow you to have "NLL-style" cycles if your function doesn't have a destructor. aka:

use std::cell::Cell;

// no destructor
struct S<'a>(Cell<Option<&'a S<'a>>>);
fn main() {
    #[cfg(break)] let _d = Box::new(0);
    let x = S(Cell::new(None));
    let y = S(Cell::new(None));
    x.0.set(Some(&y));
    y.0.set(Some(&x));
    panic!()
}

This currently creates a storagedead -> endregion -> storagedead sequence, which causes a "borrow does not live long enough" error, while either with NLL, or if we have a "mass storagedead at function exit", it should compile. However, with the "None terminator", if we add a destructor in scope, the code will start issuing a borrowck error and therefore be inconsistent.

I think one good way to deal with it consistently, if we want a stable MIR borrowck without NLL, would be to introduce a bit of CFG-based analysis to MIR borrowck: if we find a sequence of storagedead and endregion instructions (possibly connected by gotos across basic blocks) end all regions first and then kill all the storage.

nikomatsakis

nikomatsakis commented on Nov 13, 2017

@nikomatsakis
Contributor

if we want a stable MIR borrowck without NLL

I am more and more thinking we will not wind up wanting this -- the timing just doesn't seem to be working out that way.

nikomatsakis

nikomatsakis commented on Nov 13, 2017

@nikomatsakis
Contributor

Mentoring instructions

Well, whatever we do, I think we all agree that it makes sense as a minimal starting point to have some code in borrow-check such that when you return (or unwind) from a function, we report an error if there are any outstanding loans. In other words, any of the "returning terminators" (listed below) are treated as if they kill the storage for all locals.

The "returning terminators" are those that have no successor, with the exception of Unreachable:

From the POV of borrow-check, we want to treat those basically the same as if they had done a StorageDead for every local, I believe. StorageDead is the MIR statement that frees the stack space for a local variable. Borrow check currently handles a StorageDead like so:

StatementKind::StorageDead(local) => {
self.access_lvalue(ContextKind::StorageDead.new(location),
(&Lvalue::Local(local), span),
(Shallow(None), Write(WriteKind::StorageDead)),
flow_state);
}

As you can see, it invokes access_lvalue with the context of a 'shallow write'. The "write" means that it invalidates all the data in the variable -- the "shallow" means that it only invalidates the data stored contiguously in memory, and not the stuff that is reached through a pointer indirection (notably: not borrowed data referenced by the variable).

When the borrow check encounters one of our returning terminators, it currently does nothing:

TerminatorKind::Goto { target: _ } |
TerminatorKind::Resume |
TerminatorKind::Return |
TerminatorKind::GeneratorDrop |
TerminatorKind::Unreachable |
TerminatorKind::FalseEdges { .. } => {
// no data used, thus irrelevant to borrowck
}

I think we want to separate out the returning terminators from that match, and then have them iterate over through all the locals in the MIR. The local variables in the MIR are stored in the local_decls field. You can iterate through them with something like this:

for local in self.mir.local_decls.indices() {
    // insert call to access_lvalue() we saw before in StorageDead here
}

Once that's done, you'll have to edit the tests.

Also, we'll want to handle thread-locals, but we'll come to that later. (Note to future self: thread-locals will need 'deep' style access checks.)

added
E-mentorCall for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.
on Nov 13, 2017
arielb1

arielb1 commented on Nov 13, 2017

@arielb1
ContributorAuthor

@nikomatsakis

You also want to end the borrows for all the local (aka ReScope) regions when performing the check, otherwise you would get errors in code as obvious as

let x = 2;
let y = &x;
panic!()
nikomatsakis

nikomatsakis commented on Nov 13, 2017

@nikomatsakis
Contributor

Hmm, I left out something important from those instructions. We also want to modify the bit of code that figures out which borrows are in effect. We need to consider the "return" to cancel all the borrows whose duration are local to the function (an ReScope region). This would be modifying the data flow.

But I am now wondering if maybe it's better to iterate over the live borrows and examine them, rather than iterating over all the variables and trying to check if accessing them would be legal.

nikomatsakis

nikomatsakis commented on Nov 13, 2017

@nikomatsakis
Contributor

Well, be forewarned, this may be one of those things where we try a few different stabs at it before we are happy. =)

added this to the NLL prototype milestone on Nov 15, 2017
KiChjang

KiChjang commented on Nov 18, 2017

@KiChjang
Member

I'm taking a look at this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-borrow-checkerArea: The borrow checkerE-mentorCall for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @nikomatsakis@arielb1@KiChjang

        Issue actions

          MIR-borrowck: locals (and thread locals) can remain borrowed after the function ends · Issue #45704 · rust-lang/rust