Skip to content

validity invariant for types #8

@nikomatsakis

Description

@nikomatsakis
Contributor

@RalfJung introduced the idea of validity invariants in their blog post "Two kinds of invariants". Presuming we agree with this framing (I do), we need to define these validity invariants.

These invariants need to justify the various sorts of optimizations that we currently do:

  • For example, Option<&T> layout optimization
  • Marking pointers as deferenceable

We need to discuss also the role of uninitialized memory and how it fits in. For example, when are invariants required to hold? Only when "compiler thinks memory is initialized" -- can/should we make that more precise? Also, what about loads of uninitialized integral values (a sometimes useful hack) -- are those valid? What is the effect?

Activity

RalfJung

RalfJung commented on Oct 9, 2018

@RalfJung
Member

Noting it here so we don't forget about it:

  • We will likely say that references have to be aligned to be valid; what does "aligned" mean for an extern type where we don't know the alignment?
  • For fat raw pointers, do we require the metadata to make enough sense to determine size and alignment?
RalfJung

RalfJung commented on Nov 14, 2018

@RalfJung
Member

As but two of many examples for people using &mut to uninitialized data:

(Collecting this here so I will find the links when the discussion starts.)

RalfJung

RalfJung commented on Nov 17, 2018

@RalfJung
Member
RalfJung

RalfJung commented on Nov 24, 2018

@RalfJung
Member

Related discussion about which invariants we require to hold on &*x: rust-lang/rust#56190

RalfJung

RalfJung commented on Nov 25, 2018

@RalfJung
Member

str is an interesting case: It usually only exists behind a ptr indirection, so an &str to non-UTF8 data would not be UB if references do not have to be transitively valid.

Maybe the UTF-8 thing is just a safety invariant, never a validity invariant?

gnzlbg

gnzlbg commented on Nov 26, 2018

@gnzlbg
Contributor

@RalfJung I think that, ideally, for DSTs, validity should just be specified for custom DSTs. That is, the validity of trait objects, slices, and str, should follow from that.

We don't have custom DSTs yet, so I think it is ok to also consider validity of trait objects and slices, and in the future, when custom DSTs are available, try to reconcile those into a more general framework.

Currently, str is an special DST in Rust that's part of the language, but I think it might be worth it to consider it as a custom DST that's just defined as struct str([u8]) so that its validity would follow from the validity of slices. This is how I expect AsciiStr to be implemented as well once we have custom DSTs, and I think it would be weird for str to be more special than it. Also, CStr would be a custom DST as well.

Thinking more into the future, if str were to be implemented in core as a custom DST (@rkruppe discovered: rust-lang/rust#19612), we would still need to make it a lang item, which would still enable the compiler to do specific str optimizations. I don't know if the compiler is already doing some, and it is unclear to me how much value these add. But the requirement that "constructing a &str to a non-UTF8 string is undefined behavior" appears to me to be a requirement that, while checkable on miri, I don't think is checkable in general. For example, with custom DSTs I could have a slice containing only prime numbers: &PrimeNumberSlice. If constructing one that points to non-prime numbers is undefined behavior, then miri would have to detect it at construction point "somehow".

So I like to think that the UTF8 requirement on &str is more like an user-defined pre-condition on some of the &str methods. For example, str::len(&self) returns the length of the &str in bytes, so this method does not necessarily have to require the contents of str to be valid UTF-8 to work "correctly". Creating a str that does not point to an UTF8 string, and then calling some methods whose pre-conditions are violated, could lead to undefined behavior down the road (e.g. a null pointer dereference, read out-of-bounds, etc.) but I don't think it should be a new special type of undefined behavior per se.

RalfJung

RalfJung commented on Nov 27, 2018

@RalfJung
Member

For example, with custom DSTs I could have a slice containing only prime numbers: &PrimeNumberSlice. If constructing one that points to non-prime numbers is undefined behavior, then miri would have to detect it at construction point "somehow".

Sure, custom DSTs come with custom invariants that rustc and miri do not know. These are safety invariants, not validity invariants.

What you are saying is that we could make UTF-8 a mere safety invariant for str. I agree. The open question is whether that reflects reality, and what it is that we want to do.

RalfJung

RalfJung commented on Dec 3, 2018

@RalfJung
Member

ManuallyDrop is an interesting case. It certainly has to be at least "bitstring-valid" (using @asajeffrey's term), does it have to be anything more? Maybe we should, after all, separate the things references say about their pointee from the rest of validity?

gnzlbg

gnzlbg commented on Dec 3, 2018

@gnzlbg
Contributor
RalfJung

RalfJung commented on Dec 3, 2018

@RalfJung
Member

We want (and have) ManuallyDrop<T: ?Sized>, while we are unlikely to get unsized unions any time soon.

RalfJung

RalfJung commented on Dec 3, 2018

@RalfJung
Member

Basically, ManuallyDrop must still be valid because we must still be able to determine its size.

gnzlbg

gnzlbg commented on Dec 3, 2018

@gnzlbg
Contributor
JakobDegen

JakobDegen commented on May 23, 2023

@JakobDegen
Contributor

Closing. There are lots of open questions here, almost all of which are in other issues. This one doesn't seem useful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-validityTopic: Related to validity invariants

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @nikomatsakis@RalfJung@gnzlbg@JakobDegen

        Issue actions

          validity invariant for types · Issue #8 · rust-lang/unsafe-code-guidelines