Closed
Labels
A-layout (Topic: Related to data structure layout (`#[repr]`)), S-writeup-needed (Status: Ready for a writeup and no one is assigned)
joshtriplett commented on Aug 30, 2018
`#[repr(C)]` is meaningful (it guarantees that the union will have the same layout as an equivalent C union would); we do need to determine if `#[repr(Rust)]` wants to diverge from that, though. (Or if we want to guarantee that it'll never diverge in the future.)
Also, some relevant text from C11:
6.5.3.6.6:
6.5.8.5:
6.7.2.1.16:
hanna-kruppe commented on Aug 30, 2018
AFAIK the first paragraph you quoted (6.5.3.6.6) is solely about strict aliasing/TBAA, which Rust doesn't do. The other two quotes seem to practically guarantee that all members of the union start at the same offset, and offset 0 at that unless the `union foo *` -> `struct union_member *` cast adjusts the pointer. Does that sound right?
joshtriplett commented on Aug 30, 2018
@rkruppe Yes. (Also, I think the "suitably converted" there exclusively means a type conversion, not a value change.)
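To make the "all members start at offset 0" reading concrete, here is a minimal sketch that measures field offsets of a `#[repr(C)]` union at runtime. The union `IntOrBytes` and the helper functions are hypothetical names chosen for illustration; the size and alignment noted in the comments assume a common target where `u32` has 4-byte alignment.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

/// Hypothetical union for illustration; it does not come from the thread.
#[repr(C)]
#[allow(dead_code)]
union IntOrBytes {
    int: u32,
    bytes: [u8; 6],
}

/// Returns the byte offset of each field relative to the start of the union.
#[allow(unused_unsafe)]
fn field_offsets() -> (usize, usize) {
    let u = IntOrBytes { bytes: [0; 6] };
    let base = addr_of!(u) as usize;
    // `addr_of!` computes the field addresses without reading the fields.
    // The `unsafe` blocks are only needed on older compilers.
    let int_offset = unsafe { addr_of!(u.int) as usize } - base;
    let bytes_offset = unsafe { addr_of!(u.bytes) as usize } - base;
    (int_offset, bytes_offset)
}

/// Returns (size, alignment) of the union.
/// With `repr(C)` the size is the largest field size (6) rounded up to the
/// largest field alignment (4 for `u32` on common targets).
fn union_layout() -> (usize, usize) {
    (size_of::<IntOrBytes>(), align_of::<IntOrBytes>())
}
```

Both offsets come out as 0 here, which matches what an equivalent C union would do.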
nikomatsakis commented on Sep 5, 2018
Seems like we might as well reserve the right for that, although I don't see much motivation. Maybe we should drill in to some of the more specific questions:
`#[repr(Rust)]` unions?
I am sort of tempted to do so, because I don't know that there is much practical utility to doing otherwise, but I'd be curious to hear of use cases.
joshtriplett commented on Sep 5, 2018
In the interests of full evaluation of alternatives: the only argument I've heard for doing otherwise would be if we could detect that all the variants in a `repr(Rust)` union had "holes" in their representations, and then arrange those representations within the union such as to give the overall union a "hole". We could only do so if we 1) allowed flexibility in representation for `repr(Rust)` unions, and 2) prohibited `repr(Rust)` unions from containing arbitrary unknown bit patterns not expressed by the field types (which `repr(C)` unions have to allow).
I don't believe we should do either of those things, but I wanted to mention the arguments for doing so for completeness.
hanna-kruppe commented on Sep 5, 2018
That's an interesting line of thought! However, what could we actually use these "holes" for? I assume you're referring to padding. I'm not aware of any way to stash discriminants or other data in a contained type's padding. Any write or copy is allowed to omit or clobber padding bytes at random. For example, suppose `Result<(u8, u32), u8>` wanted to place the discriminant or the `u8` in the padding of the tuple: this breaks as soon as someone takes a mutable reference to the tuple and writes to it.
gnzlbg commented on Sep 5, 2018
Do we want `#[repr(Rust)]` unions? What do they allow? E.g. I would be fine with just requiring that all unions must be `#[repr(C)]` for now, adding a warning for non-`#[repr(C)]` unions, and maybe in the 2018 edition turning that into an error.
We can then, at some point, re-consider adding `#[repr(Rust)]` unions, with suitable motivation. I am not saying that they would be useless, but that if we are going to specify them exactly the same as `#[repr(C)]`, we don't need both, and not being able to use them in C FFI would be a downside of `repr(Rust)` unions w.r.t. `repr(C)` here.
We don't have to allow all kinds of types for all `repr`s, and doing so makes us waste time in the specification of each repr.
joshtriplett commented on Sep 5, 2018
@gnzlbg We already have `repr(Rust)` unions in stable, and people are actively using them. People want to be able to build the space-efficient data structures they allow, and similar.
joshtriplett commented on Sep 5, 2018
@rkruppe I'm not talking about padding. I'm talking about things like enums and `bool` not using all the bits in their representation.
If I have a `repr(Rust)` union of a `bool` and a three-variant enum, and I then wrap that in an `Option`, could that fit in one byte?
Again, I don't think that we should do that, but people have suggested doing so.
hanna-kruppe commented on Sep 5, 2018
Ah, that makes more sense. I also don't think we should do this, though, I'm in favor of the "unions are bags of uninterpreted bits" approach that we seem to be slowly converging on (e.g. with the disposition to merge rust-lang/rfcs#2514).
14 remaining items
gnzlbg commented on Oct 12, 2018
What is the layout of union variants when the `repr` of the different variants differs? E.g.:
Currently, `U.a` appears to have array layout instead of `repr(simd)` layout.
hanna-kruppe commented on Oct 12, 2018
Memory layout is no different. Calling convention details are different (and that factors into the relatively superficial difference in the IR we produce observed over there in that PR), but I see no reason to specify those for repr(rust) unions, as they are irrelevant outside of FFI which one should use repr(C) for anyway.
RalfJung commented on Oct 12, 2018
So the distinction between `[...]` and `<...>` in LLVM should just affect calling conventions, but it does affect more, and that's the problem in that PR?
hanna-kruppe commented on Oct 12, 2018
No, arrays vs vectors is a quite important distinction for the IR, but none of those differences except ABI lowering affect the sort of visible behavior we are documenting here.
RalfJung commented on Oct 13, 2018
To summarize the discussion that happened here, the consensus seems to be that `repr(C)` unions have all their fields at offset 0, and for `repr(Rust)` that's likely the most sensible option but we'll have to await the discussion about validity invariants for unions to rule out use-cases like #13 (comment). Any objections?
nikomatsakis commented on Oct 23, 2018
@RalfJung sounds great to me! Do you think you can get a write-up done by this Thursday? Would be good to have something by the meeting. =)
RalfJung commented on Oct 24, 2018
Done: #39
Feels rather short, but what else is there to say?
gnzlbg commented on Nov 20, 2018
It would be useful to use `union`s like `MaybeUninit<T>` in C FFI, but for that `MaybeUninit<T>` would need to have the same `repr` as `T`. Currently we don't have anything like `repr(transparent)` for `union`s, and for the case of `MaybeUninit<T>`, which is a union with two variants, it's unclear to me whether something like `repr(transparent)` would work.
RalfJung commented on Jun 29, 2019
Turns out regex relies on `repr(Rust)` unions having their fields at offset 0:
https://github.com/rust-lang/regex/blob/172898a4fda4fd6a2d1be9fc7b8a0ea971c84ca6/src/vector/ssse3.rs#L80-L83
I bet they are not the only ones...
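For readers unfamiliar with the pattern, here is a hypothetical sketch of the kind of reinterpretation that depends on both fields sharing offset 0. It is not the regex crate's code: `Lanes` and `bytes_to_wide` are made-up names, and it uses `u128` rather than a SIMD type to stay portable. The sketch marks the union `#[repr(C)]`, which is the case where offset 0 is agreed on in this thread; dropping that attribute would make the read rely on the `repr(Rust)` layout discussed above.

```rust
/// Hypothetical union used to view 16 bytes as one wide integer.
#[repr(C)]
#[derive(Clone, Copy)]
union Lanes {
    bytes: [u8; 16],
    wide: u128,
}

/// Reinterprets 16 bytes as a `u128` by writing one field and reading the other.
fn bytes_to_wide(bytes: [u8; 16]) -> u128 {
    let lanes = Lanes { bytes };
    // SAFETY: both fields are 16 bytes, start at offset 0 under `repr(C)`,
    // and every bit pattern is a valid `u128`.
    unsafe { lanes.wide }
}
```

The result should agree with `u128::from_ne_bytes`, which is the safe way to express the same conversion when no union is otherwise needed.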