make packed struct always use a single backing integer, inferring it if not explicitly provided #10113

Closed
ikskuh opened this issue Nov 7, 2021 · 35 comments
Labels: accepted, proposal

@ikskuh
Contributor

ikskuh commented Nov 7, 2021

So, everyone knows the problems we have with packed structs. Some are implementation problems, but we also have a huge problem with not having a proper specification on how they should work at all, except for "zero padding".

So I want to propose a definition for packed structs that is easy to understand and implement.

Let's consider this code example:

const T = packed struct {
    a: u4,
    b: u4,
};

var bits = [_]u8 { 0x01 };
var t = @bitCast(T, bits);

std.debug.print("{}\n", .{ t });

What do you expect it to print?
Is it T{ .a = 1, .b = 0 }? Then you have assumed a little-endian platform, as on big endian, it will print T{ .a = 0, .b = 1 }.
Check it out on Compiler Explorer.

Personally, I find this confusing, thus I propose:

A packed struct will have semantics similar to an unsigned integer with the same number of bits. Each field is considered an unsigned integer with the same number of bits as the field's type.

To explain the idea, we have this struct:

const T1 = packed struct { // u8
  a: u4, // occupies bits 0..3
  b: u4, // occupies bits 4..7
};

const T2 = packed struct { // u32
  a: u4, // occupies bits 0..3 in our u32
  b: u3, // occupies bits 4..6 in our u32
  c: u25, // occupies bits 7..31 in our u32
};

So the packed struct implements exactly what we would do in a language without packed structs:

var value: T2 = …;
var raw = @bitCast(u32, value); // the backing integer of T2
var a = @truncate( u4, 0x0000000F & (raw >> 0)); // bits 0..3
var b = @truncate( u3, 0x00000007 & (raw >> 4)); // bits 4..6
var c = @truncate(u25, 0x01FFFFFF & (raw >> 7)); // bits 7..31

This means that we have a predictable model for packed structs, and in the case of T2, we can reason about it:

const integer: u32 = 0x165652B6;
const value = @bitCast(T2, integer);
std.debug.assert(value.a == 6);
std.debug.assert(value.b == 3);
std.debug.assert(value.c == 0x2CACA5);

This will be true for both little and big endian hardware, so it will do what we expect, even if the bytes on disk have a different order.
If a known byte order is required, we can use std.mem.writeIntLittle, @byteSwap and others.
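The worked example above can be checked mechanically. The following is a minimal sketch, assuming Zig 0.11 or later, where `@bitCast` and `@truncate` take a single argument and infer the result type (the thread predates that syntax). The helper names `extractA`, `extractB` and `extractC` are made up for illustration.

```zig
const std = @import("std");

// Same layout as T2 above, with the backing integer spelled out.
const T2 = packed struct(u32) {
    a: u4, // bits 0..3
    b: u3, // bits 4..6
    c: u25, // bits 7..31
};

// Hypothetical helpers: manual extraction with shifts, for comparison
// with the field accesses. @truncate keeps only the low bits.
fn extractA(raw: u32) u4 {
    return @truncate(raw >> 0);
}

fn extractB(raw: u32) u3 {
    return @truncate(raw >> 4);
}

fn extractC(raw: u32) u25 {
    return @truncate(raw >> 7);
}

test "fields are bit ranges of the backing integer" {
    const integer: u32 = 0x165652B6;
    const value: T2 = @bitCast(integer);

    try std.testing.expect(value.a == 6);
    try std.testing.expect(value.b == 3);
    try std.testing.expect(value.c == 0x2CACA5);

    // The shift-based version agrees with the field accesses.
    try std.testing.expect(extractA(integer) == value.a);
    try std.testing.expect(extractB(integer) == value.b);
    try std.testing.expect(extractC(integer) == value.c);

    // Casting back yields the original integer.
    try std.testing.expect(@as(u32, @bitCast(value)) == integer);
}
```

Nothing in the test inspects individual bytes in memory, which is why the same assertions are expected to hold on both little- and big-endian targets.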

In addition, we can declare a struct as packed struct(u32), which will enforce a struct size of 32 bits, and we will get a compiler error if that isn't the case.

This will also guarantee that this struct is handled as a u32 and can be used as such, similar to enums. This also guarantees that loads and stores will not be sliced into more than one access if possible and allows atomic load/store for such packed structs. (See #5049).
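As a sketch of the explicit backing integer, assuming Zig 0.11 or later: the `Control` register and its field names below are hypothetical and only serve to show how the field widths have to add up to the declared 32 bits.

```zig
const std = @import("std");

// Hypothetical control register; names and widths are illustrative only.
const Control = packed struct(u32) {
    enable: bool, // bit 0
    mode: u3, // bits 1..3
    divider: u12, // bits 4..15
    _reserved: u16 = 0, // bits 16..31, explicit padding up to 32 bits
};

comptime {
    // The struct is exactly as wide as its backing integer.
    std.debug.assert(@bitSizeOf(Control) == 32);
    std.debug.assert(@sizeOf(Control) == @sizeOf(u32));
}

test "register value is the OR of the shifted fields" {
    const reg = Control{ .enable = true, .mode = 5, .divider = 0x123 };
    const raw: u32 = @bitCast(reg);
    // 1 | (5 << 1) | (0x123 << 4) == 0x123B
    try std.testing.expect(raw == 0x123B);
}
```

If the `_reserved` field were removed, the fields would total only 16 bits and the declaration should be rejected at compile time, which is the size check described above.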

What needs to be clarified:

  • How to handle floats (float endianness is a thing)
  • How to handle pointers in a packed struct
  • How to handle pointers to struct fields

Related issues:

Regards,
xq
@ikskuh ikskuh added the proposal label Nov 7, 2021
@praschke
Contributor

praschke commented Nov 7, 2021

as a weak usecase (weak because this could just be a quirk of X11), the X protocol embeds bitmasks in u16 and u32 fields whose layout is dependent on the byte-order specified by the client in the protocol setup. i don't know if the preferred representation for bitmasks is defining a bunch of 1 << n constants, or packed structs, but to express these elements with packed structs i would have to define two different sets.

it's not super terrible, but having packed structs act however shifts act on the target architecture seems more correct to me.

@ghost

ghost commented Nov 7, 2021

The packed struct { a: u4, b: u4 } example could be a compiler bug. Endianness determines the byte order only, while the bit order within a single byte is entirely up to the compiler and can be made consistent across all platforms.

That said, it's true that endianness is a problem for packed structs. However, enforcing a single consistent endianness is not a costless action. For example, if you have

const Vec3 = packed struct {
    x: i32,
    y: i32,
    z: i32,
};

then with this proposal every field access would require byte order conversion on big-endian platforms.

Ultimately, the problem is that packed structs are used for two different purposes:

  1. As an ordinary struct, but with defined field order and no extra padding.
  2. As a bit-for-bit data serialization format.

Ideally, we would have two different struct types to match the different use cases. Maybe something like serialized struct {}. Or even struct {} vs struct(.Packed) {} vs struct(.Serialized) {}.

@ikskuh
Contributor Author

ikskuh commented Nov 7, 2021

then with this proposal every field access would require byte order conversion on big-endian platforms.

No, it would not. y would still occupy bits 32…64 in the packed struct, which can be extracted by "bitshifting". There is no endianness encoding in that, only bit offsets in a larger "integer". This means that reading z on a big-endian platform will read from byte offset 0, while reading z on a little-endian platform will read from byte offset 8. The fields are not laid out consecutively in memory as they are now.

So "As an ordinary struct, but with defined field order and no extra padding." will be true, but the field order will be swapped in memory for big-endian platforms.
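The bit-offset view of Vec3 can be sketched as follows, assuming Zig 0.11 or later and the inferred backing integer (u96 here). The test only looks at bit positions inside that integer and makes no claim about byte offsets in memory.

```zig
const std = @import("std");

const Vec3 = packed struct {
    x: i32,
    y: i32,
    z: i32,
};

test "Vec3 fields sit at fixed bit offsets in a 96-bit integer" {
    try std.testing.expect(@bitSizeOf(Vec3) == 96);
    try std.testing.expect(@bitOffsetOf(Vec3, "x") == 0);
    try std.testing.expect(@bitOffsetOf(Vec3, "y") == 32);
    try std.testing.expect(@bitOffsetOf(Vec3, "z") == 64);

    const v = Vec3{ .x = 1, .y = -2, .z = 3 };
    const raw: u96 = @bitCast(v);

    // Extract y by shifting, with no byte swap involved.
    const y_bits: u32 = @truncate(raw >> 32);
    const y: i32 = @bitCast(y_bits);
    try std.testing.expect(y == -2);
}
```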

Imho, the use cases are:

  • Reflection of hardware registers (where we need packed struct(u32))
  • Compression of in-memory structures (where the memory layout is completely irrelevant)
  • Serialization data (where one might still apply manual byte order swapping)

Only the last use case requires extra work for handling endianness, but this is relevant in any case, as file formats, network protocols and such all need to define a byte order, and packed structs aren't the right tool in most cases here anyway.

@SpexGuy
Contributor

SpexGuy commented Nov 7, 2021

What happens with a very large packed struct? (Over 65536 bits). Is it still built like one giant integer, laid out in reverse field order on big endian? Also what about npot sizes (like 24 bits)? Does that round up to 32 or end at 24?

@ghost

ghost commented Nov 7, 2021

@MasterQ32,

This means that reading z on a big-endian platform will read from byte offset 0, while reading z on a little-endian platform will read from byte offset 8. The fields are not laid out consecutively in memory as they are now.

Sorry, I had misunderstood that part. Maybe because I wouldn't ever expect a "packed" struct to have its field order reversed on some platforms. 😄

Also, correct me if I'm wrong, but wouldn't this convention make packed structs useless for parsing binary data? Or at least you would have to always define two structs with fields A, B, C and C, B, A respectively, and then use the appropriate struct depending on platform endianness?

@ikskuh
Contributor Author

ikskuh commented Nov 7, 2021

What happens with a very large packed struct? (Over 65536 bits). Is it still built like one giant integer, laid out in reverse field order on big endian?

The integer layout is only a mental layout, how access is implemented for non-integer backed structs isn't defined. So yeah, for huge big-endian structs, the order would still be "reverse" to allow efficient access of parts.

Also what about npot sizes (like 24 bits)? Does that round up to 32 or end at 24?

It will be the same as u24, so in a normal struct, it will take up 3 bytes aligned to 4, and in a packed struct, it will take exactly 24 bits.

@N00byEdge
Contributor

N00byEdge commented Nov 7, 2021

There are many more things you have to consider when working with packed structs. When you care about the bit order, bitfields are probably used together with mmio. It has concepts of such things like access/register sizes. I'm not sure if packed structs should have all of these covered. I have written a little library for working with mmio registers which specify bit offsets of fields, while still encompassing them within a register with a certain access/register size something like this:

    command_status: extern union {
        raw: u32,

        start: bf.Boolean(u32, 0),
        recv_enable: bf.Boolean(u32, 4),
        fis_recv_running: bf.Boolean(u32, 14),
        command_list_running: bf.Boolean(u32, 15),
    },

Here each bit offset is explicitly noted, which I believe is going to be desired when you want to deal with these kinds of layouts anyways. The only thing I'm missing is some kind of way of getting rid of the repeated register type/size.

I think MMIO-related structs should either be provided as a library solution like the one above, or at least get something with a similar syntax (but hopefully without the repeated type). Maybe a register struct?

@ikskuh
Contributor Author

ikskuh commented Nov 7, 2021

There you have more issues than just the bit order. It has concepts of such things like access/registers sizes.

I know, this is covered by the integer-backed struct, that will be handled like the integer you base it on. See #5049 for a wider discussion of this topic

@N00byEdge
Contributor

Oh, I see. Yes that encompasses exactly what I was trying to say here. Nevermind my comment.

@andrewrk andrewrk added this to the 0.10.0 milestone Nov 20, 2021
@DanB91

DanB91 commented Nov 28, 2021

To build on this, I currently (and very carefully) use packed structs to represent bit fields in MMIO hardware registers. It'd be nice if we can actually specify the backing register size. so something like this:

const T1 = packed struct(Type) {
  a: u4, // occupies bits 0..3
  b: u4, // occupies bits 4..7
};

Where Type must be a standard unsigned type that is a byte multiple (e.g. u8, u16, etc). In this case it would be u8. On top of that, if the sum of the bits doesn't equal the size of the backing type, you get a compile error (not sure if this would be too much).

This doesn't clarify the 3 points (handling floats, etc) in the original post, so not sure if this would conflict with them. Maybe I'm asking for a completely different structure type here? Like instead of packed struct what I'm asking for should be called bitfield?

@SpexGuy
Contributor

SpexGuy commented Nov 29, 2021

@DanB91 you might be interested in #5049 😉

@DanB91

DanB91 commented Nov 29, 2021

@DanB91 you might be interested in #5049 😉

Oh man, this proposal is exactly what I was thinking, thanks!

@andrewrk andrewrk modified the milestones: 0.11.0, 0.10.0 Dec 23, 2021
@andrewrk andrewrk added the accepted label Dec 23, 2021
@andrewrk andrewrk changed the title from "A solution to the packed struct issue" to "make packed struct always use a single backing integer, inferring it if not explicitly provided" Jan 14, 2022
@topolarity
Contributor

topolarity commented Feb 11, 2022

What's the plan of action for arrays inside a packed struct?

Unfortunately, if standard array ordering is always used, a single large @byteSwap will not be a valid endianness conversion for packed structs containing arrays, and the correspondence of array elements to the struct layout might be unexpected.

For example:

const Foo = packed struct {
    x: i32,
    y: [2]i32,
};

With standard array ordering and this proposal Foo.y[1], not Foo.y[0], is adjacent to x on big-endian systems.

@topolarity
Contributor

topolarity commented Feb 14, 2022

A few options I can imagine would be:

  1. Prohibit arrays within packed struct entirely
  2. Order arrays within packed structs in reverse order. Use an offset-encoded pointer or similar to prevent casting to a standard-order array pointer
  3. Add a stride to arrays/pointers. Use a negative stride for reverse-order arrays
  4. Do nothing. Array ordering in a packed struct is just an inconsistency your code has to deal with.

@ghost

ghost commented Feb 14, 2022

There is a fairly common situation where this proposal would increase awkwardness, namely if a binary data format specifies big-endian encoding for its integers (think network headers). Consider this example:

const X = packed struct {
    a: u16,
    b: u16,
    c: u32,
};

With normal field ordering, we have the following layouts:

[a1 a0 b1 b0 c3 c2 c1 c0]  // physical data layout per the standard
[a1 a0 b1 b0 c3 c2 c1 c0]  // struct X on big-endian platform
[a0 a1 b0 b1 c0 c1 c2 c3]  // struct X on little-endian platform

Thus, when parsing this format into the struct, a byte swap on the individual fields needs to be performed on an LE platform, and no action is necessary on BE.

With this proposal, the situation would change to:

[a1 a0 b1 b0 c3 c2 c1 c0]  // data format
[c3 c2 c1 c0 b1 b0 a1 a0]  // struct X on BE
[a0 a1 b0 b1 c0 c1 c2 c3]  // struct X on LE

Now both platforms need to perform a conversion. On little-endian it is still a per-field byte swap. But on big-endian we now need to reverse the order of fields without changing the fields themselves. If, on the other hand, the data format is little-endian, then no action is required on LE platforms and a whole-struct byte reversal is needed on BE. Except if the struct contains an array, as pointed out by @topolarity. Then we need to perform the big byte swap piecewise around the array(s), and also a per-element byte swap within the array, with the rules applying recursively if the elements are themselves packed structs.

Before, we had a problem with byte order. Now we have two problems.

@ikskuh
Contributor Author

ikskuh commented Feb 14, 2022

There is a fairly common situation where this proposal would increase awkwardness, namely if a binary data format specifies big-endian encoding for its integers (think network headers)

I disagree that using packed struct is the right choice here, due to several reasons:

  • performance (accessing a u32 in a packed struct is pretty much always four single-byte reads, as the field can be unaligned)
  • byte order (packed structs aren't portable anyways, so you have to either perform a byte swap on the underlying integer or a byte swap on each individual field depending on the platform)
  • versioning of protocols often adds fields that might be "in between" other fields. using a struct here enforces the creation of one struct per variant which can easily grow exponentially. Using a simple res.field = if(has_field) readField() else default_value makes the code more scalable and portable

We have the guarantee that TypeInfo.Struct.fields is always in declared order, so you can easily build a serializer that reads/writes fields in order and performs byte swaps if necessary, taking no more time than reading a packed struct and performing all necessary byte swaps by hand.
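A reflection-based reader of this kind might look like the sketch below. It assumes Zig 0.11 or later (`field.type`, single-argument `@bitCast`), a big-endian wire format, and a struct whose fields are all byte-multiple integers. The function name `readBigEndian` is hypothetical, and X is a plain struct here rather than a packed one.

```zig
const std = @import("std");

// Plain struct; only the declared field order matters for the reader.
const X = struct {
    a: u16,
    b: u16,
    c: u32,
};

// Hypothetical helper: reads the fields of T in declared order from a
// big-endian byte stream, converting each one to native byte order.
fn readBigEndian(comptime T: type, bytes: []const u8) T {
    var result: T = undefined;
    var offset: usize = 0;
    inline for (std.meta.fields(T)) |field| {
        const size = @sizeOf(field.type);
        // Copy the field's bytes exactly as they appear on the wire.
        const buf: [size]u8 = bytes[offset..][0..size].*;
        // Reinterpret in native order, then swap only on little-endian hosts.
        const wire: field.type = @bitCast(buf);
        @field(result, field.name) = std.mem.bigToNative(field.type, wire);
        offset += size;
    }
    return result;
}

test "reads [a1 a0 b1 b0 c3 c2 c1 c0] on any host" {
    const bytes = [_]u8{ 0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0 };
    const x = readBigEndian(X, &bytes);
    try std.testing.expect(x.a == 0x1234);
    try std.testing.expect(x.b == 0x5678);
    try std.testing.expect(x.c == 0x9ABCDEF0);
}
```

Because the conversion goes through `std.mem.bigToNative`, the per-field swap disappears on big-endian hosts without any change to the calling code.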

@ghost

ghost commented Feb 14, 2022

There is a fairly common situation where this proposal would increase awkwardness, namely if a binary data format specifies big-endian encoding for its integers (think network headers)

I disagree that using packed struct is the right choice here, due to several reasons

That depends. When parsing binary formats (especially convoluted legacy formats), I think it's a fairly common pattern to read a certain number of bytes corresponding to a particular substructure, cast it to a packed struct, extract the information needed to find and interpret other substructures, and repeat. You should not be obliged to use a deserialization library to do that. Slicing and dicing data by hand should feel natural in a low-level language like Zig.

@ayende
Contributor

ayende commented Feb 14, 2022

FWIW, I'm using packed struct extensively for persistent and network data. That means that I expect to have 1:1 model from the struct definition to the memory representation.

That means that a u32 in a packed struct is expected to be stored in LE on LE systems and vice versa. If I need endianness support, I'll call byte swap directly myself.

@topolarity
Contributor

topolarity commented Feb 14, 2022

Based on @zzyxyzz's point regarding serialization, it seems worth exploring what it would take to enforce a single ordering for packed fields across little- and big-endian systems.

We could define the field ordering on big-endian systems to match the little-endian field ordering.

Every packed field would need a well-defined packed representation (i.e. representation as a contiguous set of @bitSizeOf bits). The load sequence would be:

  1. Perform a sufficiently-large, unaligned load including the field's bits
  2. Slice out the packed representation (byteSwap + shift/mask)
  3. Convert to the correct integer value, if needed
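The load sequence above could be sketched roughly as follows for a small unsigned field. This is not how the compiler implements it; it assumes Zig 0.11 or later, a field that fits inside a 4-byte window, and bits numbered from the least significant bit of byte 0 on every platform. The helper name `loadPackedField` is made up for illustration.

```zig
const std = @import("std");

// Hypothetical helper: extracts an unsigned field of type Int (at most
// 32 bits, ending inside the 4-byte window) that starts at bit_offset.
fn loadPackedField(comptime Int: type, bytes: *const [4]u8, bit_offset: u5) Int {
    // Step 1: one wide load covering all of the field's bits.
    const loaded: u32 = @bitCast(bytes.*);
    // Step 2: byte swap on big-endian hosts only (a no-op on little-endian),
    // then shift the field down and drop the bits above it.
    const word = std.mem.littleToNative(u32, loaded);
    // Step 3 is not needed for plain unsigned integers.
    return @truncate(word >> bit_offset);
}

test "fields come out the same on either endianness" {
    // 0x165652B6 stored least significant byte first.
    const bytes = [_]u8{ 0xB6, 0x52, 0x56, 0x16 };
    try std.testing.expect(loadPackedField(u4, &bytes, 0) == 6);
    try std.testing.expect(loadPackedField(u3, &bytes, 4) == 3);
    try std.testing.expect(loadPackedField(u25, &bytes, 7) == 0x2CACA5);
}
```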

This comes with costs on BE systems, but it depends on the field type:

| Field | Extra Load/Store Ops |
|---|---|
| Byte-aligned power-of-two integer | None [1] |
| Unaligned power-of-two integer | (2x) @byteSwap |
| Not crossing a byte boundary | None [2] |
| Smaller than a byte, crossing boundary | @byteSwap [3] |
| Larger than a byte, NPOT | Depends on packed representation |

The packed representation for power-of-two integers (e.g., u8, u32, u64) would be exactly their standard representation, including byte-order based on endianness.

The hard cases are the >8-bit, NPOT integers like u15, u24, etc. A well-defined packed representation is needed for these to work with the packed-structs-as-serialization-format use case.

Footnotes

  1. Power-of-two integers already have a packed representation and require only standard byte-endian conversion. On big-endian systems, the "conversion" is included as part of the load, so no extra operations are needed.

  2. The load is just a single-byte load, no platform differences

  3. For integers smaller than a byte, their packed representation is just their serialized bytes. This requires no conversion, only slicing.

@ghost

ghost commented Feb 14, 2022

@topolarity
I would much prefer your solution. If I understand correctly, the rule is

  1. For objects completely filling a byte-aligned region, the platform endianness applies.
  2. For objects sharing bytes with other objects, endianness is not a well-defined concept anyway, so we simply say that bits are filled from least to most significant on all platforms. On BE platforms loads and stores to such fields only will require a byte swap in addition to the mask and shift operations. But such accesses are slow anyway so the overhead should be acceptable.

I admit this is less elegant than the original proposal, but I think that leaving the fields in the declared order more than makes up for that.

@topolarity
Contributor

topolarity commented Feb 14, 2022

That's a much simpler (better) description in the same spirit :-)

I hadn't thought to choose the packed representation based on the field alignment, but I like that solution

Maybe a less implicit rule is that any manually-aligned, byte-multiple integer (align(1) or greater) would be encoded in standard platform byte-order, while any other integer is encoded LSB to MSB (requiring at most an additional byteSwap to load/store)

@ghost

ghost commented Feb 15, 2022

Upon reflection, the alignment-based solution is also far from ideal. Having the layout change because you're off by one bit is inelegant, not to mention a footgun. But I think we're trying to fit a round peg into a square hole here. Pretty much all the hairy problems and open questions are caused by bit granularity. Any rigorous solution I can think of severely interferes with the simplicity and usability of ordinary byte-packed structs, and vice versa. So how about this: We let packed structs be packed structs, and introduce a separate bitfield type with the semantics of the original proposal.

Packed structs

  • Fields are laid out sequentially and without padding, unless forced by an explicit alignment.
  • Fields have platform-native endianness.
  • Fractional-byte fields are not allowed, unless manually aligned/padded to whole bytes.
  • Integer-backed packed structs facilitate register mapping and atomic operations. The backing integer mostly determines the size only. In cases where the struct needs to be treated as an actual integer, it has platform-native endianness, i.e., what you'd get by casting it to a byte array and then to the backing integer type.
  • Field pointers, contained arrays, and conversions to and from byte arrays work exactly as you'd expect.
  • Intended use cases:
    • Dense in-memory representation with reasonably efficient access
    • Serialization and deserialization

Bit fields

  • Syntax: bitfield(u8) { a: u4, b: u4 }, with the backing integer being optional.
  • The bitfield is logically a single integer, with bits filled in order from least to most significant. On big-endian platforms this causes fields to be laid out backwards, and may require an end-to-end byte swap if a bitfield is used to parse binary data.
  • To keep the semantics well defined, there are some restrictions on field types:
    • Integers (signed and unsigned) are allowed, along with Booleans and integer-backed structs and enums.
    • Bitfields are allowed.
    • Arrays are allowed, but have to follow the same layout rules as everything else, including byte-reversal on BE platforms.
    • Pointers are not allowed, due to platform-dependent size.
    • Floats are not allowed, due to inconsistent representation on some platforms.
    • Structs, enums and anything else that doesn't have a fixed representation are not allowed.
  • Field pointers are not allowed unless someone works out how to handle them.
  • The intended use cases are:
    • Bit sets and flag arrays
    • Maximum-density in-memory representation with slower access
    • Serialization and parsing of bit-packed binary data
  • There are a lot of possibilities to add more bitfield-specific functionality in the future:
    • Support for logical operations, rotations, clz/ctz, etc.
    • Bit slicing, extraction, assignment.
    • Exact position specifiers (bitfield { 0..4 => a: u4, ... }).
    • And more. But this is out of scope for this proposal.

@topolarity
Contributor

topolarity commented Feb 15, 2022

Yeah, all the existing proposals have some flaw (in terms of surprise, performance, etc.) by not allowing the user to be explicit about different representations across systems.

If we are going to expose that representational complexity to the user, I'd prefer to just be able to override default platform ordering with an "endian(X)" tag.

It would work like this:

  • If you want a packed struct that guarantees that each field is stored LSB to MSB, use endian(.Little) packed struct
  • If you want BE integers inside that struct, then label them explicitly using endian(.Big) u32. This field must completely fill a byte-aligned region.
  • If you don't care about how the packed struct is serialized, and you just want the structure to be compact w.r.t. padding, you don't need to use endian

I'm not sure whether I like this yet, but at least the user maintains strong control, along with good performance.

Note: This would also orthogonalize the current proposal versus the serialization use case. Little-endian packed structs support this use case, regardless of whatever representation is chosen for big-endian machines

@ghost

ghost commented Feb 16, 2022

I'm also not sure I like this :). My main objection would be that defining a "normal" packed struct (i.e. with sequential layout and efficient access) gets a lot more verbose:

// before:
const S1 = packed struct {
    a: u8,
    b: u8,
    c: u16,
    d: u32,
};

// after:
const native_endian = @import("builtin").cpu.arch.endian();
const S2 = endian(.Little) packed struct {
     a: endian(native_endian) u8,
     b: endian(native_endian) u8,
     c: endian(native_endian) u16,
     d: endian(native_endian) u32,
};

It's also not quite clear to me how this will work with fractional-byte fields.

BTW, I've found some old issues (#307, #649) where something similar was proposed, including a solution for pointers to non-byte-aligned fields (e.g., &.Endian.Big :4 u4). But apparently this didn't go anywhere.

@topolarity
Contributor

topolarity commented Feb 16, 2022

Seems that the design space is well-explored at this point. Thanks for the references :)

I wonder if your bit field proposal above could be translated into just two variants:

  • packed struct(u32) = an integer-backed packed struct
  • packed struct = a packed struct with fixed field order

The key constraint is that in a non-integer-backed packed struct any field that crosses a byte-boundary must occupy a byte-aligned region.

This permits the common use cases of <8-bit fields and byte-aligned fields (align(1) T is a valid field for all T). If you have an exotic bit-aligned use case, you can use a packed struct(uX) for that.

packed struct(u32) is byte-order affected, because it "is" a u32. Plain old packed struct is the same layout on all platforms, although its fields may have a platform-specific byte-order.

@ikskuh
Contributor Author

ikskuh commented Feb 16, 2022

The key constraint is that in a non-integer-backed packed struct any field that crosses a byte-boundary must occupy a byte-aligned region.

Can you explain the difference between packed struct and a struct with all fields align(1) then? Or is it just syntax sugar?

const Packed = extern struct {
    a: u32 align(1),
    b: u16 align(1),
    c: u32 align(1),
    d: u8 align(1),
    e: u64 align(1),
};

Because this use case is already supported. For me packed struct is always bit-packed, and thus has to be defined differently from extern structs which are byte packed
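To make the byte-packed layout concrete, the sketch below checks the offsets of the struct above. It assumes Zig 0.11 or later and that a field with align(1) is placed directly after the previous one, which is the behaviour described in this comment.

```zig
const std = @import("std");

const Packed = extern struct {
    a: u32 align(1),
    b: u16 align(1),
    c: u32 align(1),
    d: u8 align(1),
    e: u64 align(1),
};

test "byte-packed extern struct has no padding" {
    // Each field starts where the previous one ends.
    try std.testing.expect(@offsetOf(Packed, "a") == 0);
    try std.testing.expect(@offsetOf(Packed, "b") == 4);
    try std.testing.expect(@offsetOf(Packed, "c") == 6);
    try std.testing.expect(@offsetOf(Packed, "d") == 10);
    try std.testing.expect(@offsetOf(Packed, "e") == 11);

    // 4 + 2 + 4 + 1 + 8 bytes, with no trailing padding.
    try std.testing.expect(@sizeOf(Packed) == 19);
    try std.testing.expect(@alignOf(Packed) == 1);
}
```

Unlike a packed struct, every field here is byte-addressable, so ordinary field pointers are available, though they carry the reduced alignment in their type.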

@topolarity
Contributor

topolarity commented Feb 16, 2022

Two differences:

  1. packed struct supports bit-packed fields which don't cross a byte boundary
  2. extern struct is intended only to match the target C ABI. It's not clear to me that manually-aligned fields should actually be permitted, except to the extent they correctly imitate alignas, which does not support under-aligning

If we ignore any targets with exotic layout rules and the question of whether the described extern struct is compatible with the C ABI (which is a lot to ignore), then it's just sugar :-)

FWIW, your question makes me think that a theoretical align(0) directive could be useful to request bit-packing even in a normal struct - I can't see any problems with that, other than the usual limitations with bit-packed fields

Edit: align(0) was proposed in #3802 and dropped in favor of this proposal

@topolarity
Contributor

topolarity commented Feb 16, 2022

To provide a concrete example, consider a very similar-looking C11 struct:

typedef struct {
    alignas(1) uint32_t a;  // keeps natural alignment - alignas has no effect
    alignas(1) uint16_t b;  // keeps natural alignment - alignas has no effect
    alignas(1) uint32_t c;  // keeps natural alignment - alignas has no effect
    alignas(1) uint8_t d;   // keeps natural alignment - alignas has no effect
    alignas(1) uint64_t e;  // keeps natural alignment - alignas has no effect
} Packed;

This will have a different layout than your Packed struct above, since alignas has no effect if it would weaken a type's natural alignment.

You might want to add a rule that extern struct matches the target C ABI, except if any field has a manually-specified align(...). Maybe that would work, but we'd lose the ability to support structs with alignas, which is part of the C11 standard

@topolarity
Contributor

I stand corrected:

It looks like extern struct is being considered for more than just the C ABI in #6700. Based on the discussion there, sequential forward-aligned field offsets correctly describe struct layout for a very wide range of C ABIs. "The Lost Art of Structure Padding" concludes something similar, noting that NTP has relied on these struct layout assumptions on a wide variety of platforms.

So there may be hope that a generalized extern struct could support this use case after all 👍

alignas will need working out as a wrinkle, but that is probably the lesser of all the immediate concerns

andrewrk added a commit that referenced this issue Feb 24, 2022
This implements #10113 for the self-hosted compiler only. It removes the
ability to override alignment of packed struct fields, and removes the
ability to put pointers and arrays inside packed structs.

After this commit, nearly all the behavior tests pass for the stage2 llvm
backend that involve packed structs.

I didn't implement the compile errors or compile error tests yet. I'm
waiting until we have stage2 building itself and then I want to rework
the compile error test harness with inspiration from Vexu's arocc test
harness. At that point it should be a much nicer dev experience to work
on compile errors.
@andrewrk
Member

This is now implemented in self-hosted, which is now the default compiler.

@fifty-six
Contributor

Is there any alternative/planned alternative for interfacing with C code that uses packed structs? With the current change it's significantly more difficult as any packed struct with an array or any extern struct in it isn't a valid packed type. As an example, my use case is bootloader code and the UEFI spec defines things in a way no longer compatible with the new packed structs - e.g.
image

@ikskuh
Contributor Author

ikskuh commented Sep 9, 2022

Yes. A "packed struct" in C is equivalent to an extern struct in Zig with all fields having align(1). It's planned to have some kind of syntax like extern struct align(1) { … }, which would be equivalent to the C version.

@tleydxdy

how about

  • packed struct(uX) for treating the whole struct as an integer (on some platforms the fields might be in reverse order in memory)
  • packed struct(.bit) where fields are always stored in order in memory and packed bit by bit; performance for field access is out the window, but the bit order is the most consistent
  • packed struct(.byte) where fields are always stored in order in memory and each field is padded to a multiple of 8 bits; access might not be aligned, but at least there's not a ton of shifts

for ease of use, maybe something like C's unnamed struct members could be used, so nesting a packed struct(.bit) inside a packed struct(.byte) is easy and its fields can still be accessed as foo.bar

@ghost

ghost commented Oct 21, 2022

xref #12852

@raddad772

raddad772 commented Jun 25, 2024

There is no magic bullet for byte order. It's painful and will always be painful. It requires too much context on the internal data and there's no simplistic and good way of representing it all, unlike what some proposals are trying to do. Not being able to use a simple packed array inside a struct is a pretty huge feature missing.

Here's how you deal with byte order:

  1. Cry
  2. Suffer
  3. Code the program to deal with it

Byteswap requires impossible context. If someone tries to byteswap a struct with a packed array in it, just give an error! Introduce a second, ByteSwapAllExceptPackedStructArray() kinda function, that does as it says, for packed structs.
