Conversation

@cbiffle cbiffle commented Jul 22, 2025

To simplify its implementation, ringbuf currently requires every ringbuf to be declared with an initialization element. It creates a static array containing that element.

If that element is represented as zero, that goes in BSS and is zero-initialized with the rest of BSS.

If that element is represented as anything other than zero, the compiler creates a full-sized "initialization image" copy of the ringbuf in flash that it can then memcpy into RAM on startup.

Using the default Rust enum representation, the easiest way to get a ringbuf initialized to non-zero bytes is to put its initialization sentinel value (e.g. None) as the last variant in the enum. Unfortunately, this is super common.

This commit moves all the ones I could find, which should reduce flash usage in a bunch of cases.
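Sketched concretely (the `Trace` shape below is illustrative, not taken from the codebase):

```rust
// With the default repr, rustc assigns tags in declaration order starting
// at 0 (in practice, not as a language guarantee). Listing the sentinel
// first makes a buffer full of it all-zero bytes, so the static can live
// in BSS instead of needing a flash initialization image.
#[derive(Copy, Clone, PartialEq)]
#[allow(dead_code)]
enum Trace {
    None,             // first variant: tag 0
    IrqReceived(u32), // hypothetical payload-carrying variant
}

// All-zero at rest; the linker can place this in BSS.
static BUF: [Trace; 16] = [Trace::None; 16];
```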

@cbiffle cbiffle requested a review from hawkw July 22, 2025 19:30

@hawkw hawkw left a comment

Looks good to me!

For what it's worth, if the enum has a repr attribute, you can also explicitly set the discriminant for that variant to 0:

```rust
#[derive(Copy, Clone, PartialEq)]
#[repr(u8)]
enum Trace {
    None = 0,
    Whatever(u64),
}
```

or similar. IMO this might be worth considering since it feels slightly less like Hoping The Compiler Does The Right Thing Because We Put Them In Order, but 🤷‍♀️.

It might be worth adding a note about this in the ringbuf docs, so that folks adding new ringbufs in the future don't mess it up?

cbiffle commented Jul 22, 2025

Hmmmm. Doesn't the repr attribute affect niche optimization? I'm concerned it might inflate the ringbuf statics.

hawkw commented Jul 22, 2025

> Hmmmm. Doesn't the repr attribute affect niche optimization? I'm concerned it might inflate the ringbuf statics.

Mmm, it might --- the current thing is fine, we can look into whether the explicit discriminant is worth doing separately.

@aapoalas (Contributor) commented

> Hmmmm. Doesn't the repr attribute affect niche optimization? I'm concerned it might inflate the ringbuf statics.

It does not, assuming you don't have enums like this:

```rust
#[repr(u8)]
enum A {
  None,
  Some(B),
}

#[repr(u8)]
enum B {
  X,
  Y,
}
```

I think here the repr enum layout might guarantee that X is 0 and Y is 1, which then means that the compiler cannot "flatten" A to have values 0/1/2.
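For reference, the size effects are easy to check with `size_of` (a sketch with fresh names; whether a particular niche gets exploited is a rustc implementation detail rather than a language guarantee):

```rust
use core::mem::size_of;

// Default repr on both enums: Inner uses only the tag values 0 and 1,
// leaving 2..=255 as a niche, so the compiler can flatten Outer into a
// single byte.
#[allow(dead_code)]
enum Inner { X, Y }
#[allow(dead_code)]
enum Outer { None, Some(Inner) }

// repr(u8) on the *outer* enum forces an explicit tag byte ahead of the
// payload, so no flattening is possible there: two bytes.
#[allow(dead_code)]
#[repr(u8)]
enum TaggedOuter { None, Some(Inner) }
```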

I was also going to suggest considering repr enums: it would be a little easier to look for "mistakes" like this by searching for " None," in the codebase.

cbiffle commented Jul 22, 2025

I think if I'm going to put much more thought into it, I'd rather eliminate the need for a sentinel value and default ringbufs to uninit. Having a None in an enum that isn't Option kind of bothers me, plus you can currently write ringbuf_entry!(Trace::None); and that produces silly results.

@cbiffle cbiffle force-pushed the cbiffle/zero-ringbuf branch from f301ea0 to 1b13a51 on July 22, 2025 at 19:58

hawkw commented Jul 22, 2025

> I think if I'm going to put much more thought into it, I'd rather eliminate the need for a sentinel value and default ringbufs to uninit. Having a None in an enum that isn't Option kind of bothers me, plus you can currently write ringbuf_entry!(Trace::None); and that produces silly results.

Yup, agreed, making them uninitialized would be a nicer solution. Let's do that at some point.

@aapoalas (Contributor) commented

> I think if I'm going to put much more thought into it, I'd rather eliminate the need for a sentinel value and default ringbufs to uninit. Having a None in an enum that isn't Option kind of bothers me, plus you can currently write ringbuf_entry!(Trace::None); and that produces silly results.

That definitely sounds like the correct way to solve the whole problem, and not even too hard to do. Out of curiosity though, would you have a temporary counter for unset values to avoid reading the uninitialised data? Or would you somehow try to force ringbuf enums to not use the 0 value, so the BSS zero initial values could be read as uninitialised dynamically?

cbiffle commented Jul 22, 2025

> would you have a temporary counter for unset values

@aapoalas there are a couple options -- keeping a size that saturates at the size of the table would do it, for example. That'd cost one more word of memory, but it's probably worth it.

There are alternatives that could let you pack this information into the existing "position" counter used to track the position in the circular buffer, but they'd save at most one word, and are potentially fragile. (Example: use position==0 to indicate empty, store all real positions as x+1)
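A minimal sketch of the saturating-count idea (hypothetical, not the actual ringbuf crate API):

```rust
use core::mem::MaybeUninit;

// Entries start uninitialized; a `len` counter that saturates at the
// capacity records how many slots have ever been written, so readers
// never touch uninitialized data.
struct RingBuf<T: Copy, const N: usize> {
    entries: [MaybeUninit<T>; N],
    pos: usize, // next write index, wraps at N
    len: usize, // number of valid entries, saturates at N
}

impl<T: Copy, const N: usize> RingBuf<T, N> {
    fn new() -> Self {
        RingBuf {
            entries: [MaybeUninit::uninit(); N],
            pos: 0,
            len: 0,
        }
    }

    fn push(&mut self, value: T) {
        self.entries[self.pos] = MaybeUninit::new(value);
        self.pos = (self.pos + 1) % N;
        // The one extra word of state: a saturating fill count.
        self.len = (self.len + 1).min(N);
    }

    fn len(&self) -> usize {
        self.len
    }

    /// Slot i is valid iff i < len, because pushes fill slots 0..len
    /// before wrapping around.
    fn get(&self, i: usize) -> Option<T> {
        if i < self.len {
            Some(unsafe { self.entries[i].assume_init() })
        } else {
            None
        }
    }
}
```

Here the buffer never needs a sentinel variant at all; an unwritten slot simply isn't reachable through `get`.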

@cbiffle cbiffle force-pushed the cbiffle/zero-ringbuf branch from 1b13a51 to 0083ba6 on July 22, 2025 at 21:16
@cbiffle cbiffle enabled auto-merge (rebase) July 22, 2025 21:17
@aapoalas (Contributor) commented

>> would you have a temporary counter for unset values
>
> @aapoalas there are a couple options -- keeping a size that saturates at the size of the table would do it, for example. That'd cost one more word of memory, but it's probably worth it.
>
> There are alternatives that could let you pack this information into the existing "position" counter used to track the position in the circular buffer, but they'd save at most one word, and are potentially fragile. (Example: use position==0 to indicate empty, store all real positions as x+1)

Thank you for the answer, much appreciated <3

I was actually just going to edit my comment as I was looking at the internals of ringbuf and realised I hadn't understood that there's metadata on each entry as well. That kind of changes the equation a little: one alternative would then be to use an Option<NonZeroUN> as the generation value. That'd avoid the word cost of the saturating counter in exchange for... I guess mildly more complicated code?
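A sketch of the `Option<NonZeroUN>` idea, with N = 16 picked arbitrarily (a hypothetical entry layout, not ringbuf's actual one):

```rust
use core::num::NonZeroU16;

// The generation counter doubles as the "has this entry ever been
// written?" flag: None is represented as all-zero bytes, so a BSS-zeroed
// entry reads as unwritten, and the NonZeroU16 niche means the Option
// costs no space over a bare u16.
#[allow(dead_code)]
struct Entry<T> {
    generation: Option<NonZeroU16>, // None == never written
    value: T,
}
```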

@cbiffle cbiffle disabled auto-merge July 22, 2025 21:22
To simplify its implementation, ringbuf currently requires every ringbuf
to be declared with an initialization element. It creates a static array
containing that element.

If that element is represented as zero, that goes in BSS and is
zero-initialized with the rest of BSS.

If that element is represented as anything other than zero, the compiler
creates a full-sized "initialization image" copy of the ringbuf _in
flash_ that it can then memcpy into RAM on startup.

Using the default Rust enum representation, the easiest way to get a
ringbuf initialized to non-zero bytes is to put its initialization
sentinel value (e.g. None) as the last variant in the enum.
Unfortunately, this is super common.

This commit moves all the ones I could find, which should reduce flash
usage in a bunch of cases. A survey of examples in the Cosmo build
(note: many of these savings are not fully realized due to pow2
alignment restrictions):

- Cosmo Seq: -6328
- Spartan7 Loader: -2712
- I2C: -2488
- Every SPI driver: -1068 (so, -2136 total)
- Hiffy: -1564
- Sprot: -1076
- Update: -1036
- Validate: -1036
- Power: -559
- Thermal: -212
- Dump agent: -12 (...?)
@cbiffle cbiffle force-pushed the cbiffle/zero-ringbuf branch from 0083ba6 to 88872f5 on July 22, 2025 at 21:26
@cbiffle cbiffle merged commit 3211ce0 into master Jul 22, 2025
135 checks passed
@cbiffle cbiffle deleted the cbiffle/zero-ringbuf branch July 22, 2025 21:51
hawkw added a commit that referenced this pull request Aug 13, 2025
Currently, Hubris' Rust toolchain is pinned to `nightly-2024-09-17`,
which is from 10 months ago. This is kind of unfortunate, especially
because that nightly is too old to support the Rust 2024 edition, which
means we cannot update our dependencies on any crates where the latest
version is built with Rust 2024.

Beyond just updating the toolchain, there were some noteworthy changes:

- Naked functions are now stable (yay!), but the attribute changed to
  `#[unsafe(naked)]`. Inline assembly in naked functions must now use
  `core::arch::naked_asm!` rather than normal `asm!`. As far as I can
  tell, the only difference between this and regular `asm!` is that it
  does not support `options(noreturn)`, as I believe the `naked_asm!`
  block kind of implies at least some of the codegen differences for
  `noreturn`.
- New warnings on creation of temporary shared references to mutable
  statics showed up in `stm32h7-update-server`, where we were using
  `[T]::as_ptr()` on zero-sized slices in mutable statics that were used
  to get linker-generated addresses. `[T]::as_ptr()` takes an `&self`
  receiver, so this was generating a temporary shared reference to the
  mutable static. I changed this to use `&raw const`, which takes the
  address of the static without creating a shared reference.
- There was a substantial regression in flash and RAM usage on the new
  toolchain due to [a change in the behavior of the `#[used]`
  attribute][1] which revealed [an underlying issue where ringbufs were
  not zero initialized][2]. These issues were resolved separately in
  #2167, #2168, #2170, and oxidecomputer/idolatry#65.

In addition, there were a variety of unremarkable linting changes,
including slightly better dead code detection (which detected some new
dead code), and some annoying clippy nonsense.

Note that this branch requires oxidecomputer/idolatry#65, which updates
`idol` to generate code that doesn't emit warnings with the new
toolchain, and fixes some of the flash/RAM size regression in code
generated by `idol`.

Fixes #2165

[1]: #2165 (comment)
[2]: #2165 (comment)
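The `&raw const` change described in the commit message can be sketched like this (a hypothetical, simplified stand-in: the real statics are `static mut` linker symbols, where the temporary `&self` borrow created by `.as_ptr()` is what triggers the new warning):

```rust
// Hypothetical zero-sized static standing in for a linker-provided symbol.
static REGION: [u8; 0] = [];

fn region_addr() -> *const u8 {
    // `&raw const` takes the address directly, producing a raw pointer
    // without ever materializing a shared reference, unlike `.as_ptr()`,
    // which is an `&self` method and so borrows the static first.
    (&raw const REGION).cast::<u8>()
}
```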
rusty1968 pushed a commit to rusty1968/hubris that referenced this pull request Sep 17, 2025