Make `Rc<T>::deref` zero-cost #141348

EFanZh · 2025-05-21T14:14:40Z

This PR makes Rc::deref zero-cost by changing the internal pointer to point directly to the value instead of to the allocation.

This PR is split from #132553, which will also make Arc::deref zero-cost.
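To illustrate the idea in the description, here is a minimal sketch of a handle whose pointer addresses the value rather than the start of the allocation. It is not the PR's implementation: `ToyRc` and `Counts` are hypothetical names, the handle is single-owner (no `Clone`, no weak references), and the counters are simply placed at the start of the allocation, which may differ from the layout the PR uses.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::ops::Deref;
use std::ptr::{self, NonNull};

/// Counters stored in front of the value (simplified stand-in, not the PR's type).
#[repr(C)]
struct Counts {
    weak: Cell<usize>,
    strong: Cell<usize>,
}

/// Hypothetical handle that stores a pointer to the value itself.
struct ToyRc<T> {
    value: NonNull<T>,
}

impl<T> ToyRc<T> {
    /// Layout of the whole allocation and the byte offset of the value within it.
    fn layout() -> (Layout, usize) {
        Layout::new::<Counts>().extend(Layout::new::<T>()).unwrap()
    }

    fn new(value: T) -> Self {
        let (layout, offset) = Self::layout();
        // SAFETY: `layout` has non-zero size because `Counts` is not zero-sized,
        // and both writes stay inside the freshly allocated block.
        unsafe {
            let base = alloc(layout);
            if base.is_null() {
                handle_alloc_error(layout);
            }
            base.cast::<Counts>().write(Counts { weak: Cell::new(1), strong: Cell::new(1) });
            let value_ptr = base.add(offset).cast::<T>();
            value_ptr.write(value);
            ToyRc { value: NonNull::new_unchecked(value_ptr) }
        }
    }

    /// Reaching the counters requires stepping back from the value pointer.
    fn counts(&self) -> &Counts {
        let (_, offset) = Self::layout();
        // SAFETY: the counters live `offset` bytes before the value in the same allocation.
        unsafe { &*self.value.as_ptr().cast::<u8>().sub(offset).cast::<Counts>() }
    }
}

impl<T> Deref for ToyRc<T> {
    type Target = T;

    /// No offset arithmetic here: the stored pointer already addresses the value.
    fn deref(&self) -> &T {
        // SAFETY: `value` points to an initialized `T` for as long as `self` lives.
        unsafe { self.value.as_ref() }
    }
}

impl<T> Drop for ToyRc<T> {
    fn drop(&mut self) {
        let (layout, offset) = Self::layout();
        // SAFETY: the value is initialized and the block was allocated with `layout`.
        unsafe {
            ptr::drop_in_place(self.value.as_ptr());
            dealloc(self.value.as_ptr().cast::<u8>().sub(offset), layout);
        }
    }
}
```

In this sketch the pointer adjustment moves from `deref` to the counter accesses, which is the trade-off the description is pointing at: dereferencing is typically far more frequent than cloning or dropping.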

rustbot · 2025-05-24T10:22:23Z

The Miri subtree was changed

cc @rust-lang/miri

library/alloc/src/raw_rc/mod.rs

oli-obk · 2025-05-26T15:03:21Z

@bors try @rust-timer queue

bors · 2025-05-26T15:04:31Z

⌛ Trying commit f5245ba with merge 8ef4a25...

Make `Rc<T>::deref` zero-cost

This PR makes `Rc::deref` zero-cost by changing the internal pointer so that it points to the value directly instead of the allocation. This is split out from #132553, which will also make `Arc::deref` zero-cost.

bors · 2025-05-26T17:13:37Z

☀️ Try build successful - checks-actions
Build commit: 8ef4a25 (8ef4a25b05973cfbd577205c507a891d07f0ae5f)

rust-timer · 2025-05-26T19:26:48Z

Finished benchmarking commit (8ef4a25): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.2%, 0.7%]	4
Regressions ❌ (secondary)	1.1%	[0.2%, 2.5%]	9
Improvements ✅ (primary)	-1.0%	[-2.0%, -0.5%]	7
Improvements ✅ (secondary)	-1.3%	[-2.0%, -0.1%]	4
All ❌✅ (primary)	-0.5%	[-2.0%, 0.7%]	11

Max RSS (memory usage)

Results (primary -0.3%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	8.4%	[8.4%, 8.4%]	1
Regressions ❌ (secondary)	3.1%	[2.0%, 4.8%]	6
Improvements ✅ (primary)	-4.6%	[-7.1%, -2.1%]	2
Improvements ✅ (secondary)	-4.7%	[-6.4%, -2.9%]	4
All ❌✅ (primary)	-0.3%	[-7.1%, 8.4%]	3

Cycles

Results (primary -0.5%, secondary -1.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.9%	[0.9%, 0.9%]	1
Regressions ❌ (secondary)	2.1%	[2.1%, 2.1%]	2
Improvements ✅ (primary)	-1.9%	[-1.9%, -1.9%]	1
Improvements ✅ (secondary)	-3.9%	[-9.3%, -1.5%]	4
All ❌✅ (primary)	-0.5%	[-1.9%, 0.9%]	2

Binary size

Results (primary 0.2%, secondary 1.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.0%, 1.8%]	29
Regressions ❌ (secondary)	1.8%	[0.0%, 6.4%]	9
Improvements ✅ (primary)	-0.3%	[-0.6%, -0.1%]	5
Improvements ✅ (secondary)	-0.4%	[-0.4%, -0.4%]	1
All ❌✅ (primary)	0.2%	[-0.6%, 1.8%]	34

Bootstrap: 775.728s -> 776.531s (0.10%)
Artifact size: 366.25 MiB -> 366.33 MiB (0.02%)

apiraino · 2025-07-31T13:34:22Z

Could someone comment on the perf run? cc @EFanZh, thanks!

@rustbot author

rustbot · 2025-07-31T13:34:26Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

library/alloc/src/raw_rc/mod.rs

EFanZh · 2025-08-07T11:13:10Z

@apiraino: In my opinion, a proper perf run should be done on #132553. This PR only migrates Rc, while the other one also migrates Arc, allowing some code to be shared between the two types. That could have an impact on performance.

tgross35

Some first-pass comments about the basic structure before looking in depth at the rest.

As a note, I'm having the same problem Jonas and Scott were: this is >4k total changed lines of mission-critical code in a single diff. I'll get through it one way or another, but this is the kind of situation where having >1 commit in the PR would make a huge difference in reviewability. That also lets you build up the design with a commit description for each part of the change, explaining the "why"/"how" each step is done (that context is missing here).

E.g. this could be split into orthogonal patches like:

- Add basic traits
- Introduce RawRc and its most basic methods
- Introduce RawWeak and its most basic methods
- Introduce RcLayout and its most basic methods
- Introduce RawUniqueRc and its most basic methods

Somewhere around here, the basic brand new part should "work", at least conceptually (possibly with a bunch of todo!()s).

- More complicated RawRc and RawWeak methods and traits, split further for anything that makes sense to get its own description
- Replace the Rc implementation
- Test changes and updates
- Debuginfo changes & test updates

(If you're open to this but need workflow advice, happy to help)

tgross35 · 2025-08-13T08:27:15Z

library/alloc/src/raw_rc/mod.rs

+/// Defines the `RefCounts` struct to store reference counts. The reference counters have suitable
+/// alignment for atomic operations.
+macro_rules! define_ref_counts {
+    ($($target_pointer_width:literal => $align:literal,)*) => {
+        $(
+            /// Stores reference counts.
+            #[cfg(target_pointer_width = $target_pointer_width)]
+            #[repr(C, align($align))]
+            pub(crate) struct RefCounts {
+                /// Weak reference count (plus one if there are non-zero strong reference counts).
+                pub(crate) weak: UnsafeCell<usize>,
+                /// Strong reference count.
+                pub(crate) strong: UnsafeCell<usize>,
+            }
+        )*
+    };
+}
+
+// This ensures reference counters have correct alignment so that they can be treated as atomic
+// reference counters for `Arc`.
+define_ref_counts! {
+    "16" => 2,
+    "32" => 4,
+    "64" => 8,
+}


This doesn't really need a macro:

#[repr(C)]
#[cfg_attr(target_pointer_width = "16", align(16))]
#[cfg_attr(target_pointer_width = "32", align(32))]
#[cfg_attr(target_pointer_width = "64", align(64))]
pub(crate) struct RefCounts {
    // ...
}

(we need align(usize))

Do you mean this?

#[cfg_attr(target_pointer_width = "16", repr(C, align(2)))]
#[cfg_attr(target_pointer_width = "32", repr(C, align(4)))]
#[cfg_attr(target_pointer_width = "64", repr(C, align(8)))]
pub(crate) struct RefCounts {
    // ...
}

There doesn't seem to be an `align` attribute.
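As a sketch of the macro-free variant discussed here, the `cfg_attr` form with `repr(C, align(N))` compiles on stable Rust, and the intended property can be checked at compile time. The two `const` assertions are an addition for illustration and are not part of the PR; fields are made `pub` only to keep the example standalone.

```rust
use std::cell::UnsafeCell;
use std::mem::{align_of, size_of};
use std::sync::atomic::AtomicUsize;

/// Stores reference counts, aligned so the fields could be treated as atomics.
#[cfg_attr(target_pointer_width = "16", repr(C, align(2)))]
#[cfg_attr(target_pointer_width = "32", repr(C, align(4)))]
#[cfg_attr(target_pointer_width = "64", repr(C, align(8)))]
pub struct RefCounts {
    /// Weak reference count (plus one if the strong count is non-zero).
    pub weak: UnsafeCell<usize>,
    /// Strong reference count.
    pub strong: UnsafeCell<usize>,
}

// Illustrative compile-time checks (an assumption of this sketch, not in the PR):
// the struct is at least as aligned as `AtomicUsize` and has no padding.
const _: () = assert!(align_of::<RefCounts>() >= align_of::<AtomicUsize>());
const _: () = assert!(size_of::<RefCounts>() == 2 * size_of::<usize>());

/// Helper so the alignment can also be inspected at run time.
pub fn ref_counts_align() -> usize {
    align_of::<RefCounts>()
}
```

Note that the alignment values are in bytes, which is why the 64-bit case uses `align(8)` rather than `align(64)`.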

tgross35 · 2025-08-13T08:33:09Z

library/alloc/src/raw_rc/mod.rs

+pub(crate) unsafe trait RcOps: Sized {
+    /// Increments a reference counter managed by `RawRc` and `RawWeak`. Currently, both strong and
+    /// weak reference counters are incremented by this method.
+    ///
+    /// # Safety
+    ///
+    /// - `count` should only be handled by the same `RcOps` implementation.
+    /// - The value of `count` should be non-zero.
+    unsafe fn increment_ref_count(count: &UnsafeCell<usize>);


Would something like this work?

unsafe trait RcOps: Sized {
    type Count;
    fn count_from_unsafe_cell(count: &UnsafeCell<usize>) -> &Self::Count;
    fn increment_ref_count(count: &Self::Count);
}

Arc defines Count as Atomic<usize> and Rc as Cell<usize>. Then the unsafety of working with UnsafeCell is contained to that single function, and the rest of the functions here can take &Self::Count.

Actually, is UnsafeCell needed at all? If this trait were implemented on the counter (rather than a dummy type) then I think that can be eliminated:

trait RcCounter {
    fn from_usize(value: usize) -> Self;
    fn increment_count(&self);
    fn decrement_count(&self);
    fn is_unique(counts: &RefCounts<Self>);
    // ...
}

struct RefCounts<Count: RcCounter> {
    weak: Count,
    strong: Count,
}

I can see that would be a problem for the const constructor. For that specific case, it is probably reasonable to make RcCounter an unsafe trait, with the safety requirement that it must be transmutable from UnsafeCell<usize>, then do a transmute in RefCounts::new.

> Actually, is UnsafeCell needed at all? If this trait were implemented on the counter (rather than a dummy type) then I think that can be eliminated:

UnsafeCell is used to avoid having different counter types for Rc and Arc, so that they can share the code that operates on the counters. If I used different counter types for Rc and Arc, some functions would have to become generic, which incurs monomorphization overhead, and I would have to argue that the overhead is negligible or can be optimized away. So I decided that using a single UnsafeCell type is the safer choice.

Was this done in response to perf results? I'm thinking that at least the first option I mentioned (type Count) wouldn't be any different since the relevant parts are type-specific anyway.

Even with the second method (RefCounts generic over Count) I wouldn't expect there to be much of a difference: the functions are either generic anyway, or #[inline], or trivially inlineable so it's probably happening anyway. Most of the parts that are large enough to share code are the allocation methods, which don't use RefCounts anyway. We could handle specific cases where it's helpful by turning both versions into a RefCounts<UnsafeCell>. (And I wouldn't expect any of this to be worse than the current impl; though of course perf could show differently)

My concern here is just that it's more difficult to follow the soundness invariants through the code compared to the current version.
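The second alternative in this thread (a trait implemented on the counter type itself) could look roughly like the following. This is only a sketch under assumptions: the method names `current`, `increment`, and `decrement`, the helper `clone_then_drop`, and the memory orderings are illustrative choices, overflow handling is omitted, and the const-constructor concern raised above is not addressed.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counter abstraction, loosely following the `RcCounter` idea.
trait RcCounter: Sized {
    fn from_usize(value: usize) -> Self;
    fn current(&self) -> usize;
    fn increment(&self);
    /// Decrements the counter and returns the new value.
    fn decrement(&self) -> usize;
}

/// Non-atomic counter, as `Rc` would use.
impl RcCounter for Cell<usize> {
    fn from_usize(value: usize) -> Self {
        Cell::new(value)
    }
    fn current(&self) -> usize {
        self.get()
    }
    fn increment(&self) {
        self.set(self.get() + 1);
    }
    fn decrement(&self) -> usize {
        let new = self.get() - 1;
        self.set(new);
        new
    }
}

/// Atomic counter, as `Arc` would use (orderings simplified for the sketch).
impl RcCounter for AtomicUsize {
    fn from_usize(value: usize) -> Self {
        AtomicUsize::new(value)
    }
    fn current(&self) -> usize {
        self.load(Ordering::Acquire)
    }
    fn increment(&self) {
        self.fetch_add(1, Ordering::Relaxed);
    }
    fn decrement(&self) -> usize {
        self.fetch_sub(1, Ordering::Release) - 1
    }
}

/// Counter pair generic over the counter type; no `UnsafeCell` or `unsafe` needed.
struct RefCounts<C: RcCounter> {
    weak: C,
    strong: C,
}

impl<C: RcCounter> RefCounts<C> {
    /// One strong reference, plus the implicit weak reference held by the strong ones.
    fn new() -> Self {
        RefCounts { weak: C::from_usize(1), strong: C::from_usize(1) }
    }

    fn is_unique(&self) -> bool {
        self.strong.current() == 1 && self.weak.current() == 1
    }
}

/// Simulates cloning and then dropping one strong reference.
fn clone_then_drop<C: RcCounter>() -> (bool, bool, bool) {
    let counts = RefCounts::<C>::new();
    let before = counts.is_unique();
    counts.strong.increment();
    let shared = counts.is_unique();
    counts.strong.decrement();
    (before, shared, counts.is_unique())
}
```

The same generic code runs with `Cell<usize>` and `AtomicUsize`. Whether the extra monomorphization matters in practice is exactly the open question in the discussion, and this sketch does not settle it.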

tgross35 · 2025-08-13T08:48:21Z

library/alloc/src/raw_rc/mod.rs

+    /// - `count` should only be handled by the same `RcOps` implementation.
+    /// - The value of `count` should be non-zero.


Safety docs should probably be using "must" rather than "should"; as written, it sounds optional.
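For illustration, the safety section could read as follows with "must". The trait signature is taken from the diff above; `NonAtomic` is a hypothetical implementor added only so the snippet can be exercised, and its body is an assumption rather than the PR's code.

```rust
use std::cell::UnsafeCell;

pub unsafe trait RcOps: Sized {
    /// Increments a reference counter managed by `RawRc` and `RawWeak`.
    ///
    /// # Safety
    ///
    /// - `count` must only be handled by the same `RcOps` implementation.
    /// - The value of `count` must be non-zero.
    unsafe fn increment_ref_count(count: &UnsafeCell<usize>);
}

/// Hypothetical non-atomic implementor, used only to exercise the trait.
pub struct NonAtomic;

unsafe impl RcOps for NonAtomic {
    unsafe fn increment_ref_count(count: &UnsafeCell<usize>) {
        // SAFETY: the caller upholds the contract above, and `UnsafeCell<usize>`
        // is not `Sync`, so no other thread can access `count` concurrently.
        unsafe { *count.get() += 1 };
    }
}
```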

tgross35 · 2025-08-13T09:53:47Z

tests/codegen-llvm/array-of-dangling-weak-uses-memset.rs

@@ -0,0 +1,23 @@
+//@ compile-flags: -Z merge-functions=disabled


It's reasonably obvious from the name, but could you add a comment saying what the test does? We're trying to get a bit better with that. There's also the codegen-llvm/lib-optimizations directory, which might be a good fit for these tests.

(Same for the other test)

EFanZh · 2025-08-13T12:52:00Z

@tgross35: I can split my commit further, but I can only do this in my free time, so it will take a while to finish. Also, I think it's better for me to investigate the perf results on #132553 first and do some optimizations, so that I can base the split on a known-to-be-good commit.

Also, can I have your opinion on the overall design of introducing the RawRc and RawWeak types as the base types for Rc and Arc? If a significant structural change is needed, I want to do it before splitting the commit.

tgross35 · 2025-08-13T17:42:18Z

I love the approach and have wanted to do something similar for a while! The UnsafeCell part is my only concern since it loses a lot of encapsulation that we currently have - of course it works, but if we can use a known counter type then that removes a lot of new unsafe.

tgross35 · 2025-08-13T18:16:57Z

library/alloc/src/raw_rc/mod.rs

+/// Allocates uninitialized memory for a reference-counted allocation with allocator `alloc` and
+/// layout `RcLayout`. Returns a pointer to the value location.
+#[inline]
+fn allocate_uninit_raw_bytes<A>(alloc: &A, rc_layout: RcLayout) -> Result<NonNull<()>, AllocError>


There are a lot of "Returns a pointer to the value location" in the docs, and functions that expect a pointer to the value. What about a new struct RcDataPointer(NonNull<()>) to provide some type safety about what's being pointed to?

The -> NonNull<RefCounts> method would then just be a method on this type.
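A rough sketch of that suggestion follows. Only the name `RcDataPointer` comes from the comment; everything else is assumed for illustration: the simplified `RefCounts`, the `Block` helper that stands in for a real allocation, and the premise that the counters sit immediately before the value.

```rust
use std::cell::UnsafeCell;
use std::mem::size_of;
use std::ptr::{self, NonNull};

/// Simplified stand-in for the PR's `RefCounts`.
#[repr(C)]
struct RefCounts {
    weak: UnsafeCell<usize>,
    strong: UnsafeCell<usize>,
}

/// Pointer to the value slot of a reference-counted allocation.
#[derive(Clone, Copy)]
struct RcDataPointer(NonNull<()>);

impl RcDataPointer {
    /// # Safety
    ///
    /// `ptr` must point to the value slot of a live allocation in which a
    /// `RefCounts` is stored immediately before the value.
    unsafe fn new(ptr: NonNull<()>) -> Self {
        RcDataPointer(ptr)
    }

    /// Returns a pointer to the counters stored just before the value.
    fn ref_counts(self) -> NonNull<RefCounts> {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe {
            let counts = self.0.as_ptr().cast::<u8>().sub(size_of::<RefCounts>());
            NonNull::new_unchecked(counts.cast::<RefCounts>())
        }
    }

    /// Returns the value pointer with a concrete type.
    fn value<T>(self) -> NonNull<T> {
        self.0.cast()
    }
}

/// Stand-in for a real allocation: counters directly followed by a `u64` value.
#[repr(C)]
struct Block {
    counts: RefCounts,
    value: u64,
}

/// Builds a block, wraps its value slot, and reads (weak, strong, value) back.
fn demo() -> (usize, usize, u64) {
    let mut block = Block {
        counts: RefCounts { weak: UnsafeCell::new(1), strong: UnsafeCell::new(1) },
        value: 42,
    };
    // Derive the value pointer from the whole block so that stepping back to
    // the counters stays within the same object.
    let base = ptr::addr_of_mut!(block).cast::<u8>();
    // SAFETY: with `repr(C)`, `value` starts `size_of::<RefCounts>()` bytes into `block`.
    unsafe {
        let slot = NonNull::new_unchecked(base.add(size_of::<RefCounts>()).cast::<()>());
        let data = RcDataPointer::new(slot);
        let counts = data.ref_counts();
        let weak = *counts.as_ref().weak.get();
        let strong = *counts.as_ref().strong.get();
        let value = *data.value::<u64>().as_ref();
        (weak, strong, value)
    }
}
```

Concentrating the `unsafe` contract in `RcDataPointer::new` means functions that accept the newtype no longer need to restate "pointer to the value location" in their own docs.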

rustbot assigned joboet May 21, 2025

rustbot added the T-compiler and T-libs labels May 21, 2025

EFanZh force-pushed the zero-cost-rc-deref branch from df34f84 to d3a7429 Compare May 24, 2025 05:01

EFanZh force-pushed the zero-cost-rc-deref branch 2 times, most recently from bc84ec6 to 19fb34b Compare May 24, 2025 09:00

EFanZh marked this pull request as ready for review May 24, 2025 10:22

rustbot added the S-waiting-on-review label May 24, 2025

EFanZh mentioned this pull request May 24, 2025

Make Rc<T>::deref and Arc<T>::deref zero-cost #132553

Open

teor2345 reviewed May 26, 2025

library/alloc/src/raw_rc/mod.rs (outdated)

EFanZh force-pushed the zero-cost-rc-deref branch from 19fb34b to f5245ba Compare May 26, 2025 15:02

rustbot added the S-waiting-on-perf label May 26, 2025

bors mentioned this pull request May 26, 2025

Remove an unnecessary use of Box::into_inner. #141599

Merged

rustbot added the perf-regression label and removed the S-waiting-on-perf label May 26, 2025

rustbot added the S-waiting-on-author label and removed the S-waiting-on-review label Jul 31, 2025

teor2345 reviewed Jul 31, 2025

library/alloc/src/raw_rc/mod.rs (outdated)

Make Rc<T>::deref zero-cost

e345c2f

EFanZh force-pushed the zero-cost-rc-deref branch from f5245ba to e345c2f Compare August 10, 2025 01:37

tgross35 self-assigned this Aug 10, 2025

tgross35 reviewed Aug 13, 2025
