Make interned's last_interned_at equal Revision::MAX if they are interned outside a query #804
Conversation
CodSpeed Performance Report: merging #804 will degrade performance by 5.79%.
Force-pushed fea2e4b to fafa4ca
Nonsense benchmark.
I think we discussed this before, but couldn't come up with a test case that would make it fail due to the db lifetime. The idea was to set
Well rust-analyzer needs
I think yes. I thought about that, but removing the assert seemed easier than starting to mess with the query stack in the struct interning. Is there a specific reason you want to keep this assert?
I'd prefer to set the revision to `Revision::MAX`.
Force-pushed 6629b7c to d97ed2c
@ibraheemdev I edited per your suggestion.
Force-pushed d97ed2c to 88c7b9d
src/interned.rs (outdated)
if value.last_interned_at.load() < current_revision {
    value.last_interned_at.store(current_revision);
}
You could use `fetch_max` here to store the maximum of the current revision and `last_interned_at`, to avoid two separate atomic operations.
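A minimal sketch of that suggestion, using a bare `AtomicUsize` as a stand-in (in the real code the field is salsa's `AtomicRevision` wrapper, which comes up below):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative stand-in: `last_interned_at` is really an `AtomicRevision` in
// salsa, not a bare `AtomicUsize`; this only shows the shape of the suggestion.
fn bump_last_interned_at(last_interned_at: &AtomicUsize, current_revision: usize) {
    // One atomic read-modify-write instead of a separate load followed by a store.
    last_interned_at.fetch_max(current_revision, Ordering::AcqRel);
}
```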
This is `AtomicRevision`, not `AtomicUsize`. I would need to define `fetch_max()`, and it's not worth it. It's not like a race condition is problematic here.
`AtomicRevision` is just a small wrapper around `AtomicUsize`. You can see in `OptionalAtomicRevision` how we exposed other atomic methods. This isn't just about races; it's also about avoiding unnecessary atomic operations in a very hot method.
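As a rough sketch only (not salsa's actual code), exposing the method on the wrapper could look like this; the type layout and names below are assumptions made up for illustration:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-ins for salsa's `Revision`/`AtomicRevision`; the real
// types may differ in representation and naming.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Revision(usize);

struct AtomicRevision(AtomicUsize);

impl AtomicRevision {
    fn load(&self) -> Revision {
        Revision(self.0.load(Ordering::Acquire))
    }

    // The method under discussion: a single atomic RMW that keeps the maximum
    // of the stored revision and `rev`, returning the previous value.
    fn fetch_max(&self, rev: Revision) -> Revision {
        Revision(self.0.fetch_max(rev.0, Ordering::AcqRel))
    }
}
```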
`fetch_max()` won't be any faster; it needs to be an atomic RMW. Even on x86, it compiles to a `cmpxchg` loop, whereas the load+store compiles to plain instructions.
We can be faster by making it branchless, though; I will change to that.
Force-pushed 88c7b9d to 22f0dc2
@MichaReiser Addressed comments.
…interned outside a query

There is an assert that `last_interned_at >= last_changed_revision`, and it can fail without this; see the added test.
Force-pushed 22f0dc2 to 97a04e2
value.last_interned_at.store(std::cmp::max(
    current_revision,
    value.last_interned_at.load(),
));
Hmm, that was not the idea. The idea was to use `AtomicUsize::fetch_max` to combine the load and store instructions. Something like

value.last_interned_at.fetch_max(current_revision, Ordering::XXX)

where `AtomicRevision::fetch_max` internally calls `fetch_max`.

Would you mind making this change in a follow-up PR?
Interestingly enough, it seems the `fetch_max` version is worse? https://godbolt.org/z/9efcq7cnh
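For reference, the kind of side-by-side one might put into Godbolt (a sketch on a bare `AtomicUsize`, not the actual salsa code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Branchless load + store: two separate atomic operations, but no RMW.
// On x86-64 both lower to ordinary mov instructions.
pub fn store_max(last: &AtomicUsize, current: usize) {
    last.store(current.max(last.load(Ordering::Acquire)), Ordering::Release);
}

// Single RMW: x86-64 has no native atomic max, so this lowers to a
// `lock cmpxchg` retry loop.
pub fn rmw_max(last: &AtomicUsize, current: usize) {
    last.fetch_max(current, Ordering::AcqRel);
}
```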
Interesting. It sort of makes sense, because both operations are now atomic. It'd be interesting to see whether arm64 produces more efficient instructions.
When these operations affect more than one bit, they cannot be represented by a single x86-64 instruction. Similarly, the fetch_max and fetch_min operations also have no corresponding x86-64 instruction. For these operations, we need a different strategy than a simple lock prefix.
A later version of ARM64, part of ARMv8.1, also includes new CISC style instructions for common atomic operations. For example, the new ldadd (load and add) instruction is equivalent to an atomic fetch_add operation, without the need for an LL/SC loop. It even includes instructions for operations like fetch_max, which don’t exist on x86-64.
It also includes a cas (compare and swap) instruction corresponding to compare_exchange. When this instruction is used, there’s no difference between compare_exchange and compare_exchange_weak, just like on x86-64.
While the LL/SC pattern is quite flexible and nicely fits the general RISC pattern, these new instructions can be more performant, as they can be easier to optimize for with specialized hardware.
https://marabos.nl/atomics/hardware.html
`fetch_max` should be more efficient on ARM64.
That's exactly what I said:
fetch_max() won't be any faster; it needs to be an atomic RMW. Even on x86, it compiles to a cmpxchg loop, compared to load+store that compiles to normal instructions.
And ARM is the same in this regard.
Generally, RMW operations are expensive compared to regular (non-SeqCst) loads/stores. On x86 these will compile to regular (same as non-atomic) load/store instructions, while RMWs entail a strong barrier (a pipeline stall). If the branch can avoid performing a store, the load may be worth it (as a contended store is much more expensive than a branch/load), but I would stay away from the RMW.
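A sketch of the branch-guarded variant described above, again on a bare `AtomicUsize` for illustration:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Skip the store entirely when the stored revision is already up to date,
// so the common case touches the cache line with a load only.
pub fn store_if_newer(last: &AtomicUsize, current: usize) {
    if last.load(Ordering::Acquire) < current {
        last.store(current, Ordering::Release);
    }
}
```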
There is an assert that `last_interned_at >= last_changed_revision`, and it can fail without this; see the added test.

CC @ibraheemdev, you introduced this assert in #602.