Avoid rehashing Fingerprint as a map key #76233
Merged
@@ -0,0 +1,29 @@
use std::collections::{HashMap, HashSet};
use std::hash::{BuildHasherDefault, Hasher};

pub type UnhashMap<K, V> = HashMap<K, V, BuildHasherDefault<Unhasher>>;
pub type UnhashSet<V> = HashSet<V, BuildHasherDefault<Unhasher>>;

/// This no-op hasher expects only a single `write_u64` call. It's intended for
/// map keys that already have hash-like quality, like `Fingerprint`.
#[derive(Default)]
pub struct Unhasher {
    value: u64,
}

impl Hasher for Unhasher {
    #[inline]
    fn finish(&self) -> u64 {
        self.value
    }

    fn write(&mut self, _bytes: &[u8]) {
        unimplemented!("use write_u64");
    }

    #[inline]
    fn write_u64(&mut self, value: u64) {
        debug_assert_eq!(0, self.value, "Unhasher doesn't mix values!");
        self.value = value;
    }
}
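A minimal usage sketch of the hasher in the diff, repeated here so the block compiles on its own. `FakeFingerprint` is a hypothetical stand-in for rustc's `Fingerprint`, and folding its two halves with `wrapping_add` before the single `write_u64` call is an assumption made for illustration, not necessarily what the PR does.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hash, Hasher};

// Same shape as the `Unhasher` in the diff, copied so this sketch stands alone.
#[derive(Default)]
pub struct Unhasher {
    value: u64,
}

impl Hasher for Unhasher {
    fn finish(&self) -> u64 {
        self.value
    }

    fn write(&mut self, _bytes: &[u8]) {
        unimplemented!("use write_u64");
    }

    fn write_u64(&mut self, value: u64) {
        debug_assert_eq!(0, self.value, "Unhasher doesn't mix values!");
        self.value = value;
    }
}

pub type UnhashMap<K, V> = HashMap<K, V, BuildHasherDefault<Unhasher>>;

// Hypothetical stand-in for `Fingerprint`: two already well-mixed 64-bit halves.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct FakeFingerprint(pub u64, pub u64);

impl Hash for FakeFingerprint {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Exactly one `write_u64` call, as `Unhasher` requires.
        // Folding the halves with `wrapping_add` is an assumption of this sketch.
        state.write_u64(self.0.wrapping_add(self.1));
    }
}

// Runs a key through `Unhasher` and returns the value the map would bucket on.
pub fn unhashed(fp: &FakeFingerprint) -> u64 {
    let mut hasher = Unhasher::default();
    fp.hash(&mut hasher);
    hasher.finish()
}

// Inserts and looks up one entry in an `UnhashMap`.
pub fn lookup_demo() -> Option<&'static str> {
    let key = FakeFingerprint(0x9e37_79b9_7f4a_7c15, 0x2545_f491_4f6c_dd1d);
    let mut map: UnhashMap<FakeFingerprint, &'static str> = UnhashMap::default();
    map.insert(key, "item");
    map.get(&key).copied()
}
```

Any key type whose `Hash` impl reaches `write` instead of `write_u64` (a `String` or a tuple, for instance) would hit the `unimplemented!` panic, which is why the map is only suitable for keys written for it.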
Did you try a panic! here instead to try and track down all the cases where we are hashing fingerprints?
I hadn't, but I tried just now and it ran into a `StableHasher` for the parent `DefPathHash`:
rust/compiler/rustc_hir/src/definitions.rs
Lines 111 to 117 in 130359c
I suppose we could have that hash without the parent at first, and then `Fingerprint::combine` them for the final value. I'll give that a shot and see if anything else comes up.
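The idea in the comment above (hash the local data without the parent, then combine the result with the parent's hash) can be sketched roughly as follows. `FakeFingerprint` and its `combine` are hypothetical stand-ins. The multiply-by-3-and-add mixing rule is an assumption chosen because it is cheap and order-dependent, and should not be read as the compiler's actual definition.

```rust
// Hypothetical stand-in for `Fingerprint`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct FakeFingerprint(pub u64, pub u64);

impl FakeFingerprint {
    // Assumed mixing rule: scale `self` by 3, then add `other`, with wrapping
    // arithmetic. Swapping the operands gives a different result, so the
    // parent/child order is preserved in the output.
    pub fn combine(self, other: FakeFingerprint) -> FakeFingerprint {
        FakeFingerprint(
            self.0.wrapping_mul(3).wrapping_add(other.0),
            self.1.wrapping_mul(3).wrapping_add(other.1),
        )
    }
}

// `local` is the hash of the child's own data, computed without the parent.
// The final value folds the parent's hash in afterwards, so the parent's
// fingerprint never has to be fed through a byte-oriented hasher again.
pub fn child_hash(parent: FakeFingerprint, local: FakeFingerprint) -> FakeFingerprint {
    parent.combine(local)
}
```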
The next one that comes up is `item_ids_hash.hash_stable(...)`:
rust/compiler/rustc_middle/src/ich/impls_hir.rs
Lines 52 to 65 in 130359c
I wonder if we could instead add a combine operation directly in the `StableHasher`?
Alternatively, we could specialize to allow `Fingerprint`s to be used with `StableHasher` too, and unimplement the rest.
Hm, supporting `combine` directly with StableHasher seems like a good idea -- we presumably want that (or similar) a lot.

I'm not sure if combine is "as good" as hashing though, from a "hash quality" perspective. I would sort of assume no because then we wouldn't get any wins from using it...
Here's another hit:
rust/compiler/rustc_query_system/src/dep_graph/graph.rs
Line 891 in 130359c
... where `DepNode<K>` is a `K` and a `Fingerprint`.

I'm inclined to let all of these hash normally for now. `Unhasher` is already a bit hacky itself...
If we wanted though, `DepNode<K>` could ignore its `K` for hashing purposes.
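A rough sketch of what ignoring `K` for hashing could look like. `FakeDepNode` and `FakeFingerprint` are hypothetical stand-ins, not the real compiler types, and the single `write_u64` of the folded halves is an assumption. Equality still compares the kind, so the `Hash`/`Eq` contract holds: equal nodes always hash equally, while nodes that differ only in kind merely share a bucket.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for `Fingerprint`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct FakeFingerprint(pub u64, pub u64);

// Hypothetical stand-in for `DepNode<K>`: a kind plus a fingerprint.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct FakeDepNode<K> {
    pub kind: K,
    pub hash: FakeFingerprint,
}

// Manual `Hash` impl that skips `kind` entirely and writes only the
// fingerprint, in one `write_u64` call. Note there is no `K: Hash` bound.
impl<K> Hash for FakeDepNode<K> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash.0.wrapping_add(self.hash.1));
    }
}

// Helper for the demonstration: hashes any value with the standard hasher.
pub fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

With this impl, two nodes with the same fingerprint but different kinds produce the same hash and are still distinguished by `==`, so a map keyed on them stays correct at the cost of a possible extra comparison.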
It does sound like there's sort of a lot of potential here but the current `Hash/Hasher` API doesn't readily allow specializing like we want to.

I was looking at the Hasher API, and it feels like it might be worth adding something like a `write_hash` or similar, where the Hasher can expect the incoming value to already be "hashed" or at least nicely distributed across the range. To start we could just take a u64 since that's what `Hasher::finish()` returns, though e.g. for Fingerprint we really want u128. Maybe `fn write_hash(impl Into<u128>)` makes sense, not sure.

I am leaning towards saying that we should just merge this PR as-is: it seems like a clear, if small, win, and while there may be more hidden through careful hash-skipping it's probably better to evaluate each in a standalone manner, particularly given the relative complexity of the Unhasher design. We can consider other improvements, like the one I suggested in the previous paragraph, later.
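For illustration only, the `write_hash` idea might look something like the extension trait below. Nothing here exists in `std`: `WriteHash`, `PassThrough`, and the helper functions are hypothetical names, the `u64` parameter follows the "to start" suggestion above, and the fallback byte-folding loop is an arbitrary FNV-style mix chosen just to keep `PassThrough` a valid general-purpose `Hasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical extension trait, not part of `std`.
pub trait WriteHash: Hasher {
    /// The caller promises `hash` is already well distributed.
    /// The default treats it like any other `u64` and mixes it normally.
    fn write_hash(&mut self, hash: u64) {
        self.write_u64(hash);
    }
}

// An ordinary hasher just takes the default behavior.
impl WriteHash for DefaultHasher {}

/// Hypothetical hasher that handles arbitrary bytes but can skip the
/// mixing step when told the input is already a hash.
#[derive(Default)]
pub struct PassThrough {
    value: u64,
}

impl Hasher for PassThrough {
    fn finish(&self) -> u64 {
        self.value
    }

    fn write(&mut self, bytes: &[u8]) {
        // Fallback for ordinary data: a simple FNV-style fold (an assumption
        // of this sketch, not a recommendation).
        for &byte in bytes {
            self.value = (self.value ^ u64::from(byte)).wrapping_mul(0x100_0000_01b3);
        }
    }
}

impl WriteHash for PassThrough {
    fn write_hash(&mut self, hash: u64) {
        // Pre-hashed input is stored as-is instead of being mixed again.
        self.value = hash;
    }
}

// Feeds a pre-hashed value through `PassThrough` and returns the result.
pub fn pass_through(hash: u64) -> u64 {
    let mut hasher = PassThrough::default();
    hasher.write_hash(hash);
    hasher.finish()
}

// Checks that the default `write_hash` is just `write_u64` for `DefaultHasher`.
pub fn default_matches_write_u64(hash: u64) -> bool {
    let mut a = DefaultHasher::new();
    a.write_hash(hash);
    let mut b = DefaultHasher::new();
    b.write_u64(hash);
    a.finish() == b.finish()
}
```

Unlike `Unhasher`, a hasher built this way never needs an `unimplemented!` path: keys that are not pre-hashed still go through `write`, and only callers that opt in through `write_hash` skip the mixing.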
FWIW, I was just wanting something like this for span's hash implementation, which currently re-hashes a hash of the file name.