-
Notifications
You must be signed in to change notification settings - Fork 13.7k
Description
(via rust-lang/hashbrown#218 (comment))
It seems ahash gets different values for a rotated (non-contiguous) VecDeque
:
use ahash::AHasher;
use std::collections::VecDeque;
use std::hash::{Hash, Hasher};
fn hash(x: &impl Hash) -> u64 {
let mut hasher = AHasher::default();
Hash::hash(x, &mut hasher);
hasher.finish()
}
fn main() {
let deque: VecDeque<i32> = (0..10).collect();
let mut deque2 = deque.clone();
deque2.rotate_left(5);
deque2.rotate_left(5);
assert_eq!(deque, deque2);
assert_eq!(hash(&deque), hash(&deque2)); // fails!
}
Hash for VecDeque
hashes the length and then each part into Hash::hash_slice
:
rust/library/alloc/src/collections/vec_deque/mod.rs
Lines 2650 to 2657 in 0fe1dc6
impl<A: Hash> Hash for VecDeque<A> { | |
fn hash<H: Hasher>(&self, state: &mut H) { | |
self.len().hash(state); | |
let (a, b) = self.as_slices(); | |
Hash::hash_slice(a, state); | |
Hash::hash_slice(b, state); | |
} | |
} |
Hash::hash_slice
for integers casts to &[u8]
and calls Hasher::write
:
rust/library/core/src/hash/mod.rs
Lines 555 to 563 in 0fe1dc6
fn hash_slice<H: Hasher>(data: &[$ty], state: &mut H) { | |
let newlen = data.len() * mem::size_of::<$ty>(); | |
let ptr = data.as_ptr() as *const u8; | |
// SAFETY: `ptr` is valid and aligned, as this macro is only used | |
// for numeric primitives which have no padding. The new slice only | |
// spans across `data` and is never mutated, and its total size is the | |
// same as the original `data` so it can't be over `isize::MAX`. | |
state.write(unsafe { slice::from_raw_parts(ptr, newlen) }) | |
} |
And finally, AHasher::write
feeds the length of that slice in first, so it is sensitive to how the deque is split up!
https://github.com/tkaitchuck/aHash/blob/b175b79659da44a85a01d27cb1f9e697bcaff89a/src/fallback_hash.rs#L159-L163
https://github.com/tkaitchuck/aHash/blob/b175b79659da44a85a01d27cb1f9e697bcaff89a/src/aes_hash.rs#L147-L150
I think the Hash
and Hasher
behavior is underspecified here. I would assume that Hash::hash_slice
is supposed to have the same effect as just hashing each item in the slice separately, without the length, as the default implementation does. But what about Hasher::write
-- is write(&[1, 2, 3, 4])
supposed to match write(&[1, 2]); write(&[3, 4]);
?
- If both
Hash::hash_slice
andHasher::write
may consider length, thenHash for VecDeque
is incorrect. - If
Hash::hash_slice
should hash without length, butHasher::write
is allowed to use length, then the integer optimization is incorrect. - If both
Hash::hash_slice
andHasher::write
are supposed to be independent of length, hashing identically whether it's one big slice or any partitioned slices of the same data, thenAHasher::write
is incorrect.