Look into using a better hash function #48
I would humbly submit aHash as a possible alternative: https://github.com/tkaitchuck/aHash
@tkaitchuck aHash looks great, except for the part where it is not always supported by the CPU. Unfortunately the hash function is called often enough that doing dynamic dispatch based on the CPU features would cost too much performance.
It ought to be possible to do statically with no runtime overhead. For example, it is possible to define something like:
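The snippet the commenter posted was not preserved in this thread. A minimal sketch of the idea, as I understand it (all names and the placeholder mixing below are my assumptions, not aHash's real code), selects one hasher body at compile time with `#[cfg]`, so callers only ever see a single statically dispatched type:

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Compiled only when the target guarantees AES support; a real crate
// would use AES rounds in `write` here.
#[cfg(target_feature = "aes")]
#[derive(Default)]
struct SelectedHasher(u64);

// Fallback for every other target; compiled instead, never alongside.
#[cfg(not(target_feature = "aes"))]
#[derive(Default)]
struct SelectedHasher(u64);

impl Hasher for SelectedHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Placeholder FNV-style mixing, just so the sketch runs.
        for &b in bytes {
            self.0 = (self.0 ^ b as u64).wrapping_mul(0x100_0000_01b3);
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

// Callers name exactly one type; which body exists was decided at
// compile time, so there is no dynamic dispatch.
type DefaultBuild = BuildHasherDefault<SelectedHasher>;

fn main() {
    let mut m: HashMap<&str, i32, DefaultBuild> = HashMap::default();
    m.insert("key", 1);
    assert_eq!(m.get("key"), Some(&1));
}
```

Because both `#[cfg]` arms define the same type name, downstream code is identical on every target; only the selected implementation differs.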
Then to the calling code there is always one implementation of the trait, statically dispatched. Which one gets compiled just depends on which architecture you are targeting. If that sounds interesting I could take a stab at implementing a fallback inside of the aHash repo.
I've added a fallback algorithm to aHash. The details are in the Readme. It was inspired by FxHash; it's also keyed and should be considerably harder to attack than MurmurHash2 was (or Fx, which is trivial). I'm still not totally happy with its security or performance, but it's still better than anything else I'm aware of at that level of performance. Take a look. I'll continue to tweak it.
@Amanieu Do you have any thoughts? The aHash fallback algorithm's speed now beats FxHash at any string longer than 4 bytes. It is also within a nanosecond for all primitives. (Which I think is as good as can be done. FxHash really plays fast and loose when hashing primitives. I can easily imagine a HashMap<u64, _> where the key is a 'size' or 'location' in bytes, which would turn the map into a list if the values happened to be a whole number of KB.) If AES-NI is available, aHash will beat FxHash's speed at everything except primitives (only off by a third of a ns) and strings of exactly 4 bytes. The benchmarks included in the README are against the generic code path, so as you can see the dispatch is done completely at compile time and it adds no overhead. PS: Do you know how the snippet you have at the top performs on non-x86? I considered such an approach in the fallback, knowing it would get translated to a 32/64-bit multiply on Intel, but I was afraid that on other architectures it might turn into a 128-bit multiply, which could really suck. -- UPDATE: It appears ARM and WebAssembly don't have such an instruction.
@tkaitchuck One of the major use cases for hash tables is with integer keys. Can you provide some benchmarks when hashing simple integer keys?
Ah, never mind, I see the results above.
(Updated)
There is a bit of a performance hit because all those runs use i32 keys, which are just not as fast in aHash as in FxHash. A few things seem notable:
I could improve the performance of the fallback hasher on primitive types like i32 if it could be guaranteed that the hasher is only ever passed a single value. The only way I can think to make such a thing work is to have the user use a macro to instantiate the map, where the macro looks at the key and creates one hasher builder for known primitives and a different one if it is something else. This strikes me as too invasive, as it requires changing users' code.
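As a rough illustration of the macro idea being rejected here (every name below is hypothetical, and both builders reuse std's `DefaultHasher` purely so the sketch runs; the real idea would pick two different algorithms), a declarative macro can match the key type token and select a hasher builder at the call site:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Hypothetical builders: one tuned for single-word primitive keys,
// one general-purpose. Both are DefaultHasher here only for the sketch.
type PrimitiveBuild = BuildHasherDefault<DefaultHasher>;
type GeneralBuild = BuildHasherDefault<DefaultHasher>;

// Literal type tokens are tried first; anything else falls through to
// the general-purpose builder.
macro_rules! typed_map {
    (u32, $v:ty) => {
        HashMap::<u32, $v, PrimitiveBuild>::default()
    };
    (u64, $v:ty) => {
        HashMap::<u64, $v, PrimitiveBuild>::default()
    };
    ($k:ty, $v:ty) => {
        HashMap::<$k, $v, GeneralBuild>::default()
    };
}

fn main() {
    let mut ints = typed_map!(u32, i32); // picks PrimitiveBuild
    ints.insert(7u32, 99);
    assert_eq!(ints.get(&7), Some(&99));

    let mut strs = typed_map!(String, i32); // picks GeneralBuild
    strs.insert("a".to_string(), 1);
    assert_eq!(strs.get("a"), Some(&1));
}
```

This works, but it shows exactly why the comment calls the approach too invasive: every `HashMap::new()` in user code would need to become a `typed_map!` invocation.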
@Amanieu I figured out a way to speed up primitives (without changing the actual algorithm!) and released 0.1.13 (benchmarks are in the README), and I updated my comment above to reflect the impact on hashbrown using it.
@Amanieu what is the next step? Should I create a pull request? Do you want to see some string benchmarks? A comparison of a number of different hashes? |
Sorry for the late response, I've been busy with some things lately. I am hesitant to use aHash directly as the default hasher because it is slower than the current hasher. However if it is as secure as SipHash it might be a good idea to use it as a replacement for the default hasher in libstd. |
Update: I was able to improve the performance of the fallback algorithm a bit. I've also submitted a PR for the additional benchmarks I've been comparing with. Here is what I observe locally (FxHash on the left, aHash on the right):
With this new version, the performance gap is a lot narrower, and with strings > 4 characters aHash is faster. Also, FxHash really falls apart on the find_existing_high_bits test. I realize that isn't "normal" data, but as a user I would never expect that level of performance drop (vs find_existing). @Amanieu Is there a good way to actually get a representative sample of what users actually USE as their keys? That would allow us to make a benchmark that is a lot less abstract.
56: Add additional benchmarks. r=Amanieu a=tkaitchuck This covers performance of three cases I wanted to study when looking into https://github.com/Amanieu/hashbrown/issues/48 They are: `grow_by_insertion_kb`, which is similar to grow_by_insertion, but instead of every entry differing by 1, they differ by 1024. This makes an important performance difference to the hasher. `find_existing_high_bits`, which is similar to find_existing but uses 64-bit keys instead of 32-bit keys, where the lower 32 bits are zeros. This is a pathologically bad case for FxHash. `insert_8_char_string`, which tests a case where the key is a string. (As opposed to all the existing tests, which operate on u32 values.) This is important to cover because strings as keys are very common. Co-authored-by: Tom Kaitchuck <[email protected]>
@tkaitchuck Your benchmarks (in the
@Zoxc AHash's readme refers to the public crate. The above was using the version in this repo.
(Still not as good as aHash with or without fallback for strings, but much better overall)
62: Remove incorrect debug_assert r=Amanieu a=Amanieu Fixes #60 Co-authored-by: Amanieu d'Antras <[email protected]>
After thinking about this a bit and looking at alternatives, I think that AHash would be great as the default hasher for hashbrown. However there are a few issues that currently prevent me from doing so:
ping @tkaitchuck |
@alkis there is an open issue to do that here: tkaitchuck/aHash#6 |
@Amanieu aHash now builds on stable and is no-std. |
Replace FxHash with AHash as the default hasher Fixes #48 cc @tkaitchuck
Cyan4973/xxHash#155
I haven't compared it in a benchmark with aHash, but looking at the code, it's very clearly designed for throughput, not latency. It might even beat aHash when dealing with 1 MB of data (though I would not be too confident), but that's rarely the case with a hashmap in memory. A hashmap needs to be able to get a value in just a couple of CPU cycles when dealing with a u64 or a 5-byte string for a key. That case is far more common.
Are you talking about meow or XXH3? If the latter, I find this surprising because the blog post I linked above specifically discusses latency for small inputs as a specific design goal.
No, I wasn't. I see that. XXH3 now adds a fast path for short inputs, and it looks eerily familiar. It uses many of the same tricks I do in aHash. It says a lot about the constraints of the problem that independent solutions can end up so similar.
We currently use FxHash as the default hash function, but this function handles aligned values poorly: when hashing an integer, if the low X bits of the input value are 0 then the low X bits of the hash value will also be 0. One option would be to copy Google's CityHash (used by SwissTable). This uses a long multiply and XORs the top and bottom words of the result together:
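The original snippet is not preserved in this copy of the thread. A sketch of the described mix (the constant is my choice, a common golden-ratio multiplier, not necessarily the one from the issue): widen to 128 bits, multiply, then XOR the high word back into the low word so that high input bits influence low hash bits, fixing the aligned-key pathology described above.

```rust
/// Multiply-and-fold mix: a 64x64 -> 128-bit multiply, then XOR the
/// top and bottom 64-bit words of the product together.
fn mix(x: u64) -> u64 {
    // Any large odd constant works; this is a golden-ratio constant.
    const K: u64 = 0x9E37_79B9_7F4A_7C15;
    let wide = (x as u128) * (K as u128); // cannot overflow u128
    (wide as u64) ^ ((wide >> 64) as u64)
}

fn main() {
    // An aligned key like 1 << 32 would map its low 32 hash bits to 0
    // under a plain truncating multiply; the fold repairs that.
    assert_ne!(mix(1 << 32) & 0xFFFF_FFFF, 0);
    println!("{:#x}", mix(1 << 32));
}
```

This is also why the 128-bit multiply question in the thread matters: on x86-64 this compiles to one widening `mul`, while targets without a 64x64 -> 128 instruction must synthesize it from several multiplies.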
cc rust-lang/rust#58249