Skip to content

Support for custom hasher for std groupping #783

@a14e

Description

@a14e

Overview

I want to suggest an improvement for the library -- the ability to use custom hashes (specifically interested in ahash).

Why?

The standard library uses a sip hash, which is protected against various types of attacks on hash tables, but it is known to be quite slow. Alternative hashes can be 10+ times faster (for example, there is a benchmark example in ahash)

What difficulties might arise?

It's unclear how to implement this

  1. Option 1. Keep the current implementation, but add a third argument with a default value, for example, the methods would look like this:
    #[cfg(feature = "use_std")]
        fn into_group_map<K, V, R = RandomState>(self) -> HashMap<K, Vec<V>, R>
        where
            Self: Iterator<Item = (K, V)> + Sized,
            K: Hash + Eq,
            R: BuildHasher + Default
        {
            group_map::into_group_map(self)
        }
    but the current Rust doesn’t know how to do this and writes something like
    error: defaults for type parameters are only allowed in `struct`, `enum`, `type`, or `trait` definitions
        --> src/lib.rs:3076:29
         |
    3076 |     fn into_group_map<K, V, R = RandomState>(self) -> HashMap<K, Vec<V>, R>
         |                             ^^^^^^^^^^^^^^^
         |
         = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
         = note: for more information, see issue #36887 <https://github.com/rust-lang/rust/issues/36887>
         = note: `#[deny(invalid_type_param_default)]` on by default
  2. Option 2, we can try replacing all imports with a flag from std::collections::HashMap to ahash::HashMap, but this breaks backward compatibility

Maybe there are some other options?

Activity

a14e

a14e commented on Oct 16, 2023

@a14e
Author

This issue seems to be a duplicate of #322.

a14e

a14e commented on Oct 16, 2023

@a14e
Author

Other possible options:

  1. Generalize the current implementation and create duplicates of methods with custom hashers (for example, HashMap has a constructor new and with_hasher).
  2. Create some kind of middleware, and call, for instance, vec![1, 2, 3].iter().grouping::<ahash::HashMap>, and implement all grouping options on top of this middleware. What do you think?
Philippe-Cholet

Philippe-Cholet commented on Oct 20, 2023

@Philippe-Cholet
Member

As a maintainer or not, I certainly would like to customize the hasher too.
Related to #462 which contains comments written by phimuemue that I agree with and that could interest you.

Maybe we should create a label for this kind of issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @a14e@Philippe-Cholet

        Issue actions

          Support for custom hasher for std groupping · Issue #783 · rust-itertools/itertools