
Improve hash code of Names #5474

Merged 1 commit on May 8, 2019
20 changes: 9 additions & 11 deletions src/reflect/scala/reflect/internal/Names.scala
@@ -47,17 +47,15 @@ trait Names extends api.Names {
   /** Hashtable for finding type names quickly. */
   private val typeHashtable = new Array[TypeName](HASH_SIZE)

-  /**
-   * The hashcode of a name depends on the first, the last and the middle character,
-   * and the length of the name.
-   */
-  private def hashValue(cs: Array[Char], offset: Int, len: Int): Int =
-    if (len > 0)
-      (len * (41 * 41 * 41) +
-       cs(offset) * (41 * 41) +
-       cs(offset + len - 1) * 41 +
-       cs(offset + (len >> 1)))
-    else 0
+  private def hashValue(cs: Array[Char], offset: Int, len: Int): Int = {
Contributor

@DarkDimius DarkDimius Oct 21, 2016

@retronym do you have a guarantee that strings don't repeat in your name table? (Dotty does.)
If so, you won't need to consider character values at all, since the default hash code (the system identity hash code) would work correctly.

Member

@DarkDimius how do you guarantee that strings don't repeat? Let's say you first create a name "abc", then a name "bc", will the name table only contain "abc"?

Also, to me, dotty's Names.scala looks quite similar to scala's. There's a hashValue method that looks the same as the one being replaced in this PR. Can you point out differences you have in mind?

Contributor

@DarkDimius DarkDimius Oct 21, 2016

@lrytz,

> @DarkDimius how do you guarantee that strings don't repeat? Let's say you first create a name "abc", then a name "bc", will the name table only contain "abc"?

It would contain both, but the next time you try to create "abc" you'll get the same instance back. See https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Names.scala#L245

> Also, to me, dotty's Names.scala looks quite similar to scala's. There's a hashValue method that looks the same as the one being replaced in this PR. Can you point out differences you have in mind?

hashValue is only used when creating new term names. It's not used when comparing hash codes.
The hash code itself is defined at https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Names.scala#L178
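
A minimal sketch of the interning idea discussed in this comment, for illustration only. It is not the Dotty or scalac implementation: `InternSketch`, `Name`, `HashSize`, and `intern` are hypothetical names, and the bucket layout is an assumption. The content hash is used only to find a bucket when a name is created; because each string maps to a single instance, the names themselves can rely on reference equality and the identity hash code.

```scala
object InternSketch {
  // Hypothetical name type. It does not override equals/hashCode, so it uses
  // reference equality and the identity hash code.
  final class Name(val text: String) {
    override def toString: String = text
  }

  // Assumed table size for this sketch; must be a power of two for the mask.
  private val HashSize = 0x8000
  private val HashMask = HashSize - 1
  private val table = Array.fill(HashSize)(List.empty[Name])

  // Content hash over all characters, used only to pick a bucket.
  private def hashValue(cs: Array[Char], offset: Int, len: Int): Int = {
    var h = 0
    var i = 0
    while (i < len) {
      h = 31 * h + cs(i + offset)
      i += 1
    }
    h
  }

  // Returns the unique Name instance for `s`, creating it on first use.
  def intern(s: String): Name = {
    val cs = s.toCharArray
    val bucket = hashValue(cs, 0, cs.length) & HashMask
    table(bucket).find(_.text == s) match {
      case Some(existing) => existing
      case None =>
        val created = new Name(s)
        table(bucket) = created :: table(bucket)
        created
    }
  }
}
```

With this sketch, `InternSketch.intern("abc")` and `InternSketch.intern("bc")` yield two distinct instances, while a second call to `InternSketch.intern("abc")` returns the first instance again.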

Member

This looks exactly the same to me as in Scala (https://github.com/scala/scala/blob/2.12.x/src/reflect/scala/reflect/internal/Names.scala#L233), so I still don't understand how Dotty differs.

Contributor

Oh, I misunderstood the intention of this PR. I thought it changed the hashcode.
Dotty has the same issue.

+    var h = 0
+    var i = 0
+    while (i < len) {
+      h = 31 * h + cs(i + offset)
+      i += 1
+    }
+    h
+  }

/** Is (the ASCII representation of) name at given index equal to
* cs[offset..offset+len-1]?
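
To illustrate what the diff changes, here is a small standalone sketch (not part of the PR) that puts the removed and the added `hashValue` bodies side by side. The object name `HashCompare` and the sample names are hypothetical; the samples were chosen to share length, first, middle, and last character, which is the only input the old formula looks at. The new loop uses the same recurrence as `java.lang.String.hashCode`, so every character contributes.

```scala
object HashCompare {
  // Old scheme (removed lines): length, first, last and middle character only.
  def oldHash(cs: Array[Char], offset: Int, len: Int): Int =
    if (len > 0)
      (len * (41 * 41 * 41) +
        cs(offset) * (41 * 41) +
        cs(offset + len - 1) * 41 +
        cs(offset + (len >> 1)))
    else 0

  // New scheme (added lines): every character feeds into the result.
  def newHash(cs: Array[Char], offset: Int, len: Int): Int = {
    var h = 0
    var i = 0
    while (i < len) {
      h = 31 * h + cs(i + offset)
      i += 1
    }
    h
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical names that differ only at index 2.
    val names = Seq("x$1$foo", "x$2$foo", "x$3$foo").map(_.toCharArray)
    val oldHashes = names.map(cs => oldHash(cs, 0, cs.length))
    val newHashes = names.map(cs => newHash(cs, 0, cs.length))
    println(oldHashes.distinct.size == 1) // true: all three collide under the old scheme
    println(newHashes.distinct.size == 3) // true: all three differ under the new scheme
  }
}
```

Under the old formula, names that differ only in positions other than the first, middle, and last collide, however many of them there are; the new loop separates the three samples here.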