crypto/sha: implement SHA1 & SHA256 acceleration using Intel SHA extensions #48720

Closed
wants to merge 4 commits into from

Conversation

dbussink commented Oct 1, 2021

This change implements SHA1 & SHA256 acceleration using the Intel SHA
extensions when the CPU supports those instructions. As a first step, the
internal/cpu package gains support for detecting the CPU feature flag for the
SHA extensions.
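
For reference, the SHA extensions are reported in CPUID leaf 7 (subleaf 0), EBX bit 29. The sketch below only shows the bit test. The function name and the idea of passing the register value in as an argument are assumptions for illustration, not the code in this change; real detection also has to confirm that leaf 7 is supported and needs assembly to execute CPUID.

```go
package main

import "fmt"

// shaBit is the position of the SHA extensions flag in EBX after
// executing CPUID with EAX=7 and ECX=0.
const shaBit = 29

// hasSHAExtensions is a hypothetical helper: it takes the EBX value
// returned by CPUID leaf 7 and reports whether the SHA flag is set.
func hasSHAExtensions(ebx7 uint32) bool {
	return ebx7&(1<<shaBit) != 0
}

func main() {
	// Made-up register values, used only to exercise the bit test.
	fmt.Println(hasSHAExtensions(1 << 29)) // flag set
	fmt.Println(hasSHAExtensions(0))       // flag clear
}
```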

The crypto/sha256 package contains a small refactor so that it matches how
crypto/sha1 & crypto/sha512 are set up. This makes it easier to decide, in a
consistent way, which implementation to use based on CPU support.
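
As a rough illustration of that kind of selection, here is a minimal sketch, assuming a feature struct and placeholder block functions. All names here (features, impl, pickImpl, the block functions) are hypothetical and the block functions do not compute a hash; this is not the structure used in the change itself.

```go
package main

import "fmt"

// features is a hypothetical stand-in for flags exposed by a CPU
// detection package; the field names are assumptions for this sketch.
type features struct {
	HasSHA  bool
	HasAVX2 bool
}

// impl pairs a name with a block function.
type impl struct {
	name  string
	block func(state *[8]uint32, p []byte)
}

// Placeholder block functions; they intentionally do nothing.
func blockGeneric(state *[8]uint32, p []byte) {}
func blockAVX2(state *[8]uint32, p []byte)    {}
func blockSHA(state *[8]uint32, p []byte)     {}

// pickImpl returns the most specialized implementation that the given
// feature set allows, falling back to the generic one.
func pickImpl(f features) impl {
	switch {
	case f.HasSHA:
		return impl{name: "sha-ext", block: blockSHA}
	case f.HasAVX2:
		return impl{name: "avx2", block: blockAVX2}
	default:
		return impl{name: "generic", block: blockGeneric}
	}
}

func main() {
	fmt.Println(pickImpl(features{HasSHA: true, HasAVX2: true}).name)
	fmt.Println(pickImpl(features{HasAVX2: true}).name)
	fmt.Println(pickImpl(features{}).name)
}
```

Doing the selection in one place keeps the decision identical across packages, which is the consistency the refactor is after.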

All the work here is based on the Intel reference documentation at
https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html.
Using only the reference documentation avoids issues like the one in #27443,
where code from another project was converted to Go assembly.

Benchmarks show close to a 2x performance improvement for SHA1 on an AMD
Ryzen 5 3600:

name                       old time/op  new time/op  delta
Sha/SHA1______16_bytes-12   172ns ± 3%   105ns ± 4%  -39.06%  (p=0.000 n=20+21)
Sha/SHA1______64_bytes-12   261ns ± 2%   139ns ± 3%  -46.85%  (p=0.000 n=21+21)
Sha/SHA1_____256_bytes-12   492ns ± 2%   229ns ± 2%  -53.41%  (p=0.000 n=20+20)
Sha/SHA1______1k_bytes-12  1.17µs ± 1%  0.59µs ± 1%  -49.43%  (p=0.000 n=20+19)
Sha/SHA1______8k_bytes-12  7.48µs ± 1%  3.94µs ± 1%  -47.35%  (p=0.000 n=20+21)
Sha/SHA1____256k_bytes-12   232µs ± 2%   122µs ± 1%  -47.25%  (p=0.000 n=21+20)
Sha/SHA1___1024k_bytes-12   928µs ± 2%   491µs ± 1%  -47.12%  (p=0.000 n=21+21)
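
A rough way to collect per-size timings like these outside the test suite is sketched below. It calls testing.Benchmark from a plain main package; the helper name benchSHA1 and the chosen sizes are assumptions, it is not the benchmark that produced the table above, and its output is not in benchstat format.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"testing"
)

// benchSHA1 is a hypothetical helper that times sha1.Sum on a zeroed
// buffer of the given size using the standard benchmark machinery.
func benchSHA1(size int) testing.BenchmarkResult {
	buf := make([]byte, size)
	return testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size)) // lets the result report throughput
		for i := 0; i < b.N; i++ {
			sha1.Sum(buf)
		}
	})
}

func main() {
	for _, size := range []int{16, 1024, 8192} {
		r := benchSHA1(size)
		fmt.Printf("SHA1 %5d bytes: %d ns/op\n", size, r.NsPerOp())
	}
}
```

Running this once on a build with the change and once without it gives the old and new columns to compare.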

For SHA256, benchmarks show close to a 4x performance improvement on an AMD
Ryzen 5 3600, especially on larger inputs. Even on the smallest input it is at
least 2x faster:

name                       old time/op  new time/op  delta
Sha/SHA256____16_bytes-12   248ns ± 3%   117ns ± 3%  -52.84%  (p=0.000 n=20+19)
Sha/SHA256____64_bytes-12   384ns ± 2%   153ns ± 3%  -60.10%  (p=0.000 n=20+17)
Sha/SHA256___256_bytes-12   786ns ± 1%   249ns ± 3%  -68.29%  (p=0.000 n=19+19)
Sha/SHA256____1k_bytes-12  2.36µs ± 1%  0.64µs ± 3%  -72.93%  (p=0.000 n=19+20)
Sha/SHA256____8k_bytes-12  17.0µs ± 2%   4.2µs ± 1%  -75.16%  (p=0.000 n=20+20)
Sha/SHA256__256k_bytes-12   537µs ± 1%   131µs ± 1%  -75.60%  (p=0.000 n=20+20)
Sha/SHA256_1024k_bytes-12  2.15ms ± 1%  0.52ms ± 1%  -75.60%  (p=0.000 n=20+20)
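
To relate the delta column to the "2x" and "4x" figures: a time delta of d percent corresponds to a speedup of 1 / (1 + d/100). The small sketch below applies that to the 1024k rows; the function name speedup is an assumption for illustration. It works out to roughly 1.9x for SHA1 and 4.1x for SHA256.

```go
package main

import "fmt"

// speedup converts a benchstat-style time delta in percent (negative
// means faster) into the ratio old time / new time.
func speedup(deltaPercent float64) float64 {
	return 1 / (1 + deltaPercent/100)
}

func main() {
	// Deltas taken from the 1024k rows of the two tables.
	fmt.Printf("SHA1   1024k: %.2fx\n", speedup(-47.12))
	fmt.Printf("SHA256 1024k: %.2fx\n", speedup(-75.60))
}
```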

The discussion in #27443 mentions that including this for SHA1 was
debatable, since the algorithm itself is no longer considered safe. I think
a 2x performance improvement is still significant, though, and SHA1 is still
used in a lot of places (for example, by Git itself). Of course, the SHA1
change can be backed out if, because of that, this change is only desired
for SHA256.

Fixes #27443

google-cla bot commented Oct 1, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



google-cla bot added the cla: no label Oct 1, 2021

dbussink commented Oct 1, 2021

@googlebot I signed it!

google-cla bot added cla: yes and removed cla: no labels Oct 1, 2021

gopherbot commented

This PR (HEAD: ff02c85) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/353402 to see it.

Tip: You can toggle comments from me using the comments slash command (e.g. /comments off)
See the Wiki page for more info

gopherbot commented

Message from Go Bot:

Patch Set 1:

Congratulations on opening your first change. Thank you for your contribution!

Next steps:
A maintainer will review your change and provide feedback. See
https://golang.org/doc/contribute.html#review for more info and tips to get your
patch through code review.

Most changes in the Go project go through a few rounds of revision. This can be
surprising to people new to the project. The careful, iterative review process
is our way of helping mentor contributors and ensuring that their contributions
have a lasting impact.

During May-July and Nov-Jan the Go project is in a code freeze, during which
little code gets reviewed or merged. If a reviewer responds with a comment like
R=go1.11 or adds a tag like "wait-release", it means that this CL will be
reviewed as part of the next development cycle. See https://golang.org/s/release
for more details.


Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
After addressing review feedback, remember to publish your drafts!

gopherbot commented

Message from Martin Möhrmann:

Patch Set 1:

(3 comments)


gopherbot commented

This PR (HEAD: 8a216b4) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/353402 to see it.

gopherbot commented

Message from Dirkjan Bussink:

Patch Set 2:

(2 comments)


gopherbot commented

Message from Martin Möhrmann:

Patch Set 2:

(1 comment)


This adds the SHA extension flag so it can be used to detect if SHA
extensions are available to speed up SHA1 and SHA256 computations.

This adds a new optimized version of SHA1 computation when the Intel SHA
extensions are available.

Based on the reference documentation at
https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html

Benchmarks show a close to 2x performance improvement on an AMD Ryzen 5
3600.

name                       old time/op  new time/op  delta
Sha/SHA1______16_bytes-12   172ns ± 3%   105ns ± 4%  -39.06%  (p=0.000 n=20+21)
Sha/SHA1______64_bytes-12   261ns ± 2%   139ns ± 3%  -46.85%  (p=0.000 n=21+21)
Sha/SHA1_____256_bytes-12   492ns ± 2%   229ns ± 2%  -53.41%  (p=0.000 n=20+20)
Sha/SHA1______1k_bytes-12  1.17µs ± 1%  0.59µs ± 1%  -49.43%  (p=0.000 n=20+19)
Sha/SHA1______8k_bytes-12  7.48µs ± 1%  3.94µs ± 1%  -47.35%  (p=0.000 n=20+21)
Sha/SHA1____256k_bytes-12   232µs ± 2%   122µs ± 1%  -47.25%  (p=0.000 n=21+20)
Sha/SHA1___1024k_bytes-12   928µs ± 2%   491µs ± 1%  -47.12%  (p=0.000 n=21+21)

This change makes the setup of the SHA256 methods consistent with how it
is done for SHA1 and SHA512. This makes it easier to also add the SHA
extension implementation in a follow-up and makes the code easier to
follow for others.

This adds a new optimized version of SHA256 computation when the Intel SHA
extensions are available.

Based on the reference documentation at
https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html

Benchmarks show a close to 4x performance improvement on an AMD Ryzen 5
3600, especially on larger inputs. Even on the smallest it's at least 2x
faster.

name                       old time/op  new time/op  delta
Sha/SHA256____16_bytes-12   248ns ± 3%   117ns ± 3%  -52.84%  (p=0.000 n=20+19)
Sha/SHA256____64_bytes-12   384ns ± 2%   153ns ± 3%  -60.10%  (p=0.000 n=20+17)
Sha/SHA256___256_bytes-12   786ns ± 1%   249ns ± 3%  -68.29%  (p=0.000 n=19+19)
Sha/SHA256____1k_bytes-12  2.36µs ± 1%  0.64µs ± 3%  -72.93%  (p=0.000 n=19+20)
Sha/SHA256____8k_bytes-12  17.0µs ± 2%   4.2µs ± 1%  -75.16%  (p=0.000 n=20+20)
Sha/SHA256__256k_bytes-12   537µs ± 1%   131µs ± 1%  -75.60%  (p=0.000 n=20+20)
Sha/SHA256_1024k_bytes-12  2.15ms ± 1%  0.52ms ± 1%  -75.60%  (p=0.000 n=20+20)

gopherbot commented

This PR (HEAD: 3512fd6) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/go/+/353402 to see it.

gopherbot commented

Message from Dirkjan Bussink:

Patch Set 3:

(1 comment)


gopherbot commented

Message from Dirkjan Bussink:

Patch Set 3:

(1 comment)


andig commented Nov 6, 2021

From what I understand, not all comments have been resolved?

dbussink commented Nov 6, 2021

> From what I understand, not all comments have been resolved?

I have addressed & resolved all comments made in Gerrit so far, so not sure what other comments there would be?

andig commented Nov 7, 2021

Sorry, I think Gerrit tricked me. Might need attention from @josharian?

gopherbot commented

Message from Ben Schwartz:

Patch Set 3:

(1 comment)


gopherbot commented

Message from Dirkjan Bussink:

Patch Set 3:

(1 comment)


dbussink commented Apr 1, 2023

Closing as an implementation for SHA256 was submitted by Intel.

dbussink closed this Apr 1, 2023
dbussink deleted the sha-acceleration branch April 1, 2023 06:05
Labels
cla: yes
Development

Successfully merging this pull request may close these issues.

crypto/sha1: add native SHA1 instruction implementation for AMD64
3 participants