crypto/sha: implement SHA1 & SHA256 acceleration using Intel SHA extensions #48720
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with "@googlebot I signed it!".
What to do if you already signed the CLA
Individual signers
Corporate signers
ℹ️ Googlers: Go here for more info.
@googlebot I signed it!
This PR (HEAD: ff02c85) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/353402 to see it.
Message from Go Bot: Patch Set 1: Congratulations on opening your first change. Thank you for your contribution! Next steps: Most changes in the Go project go through a few rounds of revision. During May-July and Nov-Jan the Go project is in a code freeze. Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
Message from Martin Möhrmann: Patch Set 1: (3 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
ff02c85 to 8a216b4
This PR (HEAD: 8a216b4) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/353402 to see it.
Message from Dirkjan Bussink: Patch Set 2: (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
Message from Martin Möhrmann: Patch Set 2: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
This adds the SHA extension flag so it can be used to detect if SHA extensions are available to speed up SHA1 and SHA256 computations.
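As a rough illustration of what such a flag check involves: Intel documents the SHA extensions as bit 29 of EBX in CPUID leaf 7, sub-leaf 0. The sketch below only decodes that bit. The names `cpuidSHA` and `hasSHA` are hypothetical and not taken from this change, and the register value is passed in as a parameter because executing CPUID itself requires assembly.

```go
package main

import "fmt"

// cpuidSHA is bit 29 of the EBX value returned by CPUID leaf 7, sub-leaf 0,
// which Intel documents as the SHA extensions feature bit.
// The constant name is an assumption made for this sketch.
const cpuidSHA = 1 << 29

// hasSHA reports whether the SHA bit is set in ebx7, the EBX value
// returned by CPUID(EAX=7, ECX=0). This hypothetical helper takes the
// register value as an argument instead of executing CPUID.
func hasSHA(ebx7 uint32) bool {
	return ebx7&cpuidSHA != 0
}

func main() {
	fmt.Println(hasSHA(1 << 29)) // SHA bit set
	fmt.Println(hasSHA(1 << 5))  // only the AVX2 bit set
}
```

In real code the flag would be computed once at startup and stored in a package-level variable, so that hashing code only has to read a boolean.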
This adds a new optimized version of SHA1 computation when the SHA Intel extensions are available. Based on the reference documentation at https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html
Benchmarks show a close to 2x performance improvement on an AMD Ryzen 5 3600.
name old time/op new time/op delta
Sha/SHA1______16_bytes-12 172ns ± 3% 105ns ± 4% -39.06% (p=0.000 n=20+21)
Sha/SHA1______64_bytes-12 261ns ± 2% 139ns ± 3% -46.85% (p=0.000 n=21+21)
Sha/SHA1_____256_bytes-12 492ns ± 2% 229ns ± 2% -53.41% (p=0.000 n=20+20)
Sha/SHA1______1k_bytes-12 1.17µs ± 1% 0.59µs ± 1% -49.43% (p=0.000 n=20+19)
Sha/SHA1______8k_bytes-12 7.48µs ± 1% 3.94µs ± 1% -47.35% (p=0.000 n=20+21)
Sha/SHA1____256k_bytes-12 232µs ± 2% 122µs ± 1% -47.25% (p=0.000 n=21+20)
Sha/SHA1___1024k_bytes-12 928µs ± 2% 491µs ± 1% -47.12% (p=0.000 n=21+21)
This change makes the setup of the SHA256 methods consistent with how it is done for SHA1 and SHA512. This makes it easier to add the SHA extension implementation in a follow-up and makes the code easier for others to follow.
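A minimal sketch of the kind of setup described here, assuming a single `block` entry point that picks an implementation from a feature flag. All names (`useSHA`, `blockGeneric`, `blockSHA`) are hypothetical and this is not the code from the change. The two implementations just return a label so the selection is visible.

```go
package main

import "fmt"

// useSHA is a hypothetical stand-in for a CPU feature flag that would be
// set once at startup from CPU detection.
var useSHA = false

// blockGeneric stands in for the portable implementation.
func blockGeneric(p []byte) string { return "generic" }

// blockSHA stands in for an implementation using the SHA extensions.
func blockSHA(p []byte) string { return "sha-extensions" }

// block is the single entry point the rest of the package would call.
// Keeping the selection in one place is what makes the setup consistent.
func block(p []byte) string {
	if useSHA {
		return blockSHA(p)
	}
	return blockGeneric(p)
}

func main() {
	data := make([]byte, 64)
	fmt.Println(block(data))
	useSHA = true
	fmt.Println(block(data))
}
```

With this shape, adding another accelerated path later means adding one more branch in `block` rather than touching every caller.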
This adds a new optimized version of SHA256 computation when the SHA Intel extensions are available. Based on the reference documentation at https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html
Benchmarks show a close to 4x performance improvement on an AMD Ryzen 5 3600, especially on larger inputs. Even on the smallest it's at least 2x faster.
name old time/op new time/op delta
Sha/SHA256____16_bytes-12 248ns ± 3% 117ns ± 3% -52.84% (p=0.000 n=20+19)
Sha/SHA256____64_bytes-12 384ns ± 2% 153ns ± 3% -60.10% (p=0.000 n=20+17)
Sha/SHA256___256_bytes-12 786ns ± 1% 249ns ± 3% -68.29% (p=0.000 n=19+19)
Sha/SHA256____1k_bytes-12 2.36µs ± 1% 0.64µs ± 3% -72.93% (p=0.000 n=19+20)
Sha/SHA256____8k_bytes-12 17.0µs ± 2% 4.2µs ± 1% -75.16% (p=0.000 n=20+20)
Sha/SHA256__256k_bytes-12 537µs ± 1% 131µs ± 1% -75.60% (p=0.000 n=20+20)
Sha/SHA256_1024k_bytes-12 2.15ms ± 1% 0.52ms ± 1% -75.60% (p=0.000 n=20+20)
8a216b4 to 3512fd6
This PR (HEAD: 3512fd6) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/353402 to see it.
Message from Dirkjan Bussink: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
Message from Dirkjan Bussink: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
From what I understand, not all comments have been resolved?
I have addressed & resolved all comments made in Gerrit so far, so not sure what other comments there would be?
Sorry, I think Gerrit tricked me. Might need attention from @josharian?
Message from Ben Schwartz: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
Message from Dirkjan Bussink: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/353402.
Closing as an implementation for SHA256 was submitted by Intel.
This change implements SHA1 & SHA256 acceleration using the Intel SHA
extensions when the CPU supports those instructions. First, the internal/cpu
package gains support for detecting the CPU feature flag for the SHA
extensions. The crypto/sha256 package also gets a small refactor so that its
setup matches crypto/sha1 & crypto/sha512, which keeps the code that decides
which implementation to use based on CPU support consistent across the
packages.
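Whichever implementation ends up being selected, the digests must not change. One way to check that from the outside is to compare the public API against well-known test vectors. The sketch below does this for the message "abc" using the standard SHA1 and SHA256 digests for that input. The helper names are hypothetical, and this is not the test code from the change.

```go
package main

import (
	"crypto/sha1"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Well-known digests of the ASCII message "abc".
const (
	wantSHA1   = "a9993e364706816aba3e25717850c26c9cd0d89d"
	wantSHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
)

// sha1Hex returns the hex-encoded SHA1 digest of msg.
func sha1Hex(msg string) string {
	sum := sha1.Sum([]byte(msg))
	return hex.EncodeToString(sum[:])
}

// sha256Hex returns the hex-encoded SHA256 digest of msg.
func sha256Hex(msg string) string {
	sum := sha256.Sum256([]byte(msg))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Both comparisons should hold on any CPU, with or without
	// an accelerated code path.
	fmt.Println("SHA1 matches:", sha1Hex("abc") == wantSHA1)
	fmt.Println("SHA256 matches:", sha256Hex("abc") == wantSHA256)
}
```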
All the work here is based on the Intel reference documentation at
https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html.
Using only the reference documentation avoids issues like the one in #27443,
where code from another project was converted to Go assembly.
Benchmarks show a close to 2x performance improvement on an AMD Ryzen 5
3600 for SHA1:
name old time/op new time/op delta
Sha/SHA1______16_bytes-12 172ns ± 3% 105ns ± 4% -39.06% (p=0.000 n=20+21)
Sha/SHA1______64_bytes-12 261ns ± 2% 139ns ± 3% -46.85% (p=0.000 n=21+21)
Sha/SHA1_____256_bytes-12 492ns ± 2% 229ns ± 2% -53.41% (p=0.000 n=20+20)
Sha/SHA1______1k_bytes-12 1.17µs ± 1% 0.59µs ± 1% -49.43% (p=0.000 n=20+19)
Sha/SHA1______8k_bytes-12 7.48µs ± 1% 3.94µs ± 1% -47.35% (p=0.000 n=20+21)
Sha/SHA1____256k_bytes-12 232µs ± 2% 122µs ± 1% -47.25% (p=0.000 n=21+20)
Sha/SHA1___1024k_bytes-12 928µs ± 2% 491µs ± 1% -47.12% (p=0.000 n=21+21)
Benchmarks for SHA256 show a close to 4x performance improvement on the same
AMD Ryzen 5 3600, especially on larger inputs. Even on the smallest input
it's at least 2x faster.
name old time/op new time/op delta
Sha/SHA256____16_bytes-12 248ns ± 3% 117ns ± 3% -52.84% (p=0.000 n=20+19)
Sha/SHA256____64_bytes-12 384ns ± 2% 153ns ± 3% -60.10% (p=0.000 n=20+17)
Sha/SHA256___256_bytes-12 786ns ± 1% 249ns ± 3% -68.29% (p=0.000 n=19+19)
Sha/SHA256____1k_bytes-12 2.36µs ± 1% 0.64µs ± 3% -72.93% (p=0.000 n=19+20)
Sha/SHA256____8k_bytes-12 17.0µs ± 2% 4.2µs ± 1% -75.16% (p=0.000 n=20+20)
Sha/SHA256__256k_bytes-12 537µs ± 1% 131µs ± 1% -75.60% (p=0.000 n=20+20)
Sha/SHA256_1024k_bytes-12 2.15ms ± 1% 0.52ms ± 1% -75.60% (p=0.000 n=20+20)
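To relate the delta column to the "2x" and "4x" figures, the speedup is the old time divided by the new time, or equivalently 1 / (1 + delta) with delta as a negative fraction. A small sketch of that arithmetic, using the 1024k rows from the tables above (function names are hypothetical):

```go
package main

import "fmt"

// speedup returns how many times faster the new measurement is.
// Both arguments must use the same time unit.
func speedup(oldTime, newTime float64) float64 {
	return oldTime / newTime
}

// speedupFromDelta converts a delta in percent, such as -47.12,
// into a speedup factor.
func speedupFromDelta(deltaPercent float64) float64 {
	return 1 / (1 + deltaPercent/100)
}

func main() {
	// SHA1, 1024k bytes: 928µs -> 491µs.
	fmt.Printf("SHA1:   %.2fx\n", speedup(928, 491))
	// SHA256, 1024k bytes: 2.15ms -> 0.52ms, expressed in µs.
	fmt.Printf("SHA256: %.2fx\n", speedup(2150, 520))
	// The same factors derived from the reported deltas.
	fmt.Printf("SHA1 from delta:   %.2fx\n", speedupFromDelta(-47.12))
	fmt.Printf("SHA256 from delta: %.2fx\n", speedupFromDelta(-75.60))
}
```

The two methods differ slightly for SHA256 because the table rounds the times to two or three significant digits.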
The discussion in #27443 mentions that including this for SHA1 was
debatable, since the algorithm itself is no longer considered safe. I think
a 2x performance improvement is still significant, though, and SHA1 is
still used in a lot of places (for example in Git itself).
Of course, the SHA1 change can be backed out if this change is only desired
for SHA256 because of that.
Fixes #27443