Skip to content

Use hardened Sha1 implementations (collision detection) #585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks
Byron opened this issue Nov 7, 2022 · 7 comments · Fixed by #1915
Closed
3 tasks

Use hardened Sha1 implementations (collision detection) #585

Byron opened this issue Nov 7, 2022 · 7 comments · Fixed by #1915
Labels
acknowledged an issue is accepted as shortcoming to be fixed

Comments

@Byron
Copy link
Member

Byron commented Nov 7, 2022

At least add a feature toggle to support this crate:

@pascalkuthe
Copy link
Contributor

pascalkuthe commented Nov 7, 2022

Some additional details:

  • git uses hardened sha1 by default so there might be a chance that gitoxide messes up an object graph by generating different hashes then git does (needs investigation)
  • this case is extremely rare as the probability of differing in honest usage of less than 2 90
  • sha1collisiondetection is a straight port from C to rust using c2rust without much modifications on top. So it is insanely unsafe (nearly 100% unsafe code). Its likely also slower then the current sha1 algorithm (implemented in assembly).
  • a paper describing the algrithm can be found here.
  • GitHub an other hosting provides likely place a high priority on this (for example https://github.blog/2017-03-20-sha-1-collision-detection-on-github-com/)

I think implementing this in the sha-1 crate would be the best option but also extremely challenging (its partly written in assembly). Perhaps adding this to the sha1_soml crate to the rust version of the sha crate (so with the asm flag disabled) first would be easier (and adding some kind of safety feature). That library only contains two unsafe blocks so it might be preferable from a security point of view and also easier to contribute to (it also does not contain any assembly).

@Byron
Copy link
Member Author

Byron commented Nov 7, 2022

I was wondering what should happen once a collision is detected, as it's critical to do the same thing that git does in that case.

Options are…

  1. fail and abort the operation that caused the collision, effectively preventing the object to be added
  2. generate a different, safer hash instead

It looks like git is taking approach number 2, and creates a different hash instead. By the looks of it, they can control various flags related to collision detection at runtime, but initialize these to be as safe as possible (detect collision, do unavoidable attack condition checks, and generate a safe hash with additional rounds).

For our implementation to be valid, we'd have to be sure that our safe hashes are indeed the same as the ones that git produces. Unfortunately I wasn't able to find a test there.

libgit2 has a couple of tests fortunately, and the implementation git borrowed its SHA1 detection code from has some tests as well.

With that, it feels like it's really up to gitoxide on how to implement it, with the bare minimum being the detection of collision so errors can be produced. Unfortunately there is no configuration value to determine if safe hashes should be used or not, so gitoxide can turn it on by default like git seems to, with ways to turn it off as well (to produce errors instead).

@Byron Byron added the acknowledged an issue is accepted as shortcoming to be fixed label Nov 19, 2022
@dignifiedquire
Copy link

Fyi I have a draft of this here open now: RustCrypto/hashes#566 would be great to get feedback if this will work for gitoxide, and if not what needs to change

@Byron
Copy link
Member Author

Byron commented Feb 23, 2024

Thanks for the update, thanks so much for this amazing and exciting work! From what I can tell, gitoxide can configure collision detection merely by turning on the new feature toggle, which then enables it for sha1::Sha1. This will definitely work.

Something I wonder is if I will would be able to choose at runtime which implementation to use - I can imagine the new one to be quite a bit slower by merit of not being in ASM, so maybe I would want to have a git-configuration option for it rather than a compile-time-only option.

@dignifiedquire
Copy link

Something I wonder is if I will would be able to choose at runtime which implementation to use - I can imagine the new one to be quite a bit slower by merit of not being in ASM, so maybe I would want to have a git-configuration option for it rather than a compile-time-only option.

I have updated the PR to allow for both options to be compiled in, and making the switch much more explicit

@dignifiedquire
Copy link

The build options for git, and its various uses of sha1 implementations can be found here: https://github.com/git/git/blob/master/Makefile#L480-L534

TLDR: use a fast optimized implementation, unless specific build flags are specified. So my understanding is that they effectively do not use the collision detection code in any of the default builds, which is quite surprising to me. Not sure where else to look for confirmation on that.

In regards to how to setup a sha1 implementation that is "good enough" for gitoxide, probably the following makes sense

Combine the following three crates,

  • the new sha1-checked
  • the existing [email protected]
  • asm fallbacks using sha1-asm until they have been migrated to inline assembly into sha1

This should give roughly matching speeds with git asfaiu, assuming the same configurations.

Longer term there is likely an opportunity to optimize the sha1-checked using some assembly code (though as I wrote in the initial PR, likely not using the intrinsics).

@Byron
Copy link
Member Author

Byron commented Mar 5, 2024

Thanks a lot for the analysis, this makes sense to me and should be good enough.

And I hope one day gitoxide can take more responsibility for this dependency given its importance, to have something like gix-sha1 that can do it all just like Git (and with similar performance).

emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slow‐down is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Mar 31, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
emilazy added a commit to emilazy/gitoxide that referenced this issue Apr 2, 2025
Fix [GHSA-2frx-2596-x5r6].

[GHSA-2frx-2596-x5r6]: GHSA-2frx-2596-x5r6

This uses the `sha1-checked` crate from the RustCrypto project. It’s
a pure Rust implementation, with no SIMD or assembly code.

The hashing implementation moves to `gix-hash`, as it no longer
depends on any feature configuration. I wasn’t sure the ideal
crate to put this in, but after checking reverse dependencies on
crates.io, it seems like there’s essentially no user of `gix-hash`
that wouldn’t be pulling in a hashing implementation anyway, so I
think this is a fine and logical place for it to be.

A fallible API seems better than killing the process as Git does,
since we’re in a library context and it would be bad if you could
perform denial‐of‐service attacks on a server by sending it hash
collisions. (Although there are probably cheaper ways to mount a
denial‐of‐service attack.)

The new API also returns an `ObjectId` rather than `[u8; 20]`; the
vast majority of `Hasher::digest()` users immediately convert the
result to `ObjectId`, so this will help eliminate a lot of cruft
across the tree. `ObjectId` also has nicer `Debug` and `Display`
instances than `[u8; 20]`, and should theoretically make supporting
the hash function transition easier, although I suspect further API
changes will be required for that anyway. I wasn’t sure whether
this would be a good change, as not every digest identifies an
entry in the Git object database, but even many of the existing
uses for non‐object digests across the tree used the `ObjectId`
API anyway. Perhaps it would be best to have a separate non‐alias
`Digest` type that `ObjectId` wraps, but this seems like the pragmatic
choice for now that sticks with current practice.

The old API remains in this commit, as well as a temporary
non‐fallible but `ObjectId`‐returning `Hasher::finalize()`,
pending the migration of all in‐tree callers.

I named the module `gix_hash::hasher` since `gix_hash::hash` seemed
like it would be confusing. This does mean that there is a function
and module with the same name, which is permitted but perhaps a
little strange.

Everything is re‐exported directly other than
`gix_features::hash::Write`, which moves along with the I/O
convenience functions into a new public submodule and becomes
`gix_hash::hasher::io::Write`, as that seems like a clearer name
to me, being akin to the `gix_hash::hasher` function but as an
`std::io::Write` wrapper.

Raw hashing is somewhere around 0.25× to 0.65× the speed of the
previous implementation, depending on the feature configuration
and whether the CPU supports hardware‐accelerated hashing. (The
more portable assembly in `sha1-asm` that doesn’t require the SHA
instruction set doesn’t seem to speed things up that much; in fact,
`sha1_smol` somehow regularly beats the assembly code used by `sha1`
on my i9‐9880H MacBook Pro! Presumably this is why that path was
removed in newer versions of the `sha1` crate.)

Performance on an end‐to‐end `gix no-repo pack verify` benchmark
using pack files from the Linux kernel Git server measures around
0.41× to 0.44× compared to the base commit on an M2 Max and a
Ryzen 7 5800X, both of which have hardware instructions for SHA‐1
acceleration that the previous implementation uses but this one does
not. On the i9‐9880H, it’s around 0.58× to 0.60× the speed;
the slowdown is reduced by the older hardware’s lack of SHA‐1
instructions.

The `sha1collisiondetection` crate from the Sequoia PGP project,
based on a modified C2Rust translation of the library used by Git,
was also considered; although its raw hashing performance seems
to measure around 1.12–1.15× the speed of `sha1-checked` on
x86, it’s indistinguishable from noise on the end‐to‐end
benchmark, and on an M2 Max `sha1-checked` is consistently
around 1.03× the speed of `sha1collisiondetection` on that
benchmark. The `sha1collisiondetection` crate has also had a
soundness issue in the past due to the automatic C translation,
whereas `sha1-checked` has only one trivial `unsafe` block. On the
other hand, `sha1collisiondetection` is used by both Sequoia itself
and the `gitoid` crate, whereas rPGP is the only major user of
`sha1-checked`. I don’t think there’s a clear winner here.

The performance regression is very unfortunate, but the [SHAttered]
attack demonstrated a collision back in 2017, and the 2020 [SHA‐1 is
a Shambles] attack demonstrated a practical chosen‐prefix collision
that broke the use of SHA‐1 in OpenPGP, costing $75k to perform,
with an estimate of $45k to replicate at the time of publication and
$11k for a classical collision.

[SHAttered]: https://shattered.io/
[SHA‐1 is a Shambles]: https://sha-mbles.github.io/

Given the increase in GPU performance and production since then,
that puts the Git object format squarely at risk. Git mitigated this
attack in 2017; the algorithm is fairly general and detects all the
existing public collisions. My understanding is that an entirely new
cryptanalytic approach would be required to develop a collision attack
for SHA‐1 that would not be detected with very high probability.

I believe that the speed penalty could be mitigated, although not
fully eliminated, by implementing a version of the hardened SHA‐1
function that makes use of SIMD. For instance, the assembly code used
by `openssl speed sha1` on my i9‐9880H measures around 830 MiB/s,
compared to the winning 580 MiB/s of `sha1_smol`; adding collision
detection support to that would surely incur a performance penalty,
but it is likely that it could be much more competitive with
the performance before this commit than the 310 MiB/s I get with
`sha1-checked`. I haven’t been able to find any existing work on
this; it seems that more or less everyone just uses the original
C library that Git does, presumably because nothing except Git and
OpenPGP is still relying on SHA‐1 anyway…

The performance will never compete with the >2 GiB/s that can
be achieved with the x86 SHA instruction set extension, as the
`SHA1RNDS4` instruction sadly runs four rounds at a time while the
collision detection algorithm requires checks after every round,
but I believe SIMD would still offer a significant improvement,
and the AArch64 extension seems like it may be more flexible.

I know that these days the Git codebase has an additional faster
unsafe API without these checks that it tries to carefully use only
for operations that do not depend on hashing results for correctness
or safety. I personally believe that’s not a terribly good idea,
as it seems easy to misuse in a case where correctness actually does
matter, but maybe that’s just my Rust safety bias talking. I think
it would be better to focus on improving the performance of the safer
algorithm, as I think that many of the operations where the performance
penalty is the most painful are dealing with untrusted input anyway.

The `Hasher` struct gets a lot bigger; I don’t know if this is
an issue or not, but if it is, it could potentially be boxed.

Closes: GitoxideLabs#585
@Byron Byron closed this as completed in f253f02 Apr 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged an issue is accepted as shortcoming to be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants