Problem
GitHub recently enabled support for fetching individual commits by commit hash (uploadpack.allowReachableSHA1InWant on the server side).
$ GIT_TRACE_PACKET=1 git fetch | head -1
17:49:50.750968 pkt-line.c:80 packet: fetch< 458d3459cfb0f923fbe968e6868a4893af34ba69 HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed allow-tip-sha1-in-want allow-reachable-sha1-in-want symref=HEAD:refs/heads/master filter object-format=sha1 agent=git/github-g6409641ef0c2
Notice the presence of allow-reachable-sha1-in-want in the advertised protocol capabilities.
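As a rough illustration of what checking for this capability could look like on the client side, here is a minimal Rust sketch using only the standard library. It assumes the first advertisement payload has the form `<oid> <refname>\0<capabilities>` with a real NUL byte as the separator (the trace above prints it as `\0`). The function names `advertised_capabilities` and `allows_reachable_sha1_in_want` are hypothetical and are not part of Cargo, git2 or libgit2.

```rust
/// Extracts the capability list from the first ref advertisement payload.
/// Assumed payload shape: `<oid> <refname>\0<cap> <cap> ...`.
/// Returns an empty list when no NUL separator is present.
fn advertised_capabilities(payload: &str) -> Vec<&str> {
    match payload.split_once('\0') {
        Some((_, caps)) => caps.split_whitespace().collect(),
        None => Vec::new(),
    }
}

/// Reports whether the server advertised `allow-reachable-sha1-in-want`.
/// The comparison is exact, so `allow-tip-sha1-in-want` does not match.
fn allows_reachable_sha1_in_want(payload: &str) -> bool {
    advertised_capabilities(payload)
        .iter()
        .any(|cap| *cap == "allow-reachable-sha1-in-want")
}
```

Calling `allows_reachable_sha1_in_want` on the payload from the trace above would return `true`, because the capability appears verbatim after the NUL separator.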
Example of using "reachable sha1 in want" on the CLI:
$ mkdir cargo
$ cd cargo
$ git fetch --depth=1 https://github.com/rust-lang/cargo 88117505b8b691e0e7892630a71a85bb5e9945de
remote: Enumerating objects: 782, done.
remote: Counting objects: 100% (782/782), done.
remote: Compressing objects: 100% (685/685), done.
remote: Total 782 (delta 136), reused 316 (delta 65), pack-reused 0
Receiving objects: 100% (782/782), 2.08 MiB | 3.34 MiB/s, done.
Resolving deltas: 100% (136/136), done.
From https://github.com/rust-lang/cargo
* branch 88117505b8b691e0e7892630a71a85bb5e9945de -> FETCH_HEAD
$ git log -1 88117505b8b691e0e7892630a71a85bb5e9945de
commit 88117505b8b691e0e7892630a71a85bb5e9945de (grafted)
Author: Eric Huss <[email protected]>
Date: Fri Oct 22 07:53:17 2021 -0700
Bump to 0.59.0
Notice in the above log that only 2.08 MiB total were downloaded. This is significantly less than the 50+ MiB of the whole Cargo repo. For larger repos the difference can be even more significant.
Unfortunately, Cargo does not currently make use of "reachable sha1 in want". Instead, when you specify a git dependency like cargo = { git = "https://github.com/rust-lang/cargo", rev = "88117505b8b691e0e7892630a71a85bb5e9945de" }, Cargo fetches all branches and tags along with their entire history, hoping that the requested commit id is somewhere among that potentially enormous pile of commits:
cargo/src/cargo/sources/git/utils.rs
Lines 809 to 814 in 458d345
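To make the difference concrete, the following is a hedged Rust sketch of how a fetcher might choose refspecs for a rev = "..." dependency. It is not Cargo's actual code: the helper names `is_full_sha1` and `refspecs_for_rev`, the destination ref `refs/commit/<rev>`, and the fallback refspecs are all assumptions made for illustration.

```rust
/// Returns true when `rev` looks like a full 40-hex-digit SHA-1 commit id.
fn is_full_sha1(rev: &str) -> bool {
    rev.len() == 40 && rev.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Chooses the refspecs to fetch for a `rev = "..."` dependency.
///
/// If the server allows fetching by commit id and `rev` is a full hash,
/// only that single commit is requested. Otherwise the sketch falls back
/// to fetching every branch and tag (hypothetical refspecs).
fn refspecs_for_rev(rev: &str, server_allows_sha1_in_want: bool) -> Vec<String> {
    if server_allows_sha1_in_want && is_full_sha1(rev) {
        // Hypothetical destination ref; any local ref name would do.
        vec![format!("+{0}:refs/commit/{0}", rev)]
    } else {
        vec![
            "+refs/heads/*:refs/remotes/origin/*".to_string(),
            "+refs/tags/*:refs/remotes/origin/tags/*".to_string(),
        ]
    }
}
```

With the Cargo commit used earlier, `refspecs_for_rev("88117505b8b691e0e7892630a71a85bb5e9945de", true)` yields a single refspec naming only that commit, while an abbreviated hash or a server without the capability falls back to the two wildcard refspecs.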
Proposed Solution
I've opened a PR containing the Cargo side of the implementation:
However, libgit2, which is the C library wrapped by Cargo's git2 dependency, does not yet support "reachable sha1 in want" as far as I can tell. The git CLI is not based on libgit2 and does support it, which is why the git fetch above is able to use it.
Someone will need to send a PR to libgit2 implementing "reachable sha1 in want", then pull the changes into https://github.com/rust-lang/git2-rs, and finally land the Cargo change to use it.
Notes
As a side benefit, this change will make rev = "..." dependencies support revs that are not in the history of any upstream branch or tag. But the performance and disk usage improvement will be the more noticeable benefit to most Cargo users.