-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Closed
Labels
A-gitArea: anything dealing with gitArea: anything dealing with gitC-bugCategory: bugCategory: bugCommand-updateS-acceptedStatus: Issue or feature is accepted, and has a team member available to help mentor or reviewStatus: Issue or feature is accepted, and has a team member available to help mentor or review
Description
Problem
Updating a Github dependency with a short commit id fails with such log:
error: Unable to update https://github.com/rust-lang/regex#837fd85
Caused by:
object not found - no match for id (837fd85000000000000000000000000000000000); class=Odb (9); code=NotFound (-3)
Steps
cargo new test-cargo-precise
cd test-cargo-precise
cargo add --git https://github.com/rust-lang/regex regex
CARGO_LOG=trace cargo update -p regex --precise 837fd85
Resulting cargo update
log:
0.000038417s TRACE cargo: start="2023-12-21T07:57:53.908041000Z"
0.000538542s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_ALIAS_UPDATE", parts: [("alias", 5), ("update", 11)] }
0.000570375s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_TERM", parts: [("term", 5)] }
0.000589250s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_NET", parts: [("net", 5)] }
0.001122417s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_HTTP", parts: [("http", 5)] }
0.001523083s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_BUILD", parts: [("build", 5)] }
0.001558208s TRACE cargo::util::toml: read_manifest; path=/Users/julienenoch/tmp/test-cargo-precise/Cargo.toml; source-id=/Users/julienenoch/tmp/test-cargo-precise
0.001857042s DEBUG cargo::core::workspace: find_members - only me as a member
0.002212917s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_PATCH", parts: [("patch", 5)] }
0.002268583s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_REGISTRY_INDEX", parts: [("registry", 5), ("index", 14)] }
0.002286292s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_REGISTRIES_CRATES_IO_PROTOCOL", parts: [("registries", 5), ("crates-io", 16), ("protocol", 26)] }
0.002298250s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_REGISTRIES_CRATES_IO_PROTOCOL", parts: [("registries", 5), ("crates-io", 16), ("protocol", 26)] }
0.002305875s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_REGISTRY_INDEX", parts: [("registry", 5), ("index", 14)] }
0.002316875s TRACE cargo::util::config: get cv ConfigKey { env: "CARGO_SOURCE", parts: [("source", 5)] }
0.002328917s DEBUG cargo::core::registry: load/missing https://github.com/rust-lang/regex#837fd85
0.002347417s DEBUG cargo::core::registry: loading source https://github.com/rust-lang/regex#837fd85
0.002353417s DEBUG cargo::sources::config: loading: https://github.com/rust-lang/regex#837fd85
0.002360000s TRACE cargo::core::source_id: loading SourceId; https://github.com/rust-lang/regex#837fd85
Updating git repository `https://github.com/rust-lang/regex`
0.009446625s TRACE cargo::sources::git::source: updating git source `GitRemote { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/rust-lang/regex", query: None, fragment: None } }`
0.010665208s DEBUG cargo::sources::git::utils: attempting GitHub fast path for https://github.com/api/repos/rust-lang/regex/commits/837fd85000000000000000000000000000000000
0.225184458s DEBUG cargo::sources::git::utils: skipping gc as there's only 2 pack files
0.225689625s DEBUG cargo::sources::git::utils: doing a fetch for https://github.com/rust-lang/regex
0.226630292s DEBUG cargo::sources::git::utils: initiating fetch of ["+refs/heads/*:refs/remotes/origin/*", "+HEAD:refs/remotes/origin/HEAD"] from https://github.com/rust-lang/regex
0.716648583s DEBUG cargo::sources::git::utils: attempting GitHub fast path for https://github.com/api/repos/rust-lang/regex/commits/837fd85000000000000000000000000000000000
0.739785417s DEBUG cargo::sources::git::utils: skipping gc as there's only 0 pack files
0.739872500s DEBUG cargo::sources::git::utils: doing a fetch for https://github.com/rust-lang/regex
0.740022625s DEBUG cargo::sources::git::utils: initiating fetch of ["+refs/heads/*:refs/remotes/origin/*", "+HEAD:refs/remotes/origin/HEAD"] from https://github.com/rust-lang/regex
2.332392833s DEBUG cargo: exit_with_error; err=CliError { error: Some(Unable to update https://github.com/rust-lang/regex#837fd85
Caused by:
object not found - no match for id (837fd85000000000000000000000000000000000); class=Odb (9); code=NotFound (-3)), exit_code: 101 }
2.332410958s DEBUG cargo: display_error; err=Unable to update https://github.com/rust-lang/regex#837fd85
Caused by:
object not found - no match for id (837fd85000000000000000000000000000000000); class=Odb (9); code=NotFound (-3)
error: Unable to update https://github.com/rust-lang/regex#837fd85
Caused by:
object not found - no match for id (837fd85000000000000000000000000000000000); class=Odb (9); code=NotFound (-3)
Possible Solution(s)
No response
Notes
This Github API URL used by cargo
returns an error:
https://github.com/api/repos/rust-lang/regex/commits/837fd85000000000000000000000000000000000
While the same but with short commit id succeeds:
https://github.com/api/repos/rust-lang/regex/commits/837fd85
What's strange is that I successfully used such cargo update -p <dep> --precise <short_id>
few weeks ago.
Could it be that Github changed its API ?
Version
cargo 1.74.1 (ecb9851af 2023-10-18)
release: 1.74.1
commit-hash: ecb9851afd3095e988daaa35a48bc7f3cb748e04
commit-date: 2023-10-18
host: aarch64-apple-darwin
libgit2: 1.7.1 (sys:0.18.0 vendored)
libcurl: 8.1.2 (sys:0.4.68+curl-8.4.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1u 30 May 2023
os: Mac OS 14.1.2 [64-bit]
--
Also same issue with:
cargo 1.71.0 (cfd3bbd8f 2023-06-08)
release: 1.71.0
commit-hash: cfd3bbd8fe4fd92074dfad04b7eb9a923646839f
commit-date: 2023-06-08
host: aarch64-apple-darwin
libgit2: 1.6.4 (sys:0.17.1 vendored)
libcurl: 8.1.2 (sys:0.4.61+curl-8.0.1 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1t 7 Feb 2023
os: Mac OS 14.1.2 [64-bit]
Metadata
Metadata
Assignees
Labels
A-gitArea: anything dealing with gitArea: anything dealing with gitC-bugCategory: bugCategory: bugCommand-updateS-acceptedStatus: Issue or feature is accepted, and has a team member available to help mentor or reviewStatus: Issue or feature is accepted, and has a team member available to help mentor or review
Type
Projects
Milestone
Relationships
Development
Select code repository
Activity
weihanglo commentedon Dec 21, 2023
Got an impression that @Eh2406 just talked about this somewhere. @Eh2406 do you remember anything?
JEnoch commentedon Dec 22, 2023
Digging into the code, I see this completion with
0
s of short commit id comes from usage ofgit2::Oid::from_str(s)
inSourceId::precise_git_oid()
.It somehow ends up to be used in
github_fast_path()
, probably after conversion fromgit2::Oid
toGitReference::Rev(String)
. Possibly this one ?Could the regression come from #12849 ?
But that wouldn't explain why the issue also occurs with cargo 1.71.0 that was release before this PR.
weihanglo commentedon Dec 22, 2023
Good finding! Thank you.
I've tested several toolchains. Starting from 1.46.0 Cargo contains this bug, which contains the PR #8363 that you observed the potential bad commit. The Cargo started using
Oid::from_str
to parseprecise
field since then:cargo/src/cargo/sources/git/source.rs
Lines 35 to 37 in ddc2799
The zero-padding behavior is a documented behavior in libgit2 like from the beginning, so not a dependency issue.
It might or might not be GitHub has changed the API. I cannot tell, as Cargo had no such a test exercising
--precise <SHA>
until recently. And that doesn't even test against short ID.Feel like
SourceId::from_url
has an assumption thatprecise
for Git source is always a valid long-enough Git SHA (the format is opaque to cargo users btw). One fix I can think of is comparing if the value of--precise
is a hex string of 40/64 characters here, and give a nice error if not. This is only a fix on the surface, if we find more commands have this kind of bug, we might want to push the fix down to lower-level.linyihai commentedon Dec 25, 2023
I will provide a PR to fix it based on this
This is worth a look, I will check the relevant code by the way, once there is any feasible plan I will make a corresponding plan
@rustbot claim
JEnoch commentedon Dec 27, 2023
If possible I would like to still have the possibility to use
--precise
with short SHA.As a use case in Zenoh: there are some plugins (shared libraries) for which the dependency to the
zenoh
crate must be the exact same commit than the hosting binary (to avoid ABI incompatibility). The short commit id can be retrieved via--version
that return the result ofgit describe
.Using this short commit id in
--precise
would avoid the user to search for the long commit id.I think a solution could be the following:
github_fast_path()
could trim the extra0
s from theGitReference::Rev(rev)
before queryinghttps://github.com/api/repos/{username}/{repository}/commits/{rev}
.Thus the query will succeed with short SHA. The long SHA is returned in
FastPathRev::NeedsFetch(oid_to_fetch)
.fetch()
operation could return thisoid_to_fetch
returned bygithub_fast_path()
as anOption<Oid>
checkout()
, iffetch()
returnsSome(rev)
thisrev
is the long SHA and should be tested if contained indb
I tested such a patch here, and it seems to work.
What do you think?
weihanglo commentedon Dec 27, 2023
Thanks for looking into the issue! The effort is much appreciated :)
However, there are probably two things we need to take care of:
locked_rev
is not desired, as the new rev might not be the same as the old one, which we lose the meaning of "locked". Git short hashes can also be generated deliberately, and it doesn't seem too safe to trim and query over network.That said, it doesn't mean we cannot support short hashes. We need to find a reliable way to pass down the value from
--precise
, and get it resolved to a full SHA. If the solution is not yet ready, for now we can give a nice message of "short hash not supported".JEnoch commentedon Dec 28, 2023
Understood, that makes sense.
I think the root of the issue is that the conversion from short SHA to
git2::Oid
leads to a 20 bytes SHA completed with0
s that cannot be found in thegit2::Repository
via call torevparse_single()
.I made some tests with git2:
revparse_single()
accepts short SHA and returns the full SHA. It detects short SHA collisions, returning such this error message: "ambiguous OID prefix - found multiple offsets for pack entry". So I think it's safe to use it to resolve a short SHA into a full SHA before storing it aslocked_rev
.Actually,
GitDatabase::resolve(&self, r: &GitReference)
uses it and will return agit2::Oid
containing the full SHA.This call to
GitDatabase::resolve()
for full SHA retrieval could be done inGitSource::new()
as a replacement tosource_id.precise_git_oid()
call. But this requires to create theGitDatabase
here, while it's only created inGitSource::block_until_ready()
so far. Not sure if it's a problem.An alternative is to retrieve the full SHA in
GitSource::block_until_ready()
, just afterGitDatabase
creation. No need to replaceself.locked_rev
, but just to use the retrieved full SHA instead in the followingmatch (self.locked_rev, db)
weihanglo commentedon Jan 4, 2024
Yeah. Totally agree on the root cause.
There seems to be an “implicit assumption” that a git
SourceId
has aprecise
field of a full SHA.cargo update
is an exception that makes use of--precise
to have a transientprecise
value presenting a arbitrary git revision. It would eventually become a full SHA when the revision is resolved.The proposed solutions aren't wrong, though I would like to see
locked_rev
to be a type that represent a "revision", so we could have either full SHA (oid), or an arbitrary git revision. That would also make the behavior aligned to the doc:BTW, I accidentally have a working patch of this during code exploration. I may post a PR later but feel free to propose more better solutions :)
--precise
accept arbitrary git revisions #13250linyihai commentedon Jan 5, 2024
Sorry, I haven't started this work yet. It's nice to see a better solution being proposed. 👍
weihanglo commentedon Jan 5, 2024
Oops. I didn't realize this is claimed. Sorry for that, @linyihai 😔
Actually I was trying to provide guidance but ended up writing a fix
linyihai commentedon Jan 5, 2024
The content to be changed is already very different from what I have been exposed to before, So feel free to continue it. Your PR is also a good guidance for me.
JEnoch commentedon Jan 5, 2024
You PR is more advanced than my proposal. I agree that having a
Revision
enum that can be aGitReference
to be resolved when required is more accurately reflecting reality. Nice job! 😃