Skip to content

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented Jan 13, 2022

This makes cargo update much faster when there are many git deps in a repo and they're already
cached locally.

There was an existing check for this, but it didn't quite hit all cases.
The code looked like this:

        let (db, actual_rev) = match (self.locked_rev, db) {
            (Some(rev), Some(db)) if db.contains(rev) => (db, rev),
            (None, Some(db)) if self.config.offline() => {
                let rev = db.resolve(&self.manifest_reference)?;
                (db, rev)
            }
            (locked_rev, db) => self.remote.checkout(),
        };

The problem was that when locked_rev.is_none() && db.is_some() and --offline wasn't passed,
cargo would skip the db.resolve() check, and issue a fetch from the remote even if it wasn't
necessary.

This changes the db.resolve() check to happen even in online mode.
This brings cargo update down from 53 to 6 seconds for me on a codebase with 10+ git dependencies.

@rust-highfive
Copy link

r? @alexcrichton

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 13, 2022
@jyn514

This comment has been minimized.

@jyn514
Copy link
Member Author

jyn514 commented Jan 13, 2022

Oh interesting, there's already a check for this at https://github.com/rust-lang/cargo/blob/ba9f2f80dd439bf38e81dcf907c1c06cfe9bf371/src/cargo/sources/git/source.rs#L114-L123 - I think I can move that into fetch so that it's cached consistently.

@jyn514
Copy link
Member Author

jyn514 commented Jan 13, 2022

Ah wait no, cargo update already goes through that codepath, so it should be hitting that check already. The slowdown I was hitting is when locked_rev is None, because https://github.com/rust-lang/cargo/blob/ba9f2f80dd439bf38e81dcf907c1c06cfe9bf371/src/cargo/sources/git/source.rs#L128-L133 only gets hit when running in offline mode. If it checked db.resolve whenever db.is_some() I think it would fix my issue.

@alexcrichton
Copy link
Member

I am not following this super closely and it seems like this is rapidly changing. I wanted to say though that as a heads up the premise of this PR I think is flawed. A reference to something like a branch cannot check to see if there's a local copy since we need to check the remote to see if there's a more up-to-date version. The only case we can skip the remote update is if we have a locked git sha and that's already local. Just because we have refs/heads/main locally doesn't mean it's up-to-date.

@jyn514 jyn514 changed the title Don't fetch from the remote repo if the revision is already present locally cargo update: Don't fetch from the remote repo if the revision is already present locally Jan 13, 2022
@jyn514
Copy link
Member Author

jyn514 commented Jan 13, 2022

The only case we can skip the remote update is if we have a locked git sha and that's already local. Just because we have refs/heads/main locally doesn't mean it's up-to-date.

Ok, I can ensure this only happens when the revision is a locked sha and not a branch. The reason I opened this in the first place though is because cargo was updating the remote even when it's locked to a sha.

@jyn514
Copy link
Member Author

jyn514 commented Jan 13, 2022

I'm done changing this for now (except for addressing review comments) :)

…ocally

This makes `cargo update` *much* faster when there are many git deps in a repo and they're already
cached locally.

There was an existing check for this, but it didn't quite hit all cases.
The code looked like this:
```rust
        let (db, actual_rev) = match (self.locked_rev, db) {
            (Some(rev), Some(db)) if db.contains(rev) => (db, rev),
            (None, Some(db)) if self.config.offline() => {
                let rev = db.resolve(&self.manifest_reference)?;
                (db, rev)
            }
            (locked_rev, db) => self.remote.checkout(),
        };
```

The problem was that when `locked_rev.is_none() && db.is_some()` and `--offline` wasn't passed,
cargo would skip the `db.resolve()` check, and issue a fetch from the remote even if it wasn't
necessary.

This changes the `db.resolve()` check to happen even in online mode.
This brings `cargo update` down from 53 to 6 seconds for me on a codebase with 10+ git dependencies.
@weihanglo
Copy link
Member

A git revision can be a tag or sha or anything. I don't think we can simply search it in the local db. Especially when it cannot be parsed as a commit object.

For instance, the rev is a tag and chances are that the remote tag has been updated to point to another commit but our local database is still stale.

@weihanglo
Copy link
Member

@jyn514
Copy link
Member Author

jyn514 commented Feb 15, 2022

I found this doesn't actually fix my problem. Sorry for the noise.

@jyn514 jyn514 closed this Feb 15, 2022
@jyn514 jyn514 deleted the cached-git-deps branch February 15, 2022 04:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants