Conversation

ColinKinloch
Contributor

@ColinKinloch ColinKinloch commented Jan 24, 2025

This branch adds crate::clone_git_repository to clone arbitrary git repositories and Job::clone_git_repository to clone the repository defined by the GitLab job.

clone_git_repository is heavily based on gitoxide's PrepareFetch::fetch_only.

Since the gitoxide PrepareFetch and PrepareCheckout structs delete the target path when they are dropped, I opted to check out the repository into a temporary directory and then move its contents to the target path provided by the caller.
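The "check out to a temporary directory, then move" step described above can be sketched with the standard library alone. This is a minimal illustration, not the code in this branch: `move_contents` is a hypothetical helper name, and it assumes the staging directory and the target live on the same filesystem, since `std::fs::rename` does not move across mount points.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not from this PR): moves every entry of `staging`
/// into `target`, creating `target` first if it does not exist.
fn move_contents(staging: &Path, target: &Path) -> io::Result<()> {
    fs::create_dir_all(target)?;
    for entry in fs::read_dir(staging)? {
        let entry = entry?;
        // `rename` moves files and whole directories (such as `.git`) in one
        // step, but only within a single filesystem.
        fs::rename(entry.path(), target.join(entry.file_name()))?;
    }
    Ok(())
}
```

Because only the contents are moved, a drop handler that later removes the (now empty) staging directory leaves the target untouched.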

This adds a rather unwieldy GitCheckoutError enum. gitoxide's error types trigger the result_large_err clippy warning.
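For context, result_large_err fires when the `Err` variant of a returned `Result` is large (128 bytes by default). One common way to address it is to box the large upstream errors inside the wrapping enum; whether this branch does that is not stated here. In the sketch below every type name is a hypothetical stand-in, and none of them is a gitoxide type.

```rust
use std::fmt;
use std::mem::size_of;

/// Stand-in for a large upstream error (hypothetical, not a gitoxide type).
#[allow(dead_code)]
#[derive(Debug)]
struct LargeUpstreamError {
    context: [u8; 256],
}

/// Storing the error inline makes every `Result<_, InlineError>` large.
#[allow(dead_code)]
#[derive(Debug)]
enum InlineError {
    Fetch(LargeUpstreamError),
}

/// Boxing keeps the enum itself pointer-sized.
#[allow(dead_code)]
#[derive(Debug)]
enum BoxedError {
    Fetch(Box<LargeUpstreamError>),
}

impl From<LargeUpstreamError> for BoxedError {
    fn from(err: LargeUpstreamError) -> Self {
        BoxedError::Fetch(Box::new(err))
    }
}

impl fmt::Display for BoxedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BoxedError::Fetch(_) => write!(f, "fetch failed"),
        }
    }
}

impl std::error::Error for BoxedError {}

/// Stand-in for an upstream call; its own large `Err` is what the lint flags.
fn upstream_fetch() -> Result<(), LargeUpstreamError> {
    Err(LargeUpstreamError { context: [0; 256] })
}

/// `?` converts through the `From` impl, so call sites stay short.
fn clone_step() -> Result<(), BoxedError> {
    upstream_fetch()?;
    Ok(())
}

/// Returns the sizes of the inline and boxed enums, in that order.
fn error_sizes() -> (usize, usize) {
    (size_of::<InlineError>(), size_of::<BoxedError>())
}
```

The trade-off is one heap allocation on the error path in exchange for a small `Result` on the success path.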

This requires Rust 1.73.0 or later.

Fixes #21

@ColinKinloch ColinKinloch force-pushed the git branch 2 times, most recently from 587d8dd to ff98e93 Compare January 28, 2025 10:58
@ColinKinloch ColinKinloch force-pushed the git branch 4 times, most recently from 082f212 to 7428588 Compare January 28, 2025 17:03
@ColinKinloch ColinKinloch force-pushed the git branch 3 times, most recently from dbcc18f to c0f9c71 Compare February 10, 2025 18:02
@ColinKinloch ColinKinloch marked this pull request as ready for review February 10, 2025 18:03

@andrewshadura andrewshadura left a comment

This looks okay, but I haven’t gone deeply into details. I trust you tested this code well? (It would be great if it also included some automated tests.)

}

// TODO: Should clone_git_repository be async
// gitoxide doesn't currently have async implemented for http backend
Collaborator

I don't think this is viable without either async or cloning in a separate thread (which would make cancellation really tricky, though you seem to have some interrupt-related stuff wired up already); blocking an entire I/O thread for what might be a very long clone process is not workable. Should this maybe just be using libgit2 or something else instead?

Collaborator

fwiw i'd prefer to stay with gitoxide to avoid more C dependencies in the core of gitlab-runner. But yes, it should be done in a dedicated thread if there is no async API.

Contributor Author

I've updated it to use tokio::task::spawn_blocking, which runs the clone on a different thread.
I used a CancellationToken and tokio::select to tell gitoxide to interrupt. CancellationToken is already used in the gitlab-runner-rs API, so hopefully that is okay.
gitoxide re-exports Progress from the prodash crate to allow the caller to monitor progress. Is that something we should do?
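
The pattern described in this comment (blocking work on its own thread, with a shared flag the caller can set to request an interrupt) can be sketched without external crates. This sketch deliberately swaps in `std::thread` and an `AtomicBool` for tokio::task::spawn_blocking and CancellationToken; `blocking_clone` and `spawn_clone` are hypothetical names, and the loop only simulates a long-running clone.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Stand-in for a long blocking clone (hypothetical). It polls the flag
/// between units of work and stops early once the flag is set.
fn blocking_clone(steps: u32, should_interrupt: &AtomicBool) -> Result<u32, &'static str> {
    for _ in 0..steps {
        if should_interrupt.load(Ordering::Relaxed) {
            return Err("interrupted");
        }
        // Simulate one unit of blocking work.
        thread::sleep(Duration::from_millis(1));
    }
    Ok(steps)
}

/// Runs the clone on a dedicated thread and returns the interrupt flag
/// together with a handle for collecting the result.
fn spawn_clone(steps: u32) -> (Arc<AtomicBool>, thread::JoinHandle<Result<u32, &'static str>>) {
    let flag = Arc::new(AtomicBool::new(false));
    let worker_flag = Arc::clone(&flag);
    let handle = thread::spawn(move || blocking_clone(steps, &worker_flag));
    (flag, handle)
}
```

A caller would keep the `Arc<AtomicBool>`, store `true` into it when cancellation is requested, and then join the handle to observe the `Err("interrupted")` result.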

* * Clones to BuildDir
* * Checks out git_info.sha and fetches all in git_info.refspecs
* * Hardcoded username "gitlab-ci-token"
* * credHelperCommand?
Collaborator

this is a bit tricky to understand out-of-context, might want to add a few more details for some of these points

}

let checkout_progress = progress.add_child("checkout".to_string());
let (checkout, _outcome) = fetch.fetch_then_checkout(checkout_progress, &should_interrupt)?;
Collaborator

thought: if the goal is just to be able to read files from the repo, do we need to do an entire checkout to begin with? couldn't it clone a bare repo and then load files as-needed from there? though the apis might be harder to make for that...but it would likely also be a lot faster.

Collaborator

Fwiw i wouldn't mind this being a potential future update; I'm not sure how much faster it would be ("likely also be a lot faster" is always a dangerous statement in the land of optimisation). I guess it primarily depends on what work would be saved, which I guess is primarily unpacking the pack file.

Regardless, the time would be comparable to a normal checkout in a normal gitlab runner (or faster, as iirc gix is faster than git at this point).

@ColinKinloch ColinKinloch force-pushed the git branch 4 times, most recently from 246508e to 6b04326 Compare July 25, 2025 20:37
* `clone_git_repository` to clone arbitrary repo
* `Job::clone_git_repository` to clone the jobs repo
Also throw an error if the provided HEAD ref cannot be parsed.
Development

Successfully merging this pull request may close these issues.

Should support checking out the git repository
4 participants