Use git/dvc APIs instead of actually checking out revisions #1688
Comments
Could someone assign me to this, as we agreed with @dmpetrov, please?
I'm in the process of making a list of places in the code that use the filesystem interface over the checked-out files. As discussed, I'm not going to do a huge refactoring or make big improvements while working on this issue, but it would probably be a good idea to introduce a better git interface library like libgit2 or dulwich and use it to access git objects directly in all dvc interactions with git, instead of calling the git executable via the GitPython wrapper and accessing the filesystem. Though no code changes will be made in this direction for now, it might be a good time to start discussing such a refactoring.
@ei-grad I've invited you as a collaborator to the project. I think I'll be able to assign you after you accept the invitation.
Great! I could even assign myself now :).
So as of my current understanding -
There are a few DVC commands that accept the `--all-branches` and `--all-tags` options, namely `dvc metrics show`, `dvc gc`, `dvc fetch`, etc. For all of them, what we need is to be able to analyze the content of DVC metafiles across different Git revisions. Right now this is done by running `git checkout` (and then `dvc checkout` if we need to get the content of a file from the cache). This approach is fragile, depends on the current state of the workspace (e.g. there are uncommitted changes), and is even dangerous.

We should instead employ the git API (like `ls-tree`, or `ls-files`?) and the dvc API to get direct access to the necessary files, directories, etc.

Current implementation is here: https://github.com/iterative/dvc/blob/master/dvc/scm/base.py#L79 and is used in two places: https://github.com/iterative/dvc/blob/master/dvc/repo/__init__.py#L239 and https://github.com/iterative/dvc/blob/master/dvc/repo/metrics/show.py#L144.
Directly related issue: #1009
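
To make the proposal above more concrete, here is a minimal sketch of reading DVC metafiles from every branch and tag without touching the working tree. It is not the DVC implementation: it shells out to plain `git` plumbing commands (`for-each-ref`, `ls-tree`, `show`) rather than using GitPython, libgit2 or dulwich. The helper names (`git`, `list_revisions`, `list_dvc_files`, `read_file_at`, `collect_dvc_files`) are hypothetical, and treating every path ending in `.dvc` as a metafile is an assumption.

```python
import subprocess
from typing import Callable, Dict, List


def git(repo_dir: str, *args: str) -> str:
    """Run a read-only git command inside repo_dir and return its stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def list_revisions(repo_dir: str, run: Callable[..., str] = git) -> List[str]:
    """Return short names of all local branches and tags."""
    out = run(
        repo_dir,
        "for-each-ref",
        "--format=%(refname:short)",
        "refs/heads",
        "refs/tags",
    )
    return [line for line in out.splitlines() if line]


def list_dvc_files(repo_dir: str, rev: str, run: Callable[..., str] = git) -> List[str]:
    """List paths ending in .dvc in the tree of rev (assumed metafile suffix)."""
    # -z gives NUL-separated, unquoted paths, so unusual file names survive.
    out = run(repo_dir, "ls-tree", "-r", "-z", "--name-only", rev)
    return [path for path in out.split("\0") if path.endswith(".dvc")]


def read_file_at(repo_dir: str, rev: str, path: str, run: Callable[..., str] = git) -> str:
    """Return the content of path as stored in rev, without checking it out."""
    return run(repo_dir, "show", f"{rev}:{path}")


def collect_dvc_files(repo_dir: str, run: Callable[..., str] = git) -> Dict[str, Dict[str, str]]:
    """Map each revision to {metafile path: metafile content}."""
    return {
        rev: {
            path: read_file_at(repo_dir, rev, path, run)
            for path in list_dvc_files(repo_dir, rev, run)
        }
        for rev in list_revisions(repo_dir, run)
    }
```

Calling `collect_dvc_files(".")` in a repository would gather the metafiles of all local branches and tags while leaving uncommitted changes in the workspace alone. The `run` parameter exists only so the git invocation can be swapped out, for example for a library-based backend or a stub in tests.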