import: allow downloading regular files/dirs tracked by git #2889
This is rebased on top of #2837 by danihodovic (for some reason I couldn't find his repo on the list of available forks when submitting the PR).
dvc/dependency/repo.py
Outdated
to.checkout()
except NoOutputInExternalRepoError as e:
try:
with self._make_repo(
Not sure what's an idiomatic way to avoid this duplication from fetch(), so I left it like this and I'm open to suggestions.
Hi @Baranowski! Thanks for the PR. We also need to test this case: #2837 (comment). Here is an example:
#!/bin/bash
set -x
set -e
rm -rf mytest
mkdir mytest
cd mytest
mkdir erepo
pushd erepo
git init
dvc init
dvc run -O foo 'echo foo > foo' # note that foo is not cached by dvc, hence the "-O"
git add .
git commit -m "init"
popd
mkdir repo
pushd repo
git init
dvc init
dvc import ../erepo foo
popd
Related dvc.org issue: iterative/dvc.org#835
@efiop, sure I will see if danihodovic writes the test soonish so that I can copy it for import. If not, I will write it myself.
@Baranowski The tests would be different, so you could work async, I guess :) Looks like the only piece that you are reusing here is _copy_git_file, which is not that crucial.
@efiop, added. The test passed without code modifications, somewhat to my surprise.
@Baranowski Btw, please note that
Force-pushed from 49707d4 to 0738afd.
Force-pushed from 0738afd to 7fb0a68.
Thanks for pointing that out, @efiop. Rebased and updated.
@Baranowski Fixed get implementation, please rebase. Sorry for the delay.
Force-pushed from 7fb0a68 to e137887.
@efiop, this is rebased and ready for a review.
@Baranowski Please check the tests, they are failing on py2.
to.checkout()
try:
if self._copy_if_git_file(to.fspath):
return
Just a note: not setting to.info, as we do down below, is fine, since git files are tiny and the hash will be computed later in the output itself.
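The early-return pattern in this hunk (try the plain git file first, otherwise fall through to the DVC output path) can be sketched roughly as follows. This is only an illustration under assumptions: copy_if_git_file, download, and fetch_dvc_output are hypothetical stand-ins, not the functions in dvc/dependency/repo.py, and "tracked by git" is simplified to "exists as a regular file in the checked-out tree".

```python
import os
import shutil
import tempfile


def copy_if_git_file(repo_root, rel_path, dst):
    """Hypothetical stand-in: copy a regular file out of a checked-out repo.

    Returns True if the file was found and copied, False otherwise.
    Simplification: any regular file in the working tree counts as git-tracked.
    """
    src = os.path.join(repo_root, rel_path)
    if os.path.isfile(src):
        shutil.copyfile(src, dst)
        return True
    return False


def download(repo_root, rel_path, dst, fetch_dvc_output):
    """Try the plain-file copy first; fall back to the DVC output path."""
    if copy_if_git_file(repo_root, rel_path, dst):
        return "git"
    # Hypothetical callback standing in for fetching a DVC-tracked output.
    fetch_dvc_output(rel_path, dst)
    return "dvc"


with tempfile.TemporaryDirectory() as tmp:
    erepo = os.path.join(tmp, "erepo")
    os.makedirs(erepo)
    with open(os.path.join(erepo, "foo"), "w") as f:
        f.write("foo\n")

    calls = []

    def fake_fetch(rel_path, dst):
        # Record the request instead of talking to a real cache or remote.
        calls.append(rel_path)

    via_git = download(erepo, "foo", os.path.join(tmp, "foo_copy"), fake_fetch)
    via_dvc = download(erepo, "missing", os.path.join(tmp, "missing_copy"), fake_fetch)
    with open(os.path.join(tmp, "foo_copy")) as f:
        copied = f.read()
```

In this sketch the fallback is only reached when the plain copy reports that nothing was copied, which mirrors the "if copied: return" shape of the hunk above.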
Force-pushed from 9aa4c06 to 8985759.
This reverts commit 4d59bcd.
dvc/dependency/repo.py
Outdated
def _git_status(self):
cache_dir = self.repo.cache.local.cache_dir
with self._make_repo(
cache_dir=os.path.join(cache_dir, "old")
No need to pass cache dir, as you are not pulling anything anyway
Suggested change: remove cache_dir=os.path.join(cache_dir, "old")
dvc/dependency/repo.py
Outdated
cache_dir=os.path.join(cache_dir, "old")
) as old_repo:
with self._make_repo(
cache_dir=os.path.join(cache_dir, "new"), rev_lock=None
Same here
Suggested change: replace cache_dir=os.path.join(cache_dir, "new"), rev_lock=None with rev_lock=None
Without cache_dir specified explicitly, both repos seem to be pulled into the same directory, and so the update to the imported file is never detected.
@Baranowski But cache_dir has nothing to do with the update, right? So I suppose the reason might be a bug in the caching system in dvc/external_repo.py?
You are right, @efiop, I think I can see the bug in external_repo.py. Thanks!
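The thread does not spell out what the bug in external_repo.py is. Purely as an illustration of how the reported symptom (both repos ending up in the same directory, so the update is never detected) could arise, here is a toy clone cache. Everything in it is hypothetical and is not DVC code: make_cache, the key functions, and the revision string "abc123" are made up for the sketch.

```python
def make_cache(key_fn):
    """Build a toy clone cache whose lookup key is computed by key_fn(url, rev)."""
    clones = {}

    def get_clone(url, rev):
        key = key_fn(url, rev)
        if key not in clones:
            # Stand-in for cloning the repo and checking out the revision.
            clones[key] = "clone of {} at {}".format(url, rev)
        return clones[key]

    return get_clone


# Keyed by URL only: the second request reuses the first checkout.
by_url = make_cache(lambda url, rev: url)
old_buggy = by_url("../erepo", "abc123")
new_buggy = by_url("../erepo", None)

# Keyed by URL and revision: the locked and the latest revision stay separate.
by_url_and_rev = make_cache(lambda url, rev: (url, rev))
old_ok = by_url_and_rev("../erepo", "abc123")
new_ok = by_url_and_rev("../erepo", None)
```

With the URL-only key, the "old" and "new" lookups return the same checkout, so comparing them would never show a difference; passing distinct cache_dir values would hide such a problem rather than fix it.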
@Baranowski please see #2889 (comment). Let's create a separate ticket.
tests/func/test_import.py
Outdated
assert os.path.isfile(str(tmp_dir / dst))
assert filecmp.cmp(str(erepo_dir / src), str(tmp_dir / dst), shallow=False)
assert tmp_dir.scm.repo.git.check_ignore(str(tmp_dir / dst))
Should only use str() on Path-like objects for presentation purposes; to get a str file path one should use fspath(). Or fspath_py35() when passing to a util working with Path-likes in Python 3.6+.
Also, Path has .is_file() and .is_dir(). May also skip .exists() since it is covered by is_file() anyway.
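A small sketch of the points in this comment, using only the standard library. Note the assumption: os.fspath (Python 3.6+) stands in here for the project's own fspath() helper, and describe is a hypothetical function written for this example.

```python
import os
import tempfile
from pathlib import Path


def describe(path):
    """Return (fs_path, is_file, is_dir) for a Path-like object."""
    p = Path(path)
    # os.fspath() yields the file-system path as a str;
    # str() is kept for display purposes only.
    return os.fspath(p), p.is_file(), p.is_dir()


with tempfile.TemporaryDirectory() as tmp:
    dst = Path(tmp) / "foo"
    dst.write_text("foo\n")

    fs_path, is_file, is_dir = describe(dst)
    dir_info = describe(tmp)
    # is_file() is False for a missing path, so no separate exists() check is needed.
    missing_info = describe(Path(tmp) / "missing")
```

Here is_file() already answers "does it exist and is it a regular file", which is why the separate existence assertion can be dropped.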
Done
Thanks @Baranowski!
This update reflects iterative/dvc#2889
Other than the
Apologies for raising this only after merging the PR.
@Baranowski @efiop create issue just to keep track of those?
This update reflects iterative/dvc#2889
Have you followed the guidelines in the Contributing to DVC list?
Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.
Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.
Thank you for the contribution - we'll try to review it as soon as possible.
Closes #2862