Skip to content

dvc ignore "!" syntax different behavior from git #5046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dorp92 opened this issue Dec 7, 2020 · 10 comments · Fixed by #4986
Closed

dvc ignore "!" syntax different behavior from git #5046

dorp92 opened this issue Dec 7, 2020 · 10 comments · Fixed by #4986
Labels
bug Did we break something? p1-important Important, aka current backlog of things to do question I have a question?

Comments

@dorp92
Copy link

dorp92 commented Dec 7, 2020

From:
https://dvc.org/doc/command-reference/check-ignore

echo "file*\n\!file2" >> .dvcignore
touch file1 file2 other
dvc check-ignore file*
file1
file2

Why file2 is ignored?
Correct me if I'm worng but in git file2 shouldn't be ignored.

I am encountering this issue as a result of trying to ignore all files except certain extensions.
This works with git but not in dvc

*
!*.pth
!*.npy
!*/
.idea/

There is anther way to do it in dvc?

@shcheklein shcheklein transferred this issue from iterative/dvc.org Dec 7, 2020
@jorgeorpinel jorgeorpinel added the question I have a question? label Dec 7, 2020
@jorgeorpinel
Copy link
Contributor

It's a good question but I tried the same in .gitignore and git check-ignore has the same result: file2 is ignored.

Ref: https://git-scm.com/docs/gitignore

@dorp92
Copy link
Author

dorp92 commented Dec 7, 2020

Yes you are right about the output of check-ignore but still I think there is a bug.
I created .gitignore and .dvcignore files containing:

file*
!file2

Then

$ dvc add file2
Adding...                                                                                                     
ERROR: Path 'file2' is ignored by
.dvcignore:1:file*
.dvcignore:2:!file2

but git add file2 works fine.

@jorgeorpinel
Copy link
Contributor

Yes, in that case DVC behaves different to Git by design (I think cc @efiop). It gives the error do that the user knows what's going and can address the situation if needed.

What is the problem you are having, or what are you trying to achieve?

@skshetry skshetry added bug Did we break something? p1-important Important, aka current backlog of things to do labels Dec 9, 2020
@skshetry
Copy link
Collaborator

skshetry commented Dec 9, 2020

Looks like a bug to me. We had a discussion on #4103 which was fixed by @karajan1001.

It might be a mistake on DVC's part, and I wonder If it could be fixed by something similar to following (@karajan1001, please correct me if I am wrong):

diff --git a/dvc/ignore.py b/dvc/ignore.py
index 301e6669..49621fdd 100644
--- a/dvc/ignore.py
+++ b/dvc/ignore.py
@@ -129,7 +129,10 @@ class DvcIgnorePatterns(DvcIgnore):
                     result.append(pattern.file_info)
             else:
                 if regex.match(path):
-                    result.append(pattern.file_info)
+                    if ignore[1] is False:  # similarly for `is_dir`
+                        result = []
+                    else:
+                        result.append(pattern.file_info)
         return result

@dorp92
Copy link
Author

dorp92 commented Dec 9, 2020

Yes, in that case DVC behaves different to Git by design (I think cc @efiop). It gives the error do that the user knows what's going and can address the situation if needed.

What is the problem you are having, or what are you trying to achieve?

But it shouldn't give an error, Git just added the file.
With !file2 in gitignore file2 isn't ignored, But in dvc it is ignored.

My final goal is to ignore everything except of certain extensions

@karajan1001
Copy link
Contributor

echo "file*\n\!file2" >> .dvcignore
touch file1 file2 other
dvc check-ignore file*
file1
file2

is not a bug, the Git does the same thing.
Screenshot from 2020-12-09 20-05-21

$ dvc add file2
Adding...                                                                                                     
ERROR: Path 'file2' is ignored by
.dvcignore:1:file*
.dvcignore:2:!file2

This is a bug.

@karajan1001
Copy link
Contributor

@skshetry Seems that because of this
Screenshot from 2020-12-09 20-18-03

Here I add two prints into DvcIgnorePatterns.ignore and DvcIgnorePatterns.ignore_details. We can see that checking if file2 is ignored runs into DvcIgnorePatterns.ignore_details which is actually designed for dvc check-ignore it has a different behaviour fromDvcIgnorePatterns.ignore which is used for other funcs.

DvcIgnorePatterns.ignore return true for file matches ignore pattern ( starts without !) but DvcIgnorePatterns.ignore return true for file matches any pattern ( includes ones starts with ! )

Maybe we should rename DvcIgnorePatterns.ignore_details to something else or add some comments to prevent misusage.

@karajan1001
Copy link
Contributor

The problem is here.

check = stage.repo.tree.dvcignore.check_ignore(path)

@skshetry
Copy link
Collaborator

skshetry commented Dec 9, 2020

@karajan1001, thanks. So, we just have to change it to use is_ignored() instead in that file.

For other internal functions, for example, _ignored_details, a few comments should be enough. I just reached there by debugging.

I think check_ignore also should have a few docstring, that's enough I guess.

@karajan1001
Copy link
Contributor

It is OK in version 1.4.0.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Did we break something? p1-important Important, aka current backlog of things to do question I have a question?
Projects
None yet
4 participants