Docs for new command dvc check-ignore
#1629
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,6 +48,7 @@ module.exports = [ | |
'config', | ||
'commit', | ||
'checkout', | ||
'check-ignore', | ||
'cache dir', | ||
'cache', | ||
'add' | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# check-ignore | ||
|
||
Check whether any given files or directories are excluded from DVC due to the | ||
patterns found in [`.dvcignore`](/doc/user-guide/dvcignore). | ||
|
||
## Synopsis | ||
|
||
```usage | ||
usage: dvc check-ignore [-h] [-q | -v] [-d] [-n] | ||
targets [targets ...] | ||
|
||
positional arguments: | ||
targets File or directory paths to check (wildcards supported) | ||
``` | ||
|
||
## Description | ||
|
||
This helper command checks whether the given `targets` are ignored by DVC | ||
according to the [`.dvcignore` file](/doc/user-guide/dvcignore) (if any). The | ||
targets that are ignored are printed back. | ||
|
||
> Note that your shell may support path wildcards such as `dir/file*` and these | ||
Review comment: same - it's fine to keep it here, but that what I would expect anyways since multiple targets are supported. A bit not consistent since we don't make notes like this in other commands.
Review comment: Oh I see you found my note. Yeah any other commands where this note would be especially relevant? add/commit/remove? status/repro? fetch/pull/push? metrics/plots? (un)freeze?
Review comment: I guess best to just remove it too...
Review comment: Also removed in #1673
||
> can be fed as `targets` to `dvc check-ignore`, as shown in the | ||
> [examples](#examples). | ||
|
||
## Options | ||
|
||
- `-d`, `--details` - show the exclude pattern together with each target path. | ||
|
||
- `-n`, `--non-matching` - show the target paths which don't match any pattern. | ||
Only usable when `--details` is also used. | ||
|
||
- `-h`, `--help` - prints the usage/help message, and exits. | ||
|
||
- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no | ||
problems arise, otherwise 1. | ||
|
||
- `-v`, `--verbose` - displays detailed tracing information. | ||
|
||
## Examples | ||
|
||
First, let's create a `.dvcignore` file with some patterns in it, and some files | ||
to check against it. | ||
|
||
```dvc | ||
$ printf 'file*\n!file2\n' >> .dvcignore | ||
$ cat .dvcignore | ||
file* | ||
!file2 | ||
$ touch file1 file2 other | ||
$ ls | ||
file1 file2 other | ||
|
||
``` | ||
|
||
Then, let's use `dvc check-ignore` to see which of these files would be excluded | ||
given our `.dvcignore` file: | ||
|
||
```dvc | ||
$ dvc check-ignore file1 | ||
file1 | ||
$ dvc check-ignore file1 file2 | ||
file1 | ||
file2 | ||
$ dvc check-ignore other | ||
|
||
# There's no command output, meaning `other` is not excluded. | ||
$ dvc check-ignore file* | ||
file1 | ||
file2 | ||
``` | ||
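When many targets are checked at once, it can be handy to list the ones that were not printed back. The following is a minimal sketch, assuming `dvc check-ignore` prints one ignored path per line as in the output above. The `not_ignored` helper is hypothetical and not part of DVC.

```shell
# Hypothetical helper (not part of DVC): reads ignored paths from stdin,
# one per line, and prints every argument that is not among them.
not_ignored() {
  local ignored target
  ignored="$(cat)"
  for target in "$@"; do
    # grep -F: fixed string, -x: whole-line match, -q: no output
    if ! printf '%s\n' "$ignored" | grep -Fxq -- "$target"; then
      printf '%s\n' "$target"
    fi
  done
}

# Intended usage (requires DVC and the files created in this example):
#   dvc check-ignore file1 file2 other | not_ignored file1 file2 other
```

The helper only compares strings, so the targets must be spelled the same way on both sides of the pipe.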
|
||
If the `--details` option is used, a series of lines are printed using this | ||
format: `<path/to/.dvcignore>:<line_num>:<pattern> | <target_path>` | ||
|
||
|
||
```dvc | ||
$ dvc check-ignore -d file1 file2 | ||
.dvcignore:1:file* file1 | ||
.dvcignore:2:!file2 file2 | ||
$ dvc check-ignore -d other | ||
$ dvc check-ignore -d file* | ||
.dvcignore:1:file* file1 | ||
.dvcignore:2:!file2 file2 | ||
``` | ||
|
||
With the `--non-matching` option, non-matching `targets` will also be included | ||
in the list. All fields in each line, except for `<target_path>`, will be empty. | ||
|
||
```dvc | ||
$ dvc check-ignore -d -n other | ||
:: other | ||
``` |
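For scripting, the `--details` lines can be split back into their fields. The sketch below is an assumption-laden illustration, not DVC functionality: it assumes the pattern and the target are separated by whitespace, and that neither the `.dvcignore` path nor the target contains a colon or whitespace. The `parse_details_line` name is hypothetical.

```shell
# Hypothetical helper (not part of DVC): splits one `--details` line of the
# form "<source>:<line_num>:<pattern> <target_path>" into labeled fields.
parse_details_line() {
  local line="$1" src rest lineno pattern target
  src="${line%%:*}"                 # text before the first colon
  rest="${line#*:}"                 # text after the first colon
  lineno="${rest%%:*}"              # text before the second colon
  rest="${rest#*:}"                 # "<pattern><whitespace><target_path>"
  target="${rest##*[[:space:]]}"    # text after the last whitespace
  pattern="${rest%[[:space:]]*}"    # text before the last whitespace
  printf 'source=%s line=%s pattern=%s target=%s\n' \
    "$src" "$lineno" "$pattern" "$target"
}

# Intended usage (requires DVC):
#   dvc check-ignore -d -n file1 other | while IFS= read -r l; do
#     parse_details_line "$l"
#   done
```

A non-matching line such as `:: other` yields empty `source`, `line`, and `pattern` fields, which is one way to tell the two cases apart.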
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,16 +8,17 @@ project. For example, when working in a <abbr>workspace</abbr> directory with a | |
large number of data files, you might encounter extended execution time for | ||
operations as simple as `dvc status`. In other cases you might want to omit files | ||
or folders unrelated to the project (like `.DS_Store` on macOS). To address | ||
these scenarios, DVC supports optional `.dvcignore` files. `.dvcignore` works | ||
similar to `.gitignore` in Git. | ||
these scenarios, DVC supports optional `.dvcignore` files. | ||
|
||
`.dvcignore` is similar to `.gitignore` in Git, and can be tested with our | ||
helper command `dvc check-ignore`. | ||
|
||
## How does it work? | ||
|
||
- You need to create the `.dvcignore` file. It can be placed in the root of the | ||
project or inside any subdirectory (see also [remarks](#Remarks) below). | ||
- Populate it with [patterns](https://git-scm.com/docs/gitignore) that you would | ||
like to ignore. You can find useful templates | ||
[here](https://github.com/github/gitignore). | ||
- You need to create a `.dvcignore` file. These can be placed in the root of the | ||
project, or in any subdirectory (see the [remarks](#Remarks) below). | ||
- Populate it with [.gitignore patterns](https://git-scm.com/docs/gitignore). | ||
You can find useful templates [here](https://github.com/github/gitignore). | ||
- Each line should contain only one pattern. | ||
- During execution of commands that traverse directories, DVC will ignore | ||
matching paths. | ||
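
As an illustration of the steps above, a `.dvcignore` file could be created as in the following sketch. The helper name and the three patterns are hypothetical examples written in `.gitignore` syntax, one per line; adjust them to your project.

```shell
# Hypothetical helper: writes a sample .dvcignore into the given directory
# (defaults to the current one). The patterns are examples only.
write_sample_dvcignore() {
  local dir="${1:-.}"
  cat > "$dir/.dvcignore" <<'EOF'
.DS_Store
data/*.tmp
!data/keep.tmp
EOF
}

# Intended usage, from the root of a project:
#   write_sample_dvcignore .
```

Under `.gitignore` rules, the first pattern matches `.DS_Store` at any level, the second matches `.tmp` files directly under `data/`, and the third re-includes `data/keep.tmp`.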
|
@@ -28,87 +29,95 @@ Ignored files will not be saved in <abbr>cache</abbr>, they will be non-existent | |
for DVC. It's worth remembering that, especially when ignoring files inside | ||
DVC-handled directories. | ||
|
||
**It is crucial to understand, that DVC might remove ignored files upon | ||
`dvc run` or `dvc repro`. If they are not produced by a | ||
[pipeline](/doc/command-reference/dag) [stage](/doc/command-reference/run), they | ||
can be deleted permanently.** | ||
⚠️ Important! Note that `dvc run` and `dvc repro` might remove ignored files. If | ||
they are not produced by a pipeline [stage](/doc/command-reference/run), they | ||
can be lost permanently. | ||
|
||
Keep in mind that when you add `.dvcignore` patterns that affect an existing | ||
<abbr>output</abbr>, its status will change and DVC will behave as if the | ||
affected files were deleted. | ||
|
||
Keep in mind, that when you add to `.dvcignore` entries that affect one of the | ||
existing <abbr>outputs</abbr>, its status will change and DVC will behave as if | ||
that affected files were deleted. | ||
💡 Note that you can use the `dvc check-ignore` command to check whether given | ||
files or directories are ignored by the patterns in a `.dvcignore` file. | ||
|
||
If DVC finds a `.dvcignore` file inside a dependency or output directory, it | ||
raises an error. Ignoring files inside such directories should be handled from a | ||
`.dvcignore` in higher levels of the project tree. | ||
|
||
## Syntax | ||
|
||
The same as for [`.gitignore`](https://git-scm.com/docs/gitignore). | ||
Review comment: I think that was useful. Also, we do quite a bit of changes not related directly to the command, @jorgeorpinel :)
Review comment: Yes, sorry. But had to read the whole thing to find the best places to link to the command and did copy edits along the way. This section was repetitive since the format (and templates) is already linked to from the bullets above. It was also an entire H2 section for 1-line - looked funny once rendered. Should I restore it and move all the syntax/format and templates info here? (remove it from the bullet under How it works)
||
|
||
## Examples | ||
|
||
Let's see what happens when we add a file to `.dvcignore`. | ||
Let's see what happens when we add a file to `.dvcignore`: | ||
|
||
```dvc | ||
$ mkdir data | ||
$ echo data1 >> data/data1 | ||
$ echo data2 >> data/data2 | ||
$ tree . | ||
|
||
$ echo 1 > data/data1 | ||
$ echo 2 > data/data2 | ||
$ tree | ||
. | ||
└── data | ||
    ├── data1 | ||
    └── data2 | ||
``` | ||
|
||
We created the `data/` directory with two files. Let's ignore one of them, and | ||
track the directory with DVC. | ||
We created the `data/` directory with two data files. Let's ignore one of them, | ||
and double check that it's being ignored by DVC: | ||
|
||
```dvc | ||
$ echo data/data1 >> .dvcignore | ||
$ cat .dvcignore | ||
|
||
data/data1 | ||
$ dvc check-ignore data/* | ||
data/data1 | ||
``` | ||
|
||
$ dvc add data | ||
> Refer to `dvc check-ignore` for more details on that command. | ||
|
||
$ tree .dvc/cache | ||
## Example: Skip specific files when adding directories | ||
|
||
Let's now track the directory with `dvc add`, and see what happens in the | ||
<abbr>cache</abbr>: | ||
|
||
```dvc | ||
$ dvc add data | ||
... | ||
$ tree .dvc/cache | ||
.dvc/cache | ||
├── 54 | ||
│   └── 40cb5e4c57ab54af68127492334a23.dir | ||
└── ed | ||
    └── c3d3797971f12c7f5e1d106dd5cee2 | ||
├── 26 | ||
│   └── ab0db90d72e28ad0ba1e22ee510510 | ||
└── ad | ||
    └── 8b0ddcf133a6e5833002ce28f97c5a.dir | ||
$ md5 data/* | ||
b026324c6904b2a9cb4b88d6d61c81d1 data/data1 | ||
26ab0db90d72e28ad0ba1e22ee510510 data/data2 | ||
``` | ||
|
||
Only the hash values of a directory (`data/`) and one file have been | ||
<abbr>cached</abbr>. This means that `dvc add` ignored one of the files | ||
(`data1`). | ||
Only the cache entries of the `data/` directory itself and one file have been | ||
stored. Checking the hash value of the data files manually, we can see that | ||
`data2` was cached. This means that `dvc add` did ignore `data1`. | ||
|
||
> Refer to | ||
> [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-the-cache-directory) | ||
> for more info. | ||
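
The link between a hash value and its cache entry can be sketched as follows. In the listing above, the first two characters of the hash name the subdirectory and the remaining characters name the file. The `cache_path` helper is hypothetical, and the `.dvc/cache` location is taken from this example; other DVC versions or configurations may use a different layout.

```shell
# Hypothetical helper (not part of DVC): maps a hash value to the cache
# entry path seen in this example, "<cache_dir>/<first 2 chars>/<rest>".
cache_path() {
  local hash="$1" cache_dir="${2:-.dvc/cache}"
  local prefix rest
  prefix="$(printf '%s' "$hash" | cut -c1-2)"   # subdirectory name
  rest="$(printf '%s' "$hash" | cut -c3-)"      # file name inside it
  printf '%s/%s/%s\n' "$cache_dir" "$prefix" "$rest"
}

# Intended usage, with the hash of data/data2 from the `md5` output above:
#   cache_path 26ab0db90d72e28ad0ba1e22ee510510
```

Directory entries in the listing carry an extra `.dir` suffix, which this sketch does not add.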
|
||
## Example: Ignore file state changes | ||
|
||
Now, let's modify file `data1` and see if it affects `dvc status`. | ||
|
||
```dvc | ||
$ dvc status | ||
|
||
Data and pipelines are up to date. | ||
|
||
$ echo "123" >> data/data1 | ||
$ echo "2345" >> data/data1 | ||
$ dvc status | ||
|
||
Data and pipelines are up to date. | ||
``` | ||
|
||
`dvc status` also ignores `data1`. The same modification on a tracked file will | ||
produce a different output: | ||
`dvc status` ignores `data1`. Modifications on a tracked file produce a | ||
different output: | ||
|
||
```dvc | ||
$ echo "123" >> data/data2 | ||
$ echo "345" >> data/data2 | ||
$ dvc status | ||
|
||
data.dvc: | ||
changed outs: | ||
modified: data | ||
|