@@ -34,8 +34,8 @@ software!
to be integrated into a Git repository history and never needs to recompute
the results after a successful merge.

- - **Experiment state** or state: Equivalent to a Git snapshot (all committed
- files). A Git commit hash, branch or tag name, etc. can be used as a
+ - **Experiment State**: Equivalent to a Git snapshot (all committed files). A
+ Git commit hash, branch or tag name, etc. can be used as a
[reference](https://git-scm.com/book/en/v2/Git-Internals-Git-References) to an
experiment state.
@@ -51,34 +51,33 @@ software!
[DVC-files](/doc/user-guide/dvc-files-and-directories) describing that data
are stored in Git for DVC needs (to maintain pipelines and reproducibility).

- - **Cloud storage** support: available complement to the core DVC features. This
- is how a data scientist transfers large data files or shares a GPU-trained
- model with those without GPUs available.
+ - **Cloud storage**: Available addon to the core DVC features. Multiple
+ providers are supported (Amazon S3, Microsoft Azure Blob Storage, Google Cloud
+ Storage, etc.). This is how a data scientist transfers large data files or
+ shares a GPU-trained model with others.
+
+ > This complement is separate from DVC itself, and never required.

## Core Features

- - DVC works **on top of Git repositories** and has a similar command line
- interface and Git workflow.
+ - **Large data file tracking** is enabled, by creating special files that point
+ to the original data (in the <abbr>cache</abbr>). These can be easily
+ versioned with Git.

- - It makes data science projects **reproducible** by creating lightweight
- [pipelines](/doc/command-reference/pipeline) using implicit dependency graphs.
+ - DVC works **on top of Git repositories** and has a similar command line
+ interface and flow as Git. DVC can also work stand-alone, but without
+ versioning capabilities.

- - **Large data file versioning** works by creating special files in your Git
- repository that point to the <abbr>cache</abbr>, typically stored on a local
- hard drive.
+ - DVC makes data science projects **reproducible** by creating lightweight
+ [pipelines](/doc/command-reference/pipeline), using implicit dependency
+ graphs.

- - DVC is **Programming language agnostic**: Python, R, Julia, shell scripts,
- etc. as well as ML library agnostic: Keras, Tensorflow, PyTorch, Scipy, etc.
+ - DVC is **platform agnostic**: It runs on all major operating systems (Linux,
+ MacOS, and Windows), and works independently of the programming languages
+ (Python, R, Julia, shell scripts, etc.) or ML libraries (Keras, Tensorflow,
+ PyTorch, Scipy, etc.) used in the <abbr>project</abbr>.

- - It's **Open-source** and **Self-serve**: DVC is free and doesn't require any
+ - **Open-source** and **Self-serve**: DVC is free and doesn't require any
additional services.

- - DVC supports cloud storage (Amazon S3, Microsoft Azure Blob Storage, Google
- Cloud Storage, etc.) for **data sources and pre-trained model sharing**.
-
- DVC streamlines large data files and binary models into a single Git environment
- and this approach will not require storing binary files in your Git repository.
- The diagram below describes all the DVC commands and relationships between a
- local cache and remote storage:
-
- ![](/img/flow-large.png) _DVC data management_
+ > Cloud storage providers are supported, however.