Dropping .dir

In #8849, we stopped serializing the directory info in the resulting .dvc/dvc.lock files for cloud versioned remotes. Can we do the same everywhere?

This would help with a bunch of existing issues:
* #7661 and #8875: `dvc diff` fails to compare refs unless the data associated with those refs has been pulled locally. `dvc data status` also reports an `unknown` status when data hasn't been pulled. By having all the files listed in the .dvc/dvc.lock file, it would always be possible to get the granular file info of any commit.
* #8872: `dvc pull` on an `import-url` target is now supposed to be able to pull the data directly from the source without having to push a copy to the remote, but it doesn't work for directories because only the high-level directory info is saved to the .dvc file.
* #4657: To modify an existing directory, the whole directory needs to be pulled. Having the granular file info in the .dvc file means that users could delete a file by searching for it in the .dvc file and deleting that entry. This still isn't great UX, but it should be easy to make `dvc add/remove` work at a granular level by only modifying part of the .dvc file.
* #8638: Users have to install a special merge driver because of .dir entries. Even then, a merge conflict becomes hard to troubleshoot because the conflict will not show both .dir entries, and even if it did, there's no easy way to combine them. With granular file info instead of the .dir entries, no merge driver is needed, and merge conflicts could be resolved by editing the file info in the .dvc files.
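
To make the diff and merge points above concrete, here is a minimal Python sketch of what becomes possible once every file is listed in the .dvc/dvc.lock file. It is not DVC code: the entry layout (a list of dicts with `relpath` and `md5` keys), the helper names `index_by_path`, `diff_entries` and `merge_entries`, and the placeholder hashes are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical granular entry: one dict per file in a tracked directory.
# The "relpath" and "md5" field names are assumptions for this sketch.
Entry = Dict[str, str]


def index_by_path(entries: List[Entry]) -> Dict[str, str]:
    """Map each file's relative path to its content hash."""
    return {e["relpath"]: e["md5"] for e in entries}


def diff_entries(old: List[Entry], new: List[Entry]) -> Dict[str, List[str]]:
    """Compare two file listings using metadata only; no data is pulled."""
    a, b = index_by_path(old), index_by_path(new)
    return {
        "added": sorted(b.keys() - a.keys()),
        "deleted": sorted(a.keys() - b.keys()),
        "modified": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }


def merge_entries(
    base: List[Entry], ours: List[Entry], theirs: List[Entry]
) -> List[Entry]:
    """Three-way merge, file by file; raise if both sides changed a file differently."""
    b, o, t = index_by_path(base), index_by_path(ours), index_by_path(theirs)
    merged: Dict[str, str] = {}
    for path in sorted(b.keys() | o.keys() | t.keys()):
        bv: Optional[str] = b.get(path)
        ov: Optional[str] = o.get(path)
        tv: Optional[str] = t.get(path)
        if ov == tv:        # both sides agree (or both deleted the file)
            result = ov
        elif ov == bv:      # only "theirs" changed this file
            result = tv
        elif tv == bv:      # only "ours" changed this file
            result = ov
        else:               # both changed it in different ways
            raise ValueError(f"conflict on {path}")
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return [{"relpath": p, "md5": h} for p, h in merged.items()]


if __name__ == "__main__":
    # Placeholder hashes, not real md5 values.
    base = [{"relpath": "a.csv", "md5": "h1"}, {"relpath": "b.csv", "md5": "h2"}]
    ours = base + [{"relpath": "c.csv", "md5": "h3"}]   # added c.csv
    theirs = [{"relpath": "a.csv", "md5": "h9"}]        # changed a.csv, deleted b.csv
    print(diff_entries(base, theirs))
    print(merge_entries(base, ours, theirs))
```

Because every file has its own entry, both operations work from the metadata alone. With a single .dir hash, the same comparison would need the .dir object to be available locally.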

Automatically pushing and pulling the .dir files from the remote could also solve a lot of these problems, but it seems like a worse UX. It's less transparent, harder for users to manage, and fails when users don't have access to the remote or forgot to push something.

How much do we really need the reference to the .dir file? If necessary, could we serialize that reference somewhere that's not git-tracked, like in a shadow `.dvc/tmp/mydataset.dvc` file?

Dropping .dir #8884
