Support revisions in dvc update #2849

Closed
dmpetrov opened this issue Nov 26, 2019 · 9 comments · Fixed by #3337
Labels: enhancement, feature request

Comments

@dmpetrov
Member

dmpetrov commented Nov 26, 2019

dvc update --rev hello_world file.dvc

This is not supported yet, but it seems like a very handy alternative to re-importing:

dvc import --rev hello_world https://github.com/dmpetrov/dataset file
dmpetrov added the enhancement label on Nov 26, 2019
efiop added the feature request and p1-important labels on Nov 26, 2019
@jorgeorpinel
Contributor

jorgeorpinel commented Nov 26, 2019

Copying main part of the discussion about this from Slack:

@Suor: When you do dvc import -r ... on an existing import stage, it rewrites the revision, thus switching a tag or a branch.

Dmitry: ...we need a shortcut for re-importing/import, and update looks like a reasonable alternative...

Alexander: You mean you want to avoid retyping the URL?

Dmitry: Yes. Also, update seems like a proper command name for updating the version.
I understand that switching branches is kind of an exception, but this is not what I usually do with imported datasets.

@jorgeorpinel
Contributor

Updating vs. re-importing is definitely a source of confusion, and it's reflected in our docs (which also require the term "fixed-revision import"). See the following excerpts:

From https://dvc.org/doc/command-reference/import#example-fixed-revisions-re-importing:

If the Git revision moves (e.g. a branch), you may use dvc update to bring the data up to date. However, for typically static references (e.g. tags), or for SHA commits, in order to actually "update" an import, it's necessary to re-import the data instead, by using dvc import again without or with a different --rev. This will overwrite the import stage...

From https://dvc.org/doc/command-reference/update#examples:

For typically static references (e.g. tags), or for SHA commits, dvc update will not have any effect on the import. Refer to the re-importing example to learn how to "update" fixed-revision imports.

From https://dvc.org/doc/use-cases/data-registry#example (in an expandable section):

In order to actually "update" it, do not use dvc update. Instead, re-import the data

@jorgeorpinel
Contributor

jorgeorpinel commented Nov 26, 2019

So I like the idea about dvc update --rev because it will probably simplify all these explanations, although it will also require changing its command reference to explain that there are 2 types of updates supported:

  • Simply look for changes in the import as defined in its DVC-file, respecting url AND rev (if present) – only updating the actual data and rev_lock field.
  • Change the rev AND look for changes, updating data and rev_lock as well.

Alternatively we could introduce a new command or subcommand such as dvc import move to change an import stage's rev value, in order to then dvc update it.
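
The two kinds of update described in this comment can be sketched as a toy model. This is only an illustration, not DVC's implementation: the url, rev, and rev_lock field names come from the comment, while REMOTE_REFS, resolve_rev, update, the commit SHAs, and the fallback to "master" when no rev is set are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the remote Git repo: revision name -> commit SHA.
REMOTE_REFS: Dict[str, str] = {
    "master": "c0ffee1",
    "hello_world": "deadbee",
    "v1.0": "abc1234",
}


def resolve_rev(rev: Optional[str]) -> str:
    """Resolve a branch or tag name to a commit SHA (toy lookup, not DVC code)."""
    # Assumption: an import without an explicit rev tracks "master".
    return REMOTE_REFS[rev or "master"]


def update(stage: Dict[str, Optional[str]], rev: Optional[str] = None) -> bool:
    """Model both kinds of update; return True if rev_lock changed."""
    if rev is not None:
        # Second kind: switch the import to a different revision first.
        stage["rev"] = rev
    # Both kinds: re-resolve the (possibly new) rev and refresh the lock.
    new_lock = resolve_rev(stage["rev"])
    changed = new_lock != stage["rev_lock"]
    stage["rev_lock"] = new_lock
    return changed


stage = {
    "url": "https://github.com/dmpetrov/dataset",
    "rev": "v1.0",
    "rev_lock": "abc1234",
}
print(update(stage))                     # plain update, the tag has not moved
print(update(stage, rev="hello_world"))  # update with a new revision
print(stage)
```

In this model a plain update on a tag that has not moved changes nothing, which mirrors the "fixed-revision import" confusion in the docs excerpts, while passing a new rev rewrites both rev and rev_lock.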

@ilgooz

ilgooz commented Dec 21, 2019

Hey, I'll try to create a solution for this feature!

@ilgooz

ilgooz commented Jan 4, 2020

@dmpetrov hey, what is the expected behavior after dvc update --rev another-rev? Should it lock the original import to another-rev, or should it avoid side effects like that?

The current behaviour is that it locks to another-rev but also pulls the rev from the remote each time. I think it should just use the cache if the latest commits are the same?

@efiop
Contributor

efiop commented Jan 4, 2020

@ilgooz dvc update and dvc update --rev latest_rev should do the same, so yes, it should lock to another-rev, because that is how dvc update currently works too.

Do you mean that it is pulling the rev on each update? If so, it is how our current caching is implemented. There is no need to tackle that in this PR, feel free to create an issue for it though 🙂
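
The caching idea raised in the question above (skip the pull when the locked commit has not changed) might look roughly like the sketch below. It is a hypothetical illustration only: refresh_import and its parameters are invented names, and, as noted in this reply, DVC's caching at the time did not work this way.

```python
from typing import Callable, List


def refresh_import(
    rev_lock: str, remote_sha: str, download: Callable[[str], None]
) -> str:
    """Download only when the remote commit differs from the locked one."""
    if remote_sha == rev_lock:
        # Nothing moved upstream: keep the lock and reuse the cached data.
        return rev_lock
    # The revision now points at a new commit: fetch it and move the lock.
    download(remote_sha)
    return remote_sha


fetched: List[str] = []  # records which commits were actually downloaded
refresh_import("abc1234", "abc1234", fetched.append)  # same commit, no download
refresh_import("abc1234", "deadbee", fetched.append)  # new commit, download
print(fetched)
```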

@ilgooz

ilgooz commented Jan 4, 2020

Thanks, yes, I asked two different Qs and got my answers!

@skshetry
Collaborator

@efiop, why is this p0 btw?

@efiop
Contributor

efiop commented Feb 15, 2020

@skshetry Just trying to unblock @andronovhopf. Thank you so much for the fix! 🙏
