
dvc push doesn't update cloud info with cloud versioned remotes #9947


Closed
shcheklein opened this issue Sep 14, 2023 · 4 comments
Labels
bug Did we break something?

Comments

@shcheklein
Member

Context: #9907 (reply in thread)

Let's say we have two remotes:

['remote "dev"']
    url = s3://yury-cloud-versioning-test/test-dev
    version_aware=true
['remote "prod"']
    url = s3://yury-cloud-versioning-test/test-prod
    version_aware=true

Let's say we need to migrate data from one to another.

I would expect commands like this:

dvc pull -r dev
dvc push -r prod

to work and update the .dvc / dvc.lock files with the appropriate info. In reality, I'm getting:

(.venv) √ Projects/test-cloud-versioned % git diff
diff --git a/test.txt.dvc b/test.txt.dvc
index d5fbaa9..03f455c 100644
--- a/test.txt.dvc
+++ b/test.txt.dvc
@@ -4,6 +4,6 @@ outs:
   hash: md5
   path: test.txt
   cloud:
-    dev:
+    prod:
       etag: d8e8fca2dc0f896fd7cb4cb0031ba249
       version_id: UK3s0VcueuAIttMw7FROG8pRospYWNQI

(Only the remote name is updated; the etag and version_id stay the same, which is wrong for that remote.)
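The symptom in the diff can be detected mechanically. Below is a minimal sketch, assuming the `cloud:` section has already been loaded into plain dictionaries; `stale_cloud_entries` is a hypothetical helper, not part of DVC. It flags any remote whose version_id was previously recorded under a different remote name.

```python
def stale_cloud_entries(before, after):
    """Return remote names in `after` whose version_id was recorded
    under a different remote name in `before` (hypothetical check)."""
    # Map each previously recorded version_id to the remote that owned it.
    owners = {
        meta["version_id"]: name
        for name, meta in before.items()
        if "version_id" in meta
    }
    stale = []
    for name, meta in after.items():
        owner = owners.get(meta.get("version_id"))
        if owner is not None and owner != name:
            stale.append(name)
    return stale


# Values copied from the diff above.
entry = {
    "etag": "d8e8fca2dc0f896fd7cb4cb0031ba249",
    "version_id": "UK3s0VcueuAIttMw7FROG8pRospYWNQI",
}
before = {"dev": dict(entry)}
after = {"prod": dict(entry)}

print(stale_cloud_entries(before, after))  # ['prod']
```

An identical etag alone would not be suspicious, since the content is the same; the check relies on the assumption that a version_id cannot repeat at a different location.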

In the original issue, even the object is not pushed to the new remote.

Also, with cloud versioning, I don't think the prod / dev names make much sense in .dvc / dvc.lock files. A version_id is unique (I assume) and can't repeat in a different location. I guess we need to use some hash, or the location itself, in this case. How do we use these names at all? Do we expect that specific remote name to exist in a config?

@shcheklein shcheklein added the bug Did we break something? label Sep 14, 2023
@dberenbaum
Contributor

Related:
#8356
#8862

@skshetry Can you remember what the expected behavior is here? Should we be overwriting the remote info? Or disallowing this operation?

@pmrowla
Contributor

pmrowla commented Sep 21, 2023

> How do we use these names at all? do we expect that specific remote name to exist in a config?

Yes, they are tied to the remote name defined in that git commit's .dvc/config.
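To illustrate that coupling, here is a minimal sketch that reads remote names from config text shaped like the example in the issue and reports `cloud:` keys that have no matching remote. It uses Python's standard `configparser`, not DVC's own config loader; `remote_urls`, `unknown_remotes`, and the `staging` name are hypothetical.

```python
import configparser

# Config text mirroring the two remotes from the issue description.
CONFIG_TEXT = """\
['remote "dev"']
    url = s3://yury-cloud-versioning-test/test-dev
    version_aware = true
['remote "prod"']
    url = s3://yury-cloud-versioning-test/test-prod
    version_aware = true
"""


def remote_urls(config_text):
    """Map remote names to URLs (hypothetical helper, not a DVC API)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    urls = {}
    for section in parser.sections():
        # configparser keeps the outer single quotes of the section header.
        name = section.strip("'")
        if name.startswith('remote "') and name.endswith('"'):
            urls[name[len('remote "'):-1]] = parser[section]["url"]
    return urls


def unknown_remotes(cloud, urls):
    """Return remote names used under `cloud:` that the config does not define."""
    return sorted(name for name in cloud if name not in urls)


urls = remote_urls(CONFIG_TEXT)
print(unknown_remotes({"prod": {}, "staging": {}}, urls))  # ['staging']
```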

> Can you remember what the expected behavior is here

We should be pushing the file to the prod remote and updating the version ID and etag here.

I'm guessing this is an index bug caused by both remotes using the same S3 bucket. It looks like the user from the original discussion/context is using dvc==3.0.0, so this may just be a duplicate of #9904 (which is fixed in the latest release).
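For reference, the two remote URLs from the issue description can be split into bucket and key prefix with the standard library. This is only a sketch of the shared-bucket observation; it makes no claim about how DVC's index keys its entries, and `bucket_and_prefix` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def bucket_and_prefix(url):
    """Split an s3:// URL into (bucket, key prefix)."""
    parts = urlsplit(url)
    return parts.netloc, parts.path.lstrip("/")


dev = bucket_and_prefix("s3://yury-cloud-versioning-test/test-dev")
prod = bucket_and_prefix("s3://yury-cloud-versioning-test/test-prod")

# Same bucket, different prefixes.
print(dev[0] == prod[0], dev[1], prod[1])  # True test-dev test-prod
```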

@shcheklein
Member Author

@pmrowla I think I was able to reproduce it on the recent DVC version.

@dberenbaum dberenbaum added the p1-important Important, aka current backlog of things to do label Oct 20, 2023
@dberenbaum dberenbaum added this to DVC Oct 20, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC Oct 20, 2023
@dberenbaum
Contributor

I can also reproduce this with a single remote by changing the remote path. DVC will update the remote name but won't notice that it needs to push again.

It seems like DVC isn't checking whether the version IDs actually exist. I remember discussing this in #8766, but I think @pmrowla rightly saw it as dangerous and we continued to check which versions were available.
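The kind of existence check described here can be sketched as follows. This is a toy model under stated assumptions, not DVC's push logic: `needs_push` is a hypothetical function, and `existing_versions` stands in for whatever listing of object versions the target location provides.

```python
def needs_push(cloud_entry, remote_url, path, existing_versions):
    """Decide whether a file must be pushed to `remote_url`.

    `existing_versions` maps (remote_url, path) to the set of version IDs
    known to exist there (a stand-in for a real listing of the remote).
    """
    version_id = cloud_entry.get("version_id")
    if version_id is None:
        # Nothing recorded yet, so the file has never been pushed.
        return True
    # Push again unless this exact version exists at this exact location.
    return version_id not in existing_versions.get((remote_url, path), set())


DEV = "s3://yury-cloud-versioning-test/test-dev"
PROD = "s3://yury-cloud-versioning-test/test-prod"
entry = {"version_id": "UK3s0VcueuAIttMw7FROG8pRospYWNQI"}

# Only the dev location holds the recorded version.
existing = {(DEV, "test.txt"): {entry["version_id"]}}

print(needs_push(entry, DEV, "test.txt", existing))   # False
print(needs_push(entry, PROD, "test.txt", existing))  # True
```

Keying the lookup by location as well as version ID is what makes a changed remote path trigger a new push in this model.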

@dberenbaum dberenbaum moved this from Backlog to Todo in DVC Nov 14, 2023
@efiop efiop self-assigned this Jan 9, 2024
@dberenbaum dberenbaum moved this from Todo to Backlog in DVC Feb 6, 2024
@dberenbaum dberenbaum removed the p1-important Important, aka current backlog of things to do label Mar 4, 2024
@dberenbaum dberenbaum closed this as not planned Mar 25, 2024
@skshetry skshetry moved this from Backlog to Done in DVC May 15, 2025