cloud versioning: multiple remotes/metadata format #8356

Closed
@dberenbaum

Part of #7995

Description

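Cloud versioning does not support multiple remotes: pushing to a second remote overwrites the stored version_ids, and a per-output remote: setting is ignored in favor of the default remote.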

Reproduction

You can reproduce with the script below. Replace BUCKET, REMOTE_1, and REMOTE_2 with your own S3 bucket and prefixes, and set AWS_PROFILE to a profile with valid credentials.

BUCKET=mybucket
REMOTE_1=remote1
REMOTE_2=remote2

export AWS_PROFILE=iterative-sandbox

echo "Get repo."
rm -rf repo
git init repo
cd repo
dvc init

echo "Add two cloud-versioned DVC remotes."
dvc remote add -d cloud-1 s3://$BUCKET/$REMOTE_1
dvc remote modify cloud-1 version_aware true
dvc remote modify cloud-1 worktree true
dvc remote add cloud-2 s3://$BUCKET/$REMOTE_2
dvc remote modify cloud-2 version_aware true
dvc remote modify cloud-2 worktree true
git add .
git commit -m "initialized repo"

echo "Add data"
mkdir data
echo image1 > data/image1.png
echo image2 > data/image2.png
echo model > model.h5
dvc add data
dvc add model.h5
git add .
git commit -m "add data"

echo "Push data to default remote"
dvc push
git --no-pager diff
git commit -am "push data to default remote"

echo "Push data to other remote"
dvc push -r cloud-2
git --no-pager diff # Problem 1: overwrites all version_ids
git commit -am "push data to other remote"

echo "Push to different remotes per output"
echo "  remote: cloud-2" >> model.h5.dvc
dvc push
git --no-pager diff # Problem 2: pushes model.h5 to the default remote

echo "See model.h5 versions on remote 1"
aws s3api list-object-versions --bucket $BUCKET --prefix $REMOTE_1/model.h5
echo "See model.h5 versions on remote 2"
aws s3api list-object-versions --bucket $BUCKET --prefix $REMOTE_2/model.h5

Expected

  1. DVC should keep track of the remote for each version_id, so that pushing to a different remote appends a version_id for that remote instead of overwriting the existing one.
  2. When a .dvc file sets a per-output remote with the remote: field, DVC should push that output to the specified remote instead of the default.
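
The two expectations above can be sketched in a few lines of Python. This is only an illustration of the requested behavior, not DVC's implementation: the `cloud` key, its per-remote layout, and the helper names `record_push` and `push_target` are all hypothetical, and the version_id strings are placeholders.

```python
from typing import Any, Dict

# Hypothetical in-memory form of one output entry from a .dvc file.
# The "cloud" key and its per-remote layout are assumptions made for
# illustration; they are not DVC's actual metadata format.
Output = Dict[str, Any]


def record_push(out: Output, remote: str, version_id: str) -> Output:
    """Return a copy of `out` with `version_id` stored under `remote`.

    Entries already recorded for other remotes are kept, so pushing to a
    second remote appends instead of overwriting (expectation 1).
    """
    cloud = dict(out.get("cloud", {}))
    cloud[remote] = {"version_id": version_id}
    return {**out, "cloud": cloud}


def push_target(out: Output, default_remote: str) -> str:
    """Choose the remote to push to (expectation 2).

    An output-level `remote` field takes precedence over the default.
    """
    return str(out.get("remote") or default_remote)


# Placeholder version ids; real ones would come from the cloud provider.
out: Output = {"path": "model.h5"}
out = record_push(out, "cloud-1", "v1-example")
out = record_push(out, "cloud-2", "v2-example")
```

With this layout, `out["cloud"]` holds one version_id per remote after both pushes, and `push_target({**out, "remote": "cloud-2"}, "cloud-1")` selects `cloud-2` rather than the default.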
