remote: separate cloud remote and cloud cache classes #4019

Merged
merged 11 commits into from
Jun 12, 2020

Contributor

@pmrowla pmrowla commented Jun 11, 2020

  • ❗ I have followed the Contributing to DVC checklist.

  • 📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here. If the CLI API is changed, I have updated tab completion scripts.

  • ❌ I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Related to #3882

  • Supported cloud types now have distinct remote and cache classes (i.e. S3Remote, S3Cache, LocalRemote, LocalCache, etc)
  • All get_checksum related functions are now moved into remote trees
  • cache.save() now takes an explicit tree parameter
    • When tree is remote.tree, save is done via move() + link()
    • For all other trees, save is done by copying object from tree into remote.tree
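
The two save paths described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `Tree`, `Cache`, the in-memory `files` dict, and the `cache/<checksum>` layout are all hypothetical stand-ins, and `link()` here only imitates a hardlink/symlink/reflink by sharing content between two paths.

```python
class Tree:
    """Hypothetical in-memory tree mapping path -> bytes (assumption, not DVC's tree API)."""

    def __init__(self):
        self.files = {}

    def move(self, src, dst):
        # Relocate the object; the source path no longer exists afterwards.
        self.files[dst] = self.files.pop(src)

    def link(self, src, dst):
        # Stand-in for a filesystem link: both paths now expose the same content.
        self.files[dst] = self.files[src]


class Cache:
    """Hypothetical cache that owns a tree and saves objects keyed by checksum."""

    def __init__(self, tree):
        self.tree = tree

    def save(self, path, checksum, tree):
        cache_path = f"cache/{checksum}"
        if tree is self.tree:
            # Same tree: move the object into the cache, then link it back.
            self.tree.move(path, cache_path)
            self.tree.link(cache_path, path)
        else:
            # Any other tree: copy the object from that tree into the cache's tree.
            self.tree.files[cache_path] = tree.files[path]
        return cache_path
```

In this sketch the caller always states where the object currently lives, so the cache never has to guess whether a cheap move + link is possible or a cross-tree copy is required.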

@property
def state(self):
    return self.repo.state

def get(self, md5):
Contributor Author

Is this function needed? It looks like it was left over from some old behavior, and it is local-only (not found in BaseRemote). It's still being tested in tests/func/test_data_cloud, but I couldn't find anywhere else in DVC where it's actually used.

pass


class CacheMixin:
Contributor Author

Cloud cache is technically not an extension to cloud remote, since we define remote as the cloud push/pull mirror for a local .dvc MD5 checksum-based cache. However, cloud caches still need to implement the remote methods for size estimation and garbage collection (so that we can display progress during gc).

Originally, I had separate BaseCloud, RemoteMixin, and CacheMixin classes, but it seemed strange to have it like that if every cache class was always going to include the RemoteMixin methods. So for now all the cloud cache classes are implemented as an extension of a cloud remote class + the cache mixin.
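
The composition described above might look something like the sketch below. Only the class names `S3Remote`, `S3Cache`, and `CacheMixin` come from this PR; the base-class order, the `objects` dict, and the methods `estimated_size()`, `gc()`, and `is_cached()` are assumptions made up for illustration and are not DVC's actual API.

```python
class BaseRemote:
    """Stand-in for remote behaviour that cloud caches also need (size estimation, gc)."""

    def __init__(self, objects):
        # Hypothetical storage model: checksum -> object size in bytes.
        self.objects = dict(objects)

    def estimated_size(self):
        # Lets a caller show progress totals before a long operation.
        return sum(self.objects.values())

    def gc(self, used):
        # Remove every object whose checksum is not in the `used` set.
        removed = [checksum for checksum in self.objects if checksum not in used]
        for checksum in removed:
            del self.objects[checksum]
        return removed


class S3Remote(BaseRemote):
    scheme = "s3"


class CacheMixin:
    """Cache-only behaviour, kept separate from the remote classes."""

    def is_cached(self, checksum):
        return checksum in self.objects


class S3Cache(S3Remote, CacheMixin):
    """A cloud cache: the cloud remote class plus the cache mixin."""
```

With this layout a plain `S3Remote` never gains the cache-only methods, while `S3Cache` still inherits the size estimation and garbage collection it needs.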

@pmrowla pmrowla force-pushed the separate-remote-cache branch from d51da92 to 2a690a4 Compare June 12, 2020 08:13
Contributor Author

pmrowla commented Jun 12, 2020

Will need rebase after #3991 is merged to resolve test_update_import_after_remote_updates_to_dvc test failure (due to a State issue)

@pmrowla pmrowla marked this pull request as ready for review June 12, 2020 09:01
@pmrowla pmrowla force-pushed the separate-remote-cache branch from 2a690a4 to f3a6a1e Compare June 12, 2020 10:20
@@ -26,21 +35,30 @@
]


-def _get(remote_conf):
-    for remote in REMOTES:
+def _get(remote_conf, remotes, default):
Contributor Author

@efiop I think cleaning this up to use tree.supported(...) will also require moving scheme and the remaining constants into the tree classes like we discussed. And then once we have single Remote and Cache classes that take a tree as a parameter, we'll be able to get rid of these helper methods.

I think it would be best to leave this as-is for now, and then handle it in the next PR

@pmrowla pmrowla changed the title [WIP] remote: separate cloud remote and cloud cache classes remote: separate cloud remote and cloud cache classes Jun 12, 2020
@pmrowla pmrowla requested a review from efiop June 12, 2020 11:39
@efiop efiop merged commit 9ead641 into iterative:master Jun 12, 2020
@pmrowla pmrowla deleted the separate-remote-cache branch July 13, 2020 09:12