migrate use of upload_fobj to use transfer #6558

skshetry · 2021-09-08T09:55:01Z

Migrate away from upload_fobj to fs.utils.transfer
Also extend fs.utils.transfer to support file object upload and bytes upload.

Prerequisite for merging fs.upload and fs.upload_fobj together. fs.upload_fobj won't be able to support callback.get_size, because fs.upload has no src_fs and src_info to query for that information. Using fs.utils.transfer gives us consistent callback support instead.

Also a prerequisite for #6546.

skshetry · 2021-09-08T09:55:31Z

dvc/fs/utils.py

 ) -> None:
    use_move = isinstance(from_fs, type(to_fs)) and move
    try:
+        if content:


This is a bit overloaded, but going with it for now.

skshetry · 2021-09-08T10:04:42Z

dvc/fs/utils.py

+            else:
+                fobj = content
+                size = from_fs.getsize(from_info)
+            return to_fs.upload_fobj(fobj, to_info, size=size)


fs.utils.transfer is the only place where we use upload_fobj now.
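
To make the dispatch in these review comments concrete, here is a minimal sketch of how a transfer-style helper might accept raw bytes, an open file object, or only a source path, and resolve the size before delegating to upload_fobj. `ToyFS`, its methods, and the exact handling of `content` are assumptions made for illustration; this is not the actual dvc/fs/utils.py code.

```python
import io
from typing import BinaryIO, Dict, Optional, Union


class ToyFS:
    """Hypothetical dict-backed filesystem, standing in for a DVC filesystem."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.last_size: Optional[int] = None  # size passed to the last upload

    def getsize(self, path: str) -> int:
        return len(self.files[path])

    def open(self, path: str) -> BinaryIO:
        return io.BytesIO(self.files[path])

    def upload_fobj(
        self, fobj: BinaryIO, to_path: str, size: Optional[int] = None
    ) -> None:
        self.last_size = size
        self.files[to_path] = fobj.read()


def transfer(
    from_fs: ToyFS,
    from_path: str,
    to_fs: ToyFS,
    to_path: str,
    content: Union[bytes, BinaryIO, None] = None,
) -> None:
    """Upload `content` if given, else the file at `from_path`, with a known size."""
    if isinstance(content, bytes):
        # Raw bytes: the size is known without touching the source filesystem.
        fobj, size = io.BytesIO(content), len(content)
    elif content is not None:
        # Open file object: ask the source filesystem for the size.
        fobj, size = content, from_fs.getsize(from_path)
    else:
        # No content given: open the source file ourselves.
        fobj, size = from_fs.open(from_path), from_fs.getsize(from_path)
    to_fs.upload_fobj(fobj, to_path, size=size)
```

The sketch checks `content is not None` rather than truthiness, so that empty bytes still count as content.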

skshetry · 2021-09-08T13:33:25Z

dvc/objects/db/reference.py

+        fs.utils.transfer(
+            from_fs, from_info, self.fs, to_info, move=move, content=content
+        )


@pmrowla, what do you think of this? Does transfer fit here, or should it use a more primitive function like pipe_file?
(Although I'd like to use as little of the API as possible.)

Thinking about it again, I have doubts about this; I only did it to simplify the exception handling. 😅

pmrowla · 2021-09-09

The upload_fobj for reference objects isn't really a file transfer; it's just a write() of the serialized reference data from memory.

It seems to me like we should still have a distinction between a file transfer and a direct binary data write/upload, since in this case there really isn't a from_fs and from_info that we are transferring.

One alternative might be adding something like ReferenceHashFile.to_memfile(), which returns a memfs://some_tmp_pathname path to a MemoryFileSystem entry containing the serialized data. Then we could do an actual "file transfer" using the memfs and memfs_path_info, without needing the content field?

(but this seems a bit more convoluted than just having an fs.utils.write method)
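
The two options weighed above can be sketched side by side. In this rough illustration, `ToyMemFS`, `ToyReference`, and the `transfer` and `write` helpers are all hypothetical stand-ins (a plain dict replaces a real MemoryFileSystem); it only shows the shape of each approach, not DVC's API.

```python
import uuid
from typing import Dict


class ToyMemFS:
    """Hypothetical in-memory filesystem (a stand-in for a MemoryFileSystem)."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def getsize(self, path: str) -> int:
        return len(self.files[path])


class ToyReference:
    """Hypothetical stand-in for a reference object holding serialized data."""

    def __init__(self, payload: bytes) -> None:
        self.payload = payload

    def to_memfile(self, memfs: ToyMemFS) -> str:
        # Store the serialized bytes under a temporary in-memory path.
        path = f"memfs://{uuid.uuid4().hex}"
        memfs.files[path] = self.payload
        return path


def transfer(from_fs: ToyMemFS, from_path: str, to_fs: ToyMemFS, to_path: str) -> int:
    """Plain file-to-file copy; returns the size taken from the source fs."""
    size = from_fs.getsize(from_path)
    to_fs.files[to_path] = from_fs.files[from_path]
    return size


def write(to_fs: ToyMemFS, to_path: str, data: bytes) -> int:
    """Direct write of in-memory data; no source filesystem is involved."""
    to_fs.files[to_path] = data
    return len(data)
```

With `to_memfile`, the caller does `transfer(memfs, ref.to_memfile(memfs), remote, path)`; with a dedicated write helper it is simply `write(remote, path, ref.payload)`, which is the shorter route the comment alludes to.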

skshetry · 2021-09-09

Yeah, initially I started with two methods, but in the end fs.utils.write is going to be equivalent to upload_fobj, so I don't see much difference there. With transfer, I tried to push the problem of getting the size for callback support into transfer itself, rather than putting that burden on the caller.

We do the following when transferring files straight to the remote, which is just a file transfer in the end:

    with fs.open(path_info, mode="rb", chunk_size=fs.CHUNK_SIZE) as stream:
        stream = HashedStreamReader(stream)
        upload_odb.fs.upload_fobj(
            stream, tmp_info, desc=path_info.name, size=fs.getsize(path_info)
        )

But maybe the best thing to do here is put that burden on the caller itself, rather than trying to generalize it in transfer.
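
As a rough sketch of what "putting the burden on the caller" could look like: the caller, which knows where the data comes from, sets the total size on the callback before handing a file object to the upload routine. `ToyCallback` and this standalone `upload_fobj` are hypothetical and only mirror the names used in the discussion.

```python
import io
from typing import BinaryIO, Optional


class ToyCallback:
    """Hypothetical progress callback with a total size and a running count."""

    def __init__(self) -> None:
        self.size: Optional[int] = None
        self.value = 0

    def set_size(self, size: int) -> None:
        self.size = size

    def relative_update(self, inc: int) -> None:
        self.value += inc


def upload_fobj(
    fobj: BinaryIO, sink: io.BytesIO, callback: ToyCallback, chunk_size: int = 4
) -> None:
    """Copy in chunks and report progress; it never computes the total itself."""
    while True:
        chunk = fobj.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        callback.relative_update(len(chunk))


# The caller knows the source of the data, so it sets the total up front.
data = b"0123456789"
callback = ToyCallback()
callback.set_size(len(data))
sink = io.BytesIO()
upload_fobj(io.BytesIO(data), sink, callback)
```

Here the upload routine stays agnostic about the source: it only reports increments, and the total is whatever the caller declared.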

skshetry · 2021-09-09T04:47:56Z

Closing this PR as we'll likely put the burden of doing callback.set_size on the caller itself.

See #6558 (comment).

skshetry added the "refactoring" (Factoring and re-factoring) label Sep 8, 2021

skshetry requested review from pmrowla, efiop and isidentical September 8, 2021 09:55

skshetry self-assigned this Sep 8, 2021

skshetry requested a review from a team as a code owner September 8, 2021 09:55

isidentical approved these changes Sep 8, 2021

skshetry force-pushed the migrate-upload-fobj-to-transfer branch from 8a9551a to 959c16a Compare September 8, 2021 10:42

skshetry force-pushed the migrate-upload-fobj-to-transfer branch from f0d4573 to 959c16a Compare September 8, 2021 12:50

efiop approved these changes Sep 8, 2021

skshetry force-pushed the migrate-upload-fobj-to-transfer branch from 959c16a to b8e5e50 Compare September 8, 2021 13:35

migrate use of upload_fobj directly to use transfer

03fa83a

Also extend `fs.utils.transfer` to support file object upload and bytes upload.

skshetry force-pushed the migrate-upload-fobj-to-transfer branch from b8e5e50 to 03fa83a Compare September 8, 2021 13:42

skshetry closed this Sep 9, 2021

skshetry deleted the migrate-upload-fobj-to-transfer branch September 9, 2021 04:48

skshetry mentioned this pull request Sep 9, 2021

fs: merge upload_fobj and upload #6570

Merged

2 tasks
