ssh: support scp-like relpaths in urls #4167


Closed
Kuluum opened this issue Jul 4, 2020 · 8 comments
Labels
feature request (Requesting a new feature), help wanted, p3-nice-to-have (It should be done this or next sprint)

Comments

@Kuluum

Kuluum commented Jul 4, 2020

Bug Report

Machine 1 (macos):

dvc version

DVC version: 1.1.7
Python version: 3.7.6
Platform: Darwin-19.5.0-x86_64-i386-64bit
Binary: False
Package: pip
Supported remotes: http, https, ssh
Cache: reflink - supported, hardlink - supported, symlink - supported
Repo: dvc, git

Machine 2 (ubuntu):

dvc version

DVC version: 1.1.2
Python version: 3.8.2
Platform: Linux-5.4.0-39-generic-x86_64-with-glibc2.29
Binary: False
Package: pip
Supported remotes: gdrive, http, https, ssh
Filesystem type (workspace): ('ext4', '/dev/nvme0n1p2')

Same problem on both machines:

dvc push -v

failed to upload '.dvc/cache/ff/a857a0b16c937f60da296bcd3a337e' to 'ssh://user@server/dvc/datasets/ff/a857a0b16c937f60da296bcd3a337e' - unable to create remote directory '/dvc': [Errno 13] Permission denied
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 113, in makedirs
    self.sftp.mkdir(path)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 460, in mkdir
    self._request(CMD_MKDIR, path, attr)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 813, in _request
    return self._read_response(num)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 865, in _read_response
    self._convert_status(msg)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 896, in _convert_status
    raise IOError(errno.EACCES, text)
PermissionError: [Errno 13] Permission denied

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/local.py", line 328, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/base.py", line 431, in upload
    no_progress_bar=no_progress_bar,
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/__init__.py", line 268, in _upload
    no_progress_bar=no_progress_bar,
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 214, in upload
    self.makedirs(posixpath.dirname(dest))
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 109, in makedirs
    self.makedirs(head)
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 109, in makedirs
    self.makedirs(head)
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 120, in makedirs
    ) from exc
dvc.exceptions.DvcException: unable to create remote directory '/dvc'
------------------------------------------------------------

What I found and how I fixed it:

The problem is that in makedirs (remote/ssh/connection.py) the path parameter arrives as /dvc, with a leading /. This path is then passed to paramiko, which fails to create the directory with a permission-denied error. The same thing happens if you try:

$ mkdir /dvc
mkdir: cannot create directory ‘/dvc’: Permission denied
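
To illustrate why the failure surfaces at '/dvc' specifically, here is a minimal sketch of how a recursive makedirs walks up a path until it finds an existing directory. The helper name `dirs_to_create` and the `existing` set are hypothetical and only model the behaviour seen in the traceback; this is not DVC's actual implementation.

```python
import posixpath


def dirs_to_create(path, existing):
    """Return the directories a recursive makedirs would create, top-down.

    `existing` is a set of directories assumed to be present already
    (a stand-in for the stat calls a real SFTP client would make).
    """
    missing = []
    # Walk up the tree until we hit an existing directory or run out of path.
    while path not in existing and path not in ("", "/"):
        missing.append(path)
        path = posixpath.dirname(path)
    return list(reversed(missing))


# Absolute path: the first directory to create sits directly under "/".
print(dirs_to_create("/dvc/datasets/ff", existing={"/"}))
# Relative path: the first directory is created under the login directory.
print(dirs_to_create("dvc/datasets/ff", existing=set()))
```

With an absolute path the first mkdir targets the filesystem root, which an ordinary user usually cannot write to. With a relative path the same directories would be created under the directory the SFTP session starts in.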

So when I do this:

remote/ssh/connection.py

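def makedirs(self, path):
    # Strip the leading "/" so the path is no longer resolved from the
    # filesystem root (typically it is then relative to the user's home).
    if path.startswith('/'):
        path = path[1:]

    # Single stat call will say whether this is a dir, a file or a link
    st_mode = self.st_mode(path)
    ......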

The problem with makedirs is solved!

Then a new one appears:

2020-07-04 13:57:57,288 ERROR: failed to upload '.dvc/cache/1d/7f483312a6e63bf4ebb06cff427a9c' to 'ssh://user@server/dvc/datasets/1d/7f483312a6e63bf4ebb06cff427a9c' - [Errno 2] No such file
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/local.py", line 328, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/base.py", line 431, in upload
    no_progress_bar=no_progress_bar,
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/__init__.py", line 268, in _upload
    no_progress_bar=no_progress_bar,
  File "/usr/local/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 225, in upload
    self.sftp.put(src, tmp_file, callback=pbar.update_to)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 759, in put
    return self.putfo(fl, remotepath, file_size, callback, confirm)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 714, in putfo
    with self.file(remotepath, "wb") as fr:
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 372, in open
    t, msg = self._request(CMD_OPEN, filename, imode, attrblock)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 813, in _request
    return self._read_response(num)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 865, in _read_response
    self._convert_status(msg)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 894, in _convert_status
    raise IOError(errno.ENOENT, text)
FileNotFoundError: [Errno 2] No such file
------------------------------------------------------------

The problem is the same: the remote path starts with / and paramiko can't write to it. So the fix is the same (but in two places):

remote/ssh/connection.py

def upload(self, src, dest, no_progress_bar=False, progress_title=None):
    self.makedirs(posixpath.dirname(dest))
    tmp_file = tmp_fname(dest)

    # FIX
    if tmp_file.startswith('/'):
        tmp_file = tmp_file[1:]
    # ------

    if not progress_title:
        progress_title = posixpath.basename(dest)

    with Tqdm(
        desc=progress_title, disable=no_progress_bar, bytes=True
    ) as pbar:
        self.sftp.put(src, tmp_file, callback=pbar.update_to)

    # FIX
    if dest.startswith('/'):
        dest = dest[1:]
    # ------

    self.sftp.rename(tmp_file, dest)

And now push works well.

P.S.
I think this fix is too straightforward to be high quality, so I can make a pull request either after an OK from you or after comments on how to make it better.

@ghost ghost added the triage Needs to be triaged label Jul 4, 2020
@efiop
Contributor

efiop commented Jul 4, 2020

Hi @Kuluum !

dvc treats the paths in the url as absolute ones, which is why it tries to create /dvc. What you want is relative-path behaviour, where we pass a relpath to sftp and it resolves it against the home directory of the user you are accessing the server as.
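
As a rough illustration of why the path ends up absolute, here is a small sketch using Python's standard `urllib.parse.urlparse`. It is an assumption that DVC's own URL handling behaves similarly; `remote_path` is a hypothetical helper, not a DVC function.

```python
from urllib.parse import urlparse


def remote_path(url):
    """Return the path component of an ssh:// URL as the stdlib parses it."""
    return urlparse(url).path


# The "/" that separates host and path is kept as part of the path,
# so the result always looks like an absolute path.
print(remote_path("ssh://user@server/dvc/datasets"))
print(remote_path("ssh://user@server/home/user/dvc"))
```

In a standard URL there is no slot for a path without the leading slash, which is why a separate syntax would be needed to express a relative one.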

IIRC, utils like scp accept both path and /path after the : in their url. But at the same time, you won't be able to specify the port in that url and will have to use -P for that. If you try to use that syntax right now in dvc, you'll get an error about us not being able to cast a string to int.
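
The cast error can be reproduced with the standard library alone. This sketch assumes the failure comes from generic URL parsing treating the text after the colon as a port; whether DVC hits exactly this code path is not confirmed here, and `try_port` is a hypothetical helper.

```python
from urllib.parse import urlparse


def try_port(url):
    """Return the parsed port, or a description of the error if it is not numeric."""
    try:
        return urlparse(url).port
    except ValueError as exc:
        return f"ValueError: {exc}"


# A numeric value after the colon is parsed as the port.
print(try_port("ssh://user@server:2222/home/user/dvc"))
# An scp-like relative path after the colon is also read as a "port" and fails.
print(try_port("ssh://user@server:dvc/datasets"))
```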

The current workaround is to use an abspath like /home/user/dvc instead of /dvc in your url. Would that work for you for now?

As to a proper solution, we clearly need some in-url way to differentiate an abs path from a relative one and treat them properly. We could make our parsing smarter so it understands :my/path (and :22:my/path).
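
One possible shape for such parsing is sketched below. Everything here is hypothetical: the `SshTarget` type, the `parse_scp_like` function and the rule that a purely numeric segment followed by a second colon is a port are illustrative assumptions, not an agreed design or existing DVC code.

```python
import re
from typing import NamedTuple, Optional


class SshTarget(NamedTuple):
    user: Optional[str]
    host: str
    port: Optional[int]
    path: str
    is_absolute: bool


# [user@]host:[port:]path, where port is digits followed by another colon.
_SCP_LIKE = re.compile(
    r"^(?:(?P<user>[^@/:]+)@)?"
    r"(?P<host>[^@/:]+)"
    r":(?:(?P<port>\d+):)?"
    r"(?P<path>.+)$"
)


def parse_scp_like(target):
    """Parse an scp-like target such as 'user@server:my/path'."""
    if "://" in target:
        raise ValueError(f"expected scp-like syntax, got a URL: {target!r}")
    match = _SCP_LIKE.match(target)
    if match is None:
        raise ValueError(f"not an scp-like target: {target!r}")
    port = match.group("port")
    path = match.group("path")
    return SshTarget(
        user=match.group("user"),
        host=match.group("host"),
        port=int(port) if port else None,
        path=path,
        # A leading "/" means absolute; anything else is relative to the
        # directory the SSH session starts in (usually the user's home).
        is_absolute=path.startswith("/"),
    )


print(parse_scp_like("user@server:my/path"))
print(parse_scp_like("user@server:22:my/path"))
print(parse_scp_like("user@server:/dvc/datasets"))
```

Note that this rule is ambiguous for a relative path whose first component is all digits and contains a colon, so a real implementation would need to decide how to handle that case.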

@efiop efiop changed the title from "ssh push [Errno 13] Permission denied" to "ssh: support scp-like relpaths in urls" Jul 4, 2020
@efiop efiop added the feature request Requesting a new feature label Jul 4, 2020
@ghost ghost removed the triage Needs to be triaged label Jul 4, 2020
@efiop efiop added help wanted p3-nice-to-have It should be done this or next sprint labels Jul 4, 2020
@Kuluum
Author

Kuluum commented Jul 5, 2020

@efiop Thanks! With the absolute path, everything works well. I also see that using an absolute path is described in the documentation, but I didn't notice it. For some reason, I used to expect the path '/dvc/datasets' to be relative to the ssh connect folder. My user has no permission to write to the root folder, and that's the real reason why it can't create these folders.

@efiop
Contributor

efiop commented Jul 5, 2020

I used to expect the path '/dvc/datasets' to be relative to the ssh connect folder

Do you remember any examples like that? I'm having trouble remembering anything like that, but maybe you used to use some sftp servers that run in a user home dir? Usually sftp servers are started with their root at the real fs root /, but sometimes they are configured to use a different root dir. I've seen that happen (we even have some issues where people were confused by it). So maybe you could consider configuring your sftp server to behave like that too? Though that would be more confusing than just using abs paths, in my opinion.

So you are using abs path (e.g. /home/user/dvc) for now as a workaround, right? Just double checking that you are not blocked by this anymore 🙂

@Kuluum
Author

Kuluum commented Jul 6, 2020

Using the abs path as a workaround works. I'm not blocked anymore. 👍

As for sftp and the root dir... The server admins created a user for me on a shared server where some other colleagues have users too, so I only have permissions in my user's home folder ☺️ (I'm not very good at server things, so I don't know the details of sftp and other stuff)

@efiop
Contributor

efiop commented Jul 10, 2020

For the record: it seems we have a very similar problem with hdfs, where we always pass /path to pyarrow. It actually supports path too, but uses what seems to be /user/efiop as the root for that. Will need to take a closer look at it as well.

@shcheklein
Member

I also got confused by this, and it took me a while to realize what was happening. I think we should at least document this properly /cc @jorgeorpinel .

@jorgeorpinel
Contributor

jorgeorpinel commented Aug 1, 2020

Not 100% sure I got the issue, workaround, and what's missing from docs, but here's a small PR for you guys to check please: iterative/dvc.org#1649

@efiop
Contributor

efiop commented Dec 8, 2023

Closing as stale

@efiop efiop closed this as not planned Dec 8, 2023