-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Secsh channel 10 open FAILED: open failed: Connect failed #3482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@brorfred, can you paste the dvc push -j 1 |
Thanks for looking into this! Running the following command:
Generates the following output:
And the log file: |
@brorfred, My google-fu returned this answer from Unix StackExchange: https://unix.stackexchange.com/a/459051 It basically says that this happens if you get over resource limit, eg: no. of open file descriptors. And, please respond if it solved the issue as it will help other users as well. |
I'm starting to realize that the error probably is some annoying kitchen sink response that can be pretty much anything. Most of my resource limits are set to infinite: ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256902
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 20000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 256902
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited |
The strange thing is that scp works fine: touch test
scp test [email protected]:/media/sda2/dvc/first_day/
test 100% 0 0.0KB/s 00:00 |
@brorfred, looks like the issue is on server side. Closing it for now, feel free to re-open if you find that the issue is on |
It seems like the problem is in paramico and particularly in paramiko.ssh_exception.ChannelException. The class SSHConnection in dvc.remote.ssh.connection assumes that the exception fails quietly in open_max_sftp_channels but paramiko logs a warning standard error. Not sure if this is something specific with our setup but I get the same behavior on my Mac laptop... from dvc.remote.ssh import connection
ssh = connection.SSHConnection(host="zotac.bror.us", username="bror", port=22)
ssh.open_max_sftp_channels()
Secsh channel 10 open FAILED: open failed: Connect failed
Out[122]:
[<paramiko.sftp_client.SFTPClient at 0x7fe1321fac50>,
<paramiko.sftp_client.SFTPClient at 0x7fe1321be150>,
<paramiko.sftp_client.SFTPClient at 0x7fe132239990>,
<paramiko.sftp_client.SFTPClient at 0x7fe132202950>,
<paramiko.sftp_client.SFTPClient at 0x7fe1321fa050>,
<paramiko.sftp_client.SFTPClient at 0x7fe13219ed50>,
<paramiko.sftp_client.SFTPClient at 0x7fe13219df90>,
<paramiko.sftp_client.SFTPClient at 0x7fe137584850>,
<paramiko.sftp_client.SFTPClient at 0x7fe13218f990>,
<paramiko.sftp_client.SFTPClient at 0x7fe13220a4d0>]
#Redo the command
ssh.open_max_sftp_channels()
Out[123]:
[<paramiko.sftp_client.SFTPClient at 0x7fe1321fac50>,
<paramiko.sftp_client.SFTPClient at 0x7fe1321be150>,
<paramiko.sftp_client.SFTPClient at 0x7fe132239990>,
<paramiko.sftp_client.SFTPClient at 0x7fe132202950>,
<paramiko.sftp_client.SFTPClient at 0x7fe1321fa050>,
<paramiko.sftp_client.SFTPClient at 0x7fe13219ed50>,
<paramiko.sftp_client.SFTPClient at 0x7fe13219df90>,
<paramiko.sftp_client.SFTPClient at 0x7fe137584850>,
<paramiko.sftp_client.SFTPClient at 0x7fe13218f990>,
<paramiko.sftp_client.SFTPClient at 0x7fe13220a4d0>] |
Yes, the text is sent to a logger object in the paramiko instance in transport.py line 2700. Would it be possible for dvc to replace that logger with your own? |
Pretty strange that that log message even comes up like that. Dvc disables all other loggers in CLI https://github.com/iterative/dvc/blob/0.88.0/dvc/main.py#L37 , so that error shouldn't really show up at all. Maybe paramiko sets the logger back up during the runtime, causing its errors to show up again. 🤔 |
We load paramiko lazily, so, I guess, the logger is not active at the point when we try to disable it. Probably, we need to bring back the logger dict configs that were before for paramiko. |
I found a few discussions regarding this specific log message when it was implemented: #2131 (comment) Of course, we need to silent the log, but also need to see why the message is there. |
The message in itself is benign I think. You poll paramiko for max number of channels by open them one by one until max number is reached and paramiko raises an exception. I still don't understand why I'm the only one that get's the warning message though. I've tried two different setups (my office linux machine and my Mac laptop) and both generates it. Could it be because I've installed dvc in a conda environment? I tried to create a minimal environment with only pip paramiko and dvc installed and got the same warnings... |
@brorfred Not sure, maybe it is a server-related thingy. Still, the fact that paramiko messages sip through the cracks is not great, we need to disable it when we import it or maybe adjust logging configs somehow. |
Our After reverting to v0.41.3, which is before #2131 was merged, we could now use |
@hg-zt |
When doing
Details: |
@gurunath-p Thanks for the report! A workaround is to run But since we are getting more reports it makes me think whether some of our recent patches have made dvc go over the reasonable limit. CC @pmrowla |
I'll look into this, although out of the recent patches, I don't think any of the relevant changes should have affected ssh remotes, since RemoteSSH still has its own overridden |
Retried for a few more times and the rest was uploaded. I also found when running |
I think the issue is just that we are not suppressing the paramiko logger. We hit the channel limit when determining the max number of channels and trigger the message, even though it's expected and not an error for DVC. |
Please provide information about your setup
DVC version(i.e.
dvc --version
), Platform and method of installation (pip, homebrew, pkg Mac, exe (Windows), DEB(Linux), RPM(Linux))version 0.88.0
installed via pip on Linux and Mac.
Syncing to a Ubuntu host via ssh.
Everything seems to be working but I get a cryptic warning message every time I do anything. For example:
dvc push
0% Querying cache in ssh://[email protected]/media/sda2/dvc/first_day| |0/42 Secsh channel 10 open FAILED: open failed: Connect failed
Secsh channel 10 open FAILED: open failed: Connect failed
Secsh channel 10 open FAILED: open failed: Connect failed
2% /media/sda2/dvc/first_day/4e/4b31f0c5784a2e185d88a3120cac19| |1/42 [00:02<0Secsh channel 10 open FAILED: open failed: Connect failed
Everything is up to date.
This is probably an edge case due to my setup but I'm not sure how to quiet the message or resolve the issue.
The text was updated successfully, but these errors were encountered: