OpenSSH for Windows often hangs if no data sent over the connection #1338
Comments
Unable to reproduce this on a connection with ~35ms latency.
I've attached a dump of a deadlocked OpenSSH 7.9.
Topically, it appears that it might be waiting for the asynchronous write of "Connection to 172.217.225.62 closed." to stderr to finish. Can you also post the output when "-vvv" is set (assuming that passing that doesn't fix the issue 100% of the time)?
@EricSL If you're willing to help debug (and presuming that a verbose output can't be produced), I'd like to provide you a private version with some additional debug variables to trace where things could be going wrong.
"-vvv" reliably fixes the problem. Send me a patch and I'll make a custom build.
@EricSL I was finally able to reproduce this on a low-latency connection, but the problem is quite mysterious. It almost appears that after one part of the code calls TerminateThread(), no new threads can be spawned. I'm still probing since I'm not familiar with this part of the code, but hopefully I will have an update in a few days.
- Replaced TerminateThread() call with an interrupt routine to gracefully call _endthreadex(0). - Resolves PowerShell/Win32-OpenSSH#1338.
@EricSL Please try these changes: PowerShell/openssh-portable@latestw_all...NoMoreFood:hanging_issue
If interested, you can test using these binaries: https://github.com/NoMoreFood/openssh-portable/releases/tag/v7.9-merge-1
@NoMoreFood Your patch fixes the issue for me, thanks!
@NoMoreFood
@pakona Unfortunately, I cannot speak for the release timeline nor whether this patch will ultimately be accepted. I get the sense the maintainers are preoccupied with other tasks at the moment, so I can only hope there is a surge of attention on this fork at some point.
@bingbing8 any chance we could get some idea of when this patch could be accepted?
This problem is reproducing for me again, even with the patch. What I'm seeing now: Main thread:
It appears to be waiting for WaitForSingleObject to return, in your patched code.
Worker Thread:
It is waiting for ReadFile to return, same pio, nBytesReturned = 0
Okay, apparently the problem with the patched syncio_close() is that I am not running on Win7 and in_raw_mode is 0, so it never calls CancelSynchronousIo. If I change the condition to just
I'd have to defer to @manojampalam since I didn't investigate the Windows 7 / raw mode circumstances that led to this conditional in the first place.
Troubleshooting steps
https://github.com/PowerShell/Win32-OpenSSH/wiki/Troubleshooting-Steps
Terminal issue? please go through wiki
https://github.com/PowerShell/Win32-OpenSSH/wiki/TTY-PTY-support-in-Windows-OpenSSH
Please answer the following
"OpenSSH for Windows" version
7.7p1, 7.9p1, much less common on 7.6p1 but I've seen it on that version too.
Server OperatingSystem
Debian GNU/Linux 9.4 (stretch)
Client OperatingSystem
Windows 10
What is failing
Possible duplicate of #1334. Also filed at https://bugzilla.mindrot.org/show_bug.cgi?id=2964
This was reproduced on both 7.7p1 and 7.9p1, while ssh-ing from a Windows machine to a Linux machine. I have also reproduced it on 7.6p1 but it happens much more rarely. You may have to run a command several times in a row to get it to fail.
For a simple repro I'm using the bash commands
echo ""
(should output a single newline) and
echo -n ""
(should output nothing and immediately exit successfully; I've also used /bin/true, which is equivalent).
Reliably returns right away as expected.
Reliably returns right away as expected.
Seems to hang, but if you press a key it returns.
Sometimes it works, sometimes it hangs. When it hangs it is unresponsive to input, including ^C, and you need to kill it in Task Manager. A Wireshark trace shows that the server sent the TCP FIN packet, but the client is still holding the connection open.
Turning on verbose output seems to make it work reliably.
I have also reproduced this with the command
sleep .001
Changing to .01 makes it reproduce less frequently, and 1 second works reliably. So it seems to be a race condition involving a very short connection with no data sent.
Not sure to what extent network latency affects this, but my ping time is 11ms, so you may need something similarly distant.
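Since the hang is intermittent, a small harness that repeats the command and counts timeouts may make it easier to compare versions or patches. The following is only a sketch and was not part of the original report: the function name count_hangs, the run count, and the 10 second timeout are all assumptions. It leaves stdin, stdout and stderr attached to the console on purpose, because redirecting them might change the timing of the race.

```python
import subprocess


def count_hangs(cmd, runs=20, timeout_s=10.0):
    """Run cmd `runs` times and return how many runs exceeded timeout_s.

    Hypothetical helper: a run counts as a hang when the child process
    has not exited after timeout_s seconds. subprocess.run() kills the
    child itself before raising TimeoutExpired, so no manual cleanup
    is needed between runs.
    """
    hangs = 0
    for _ in range(runs):
        try:
            # Standard streams are inherited, not redirected, to stay
            # as close as possible to an interactive invocation.
            subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs
```

For example, count_hangs(["ssh", "-tt", "user@host", "sleep .001"]) would report how many of 20 attempts failed to return within 10 seconds, where user@host is a placeholder for a real server. A run that only returns after a key press also counts as a hang here, as long as nobody touches the keyboard.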
Expected output
All of these commands should return right away.
Actual output
Some of these commands hang, either waiting for input (if -tt is not specified) or ssh becomes unresponsive (if -tt is specified).