Skip to content

Hub handshake failed with error 'WebSocket is not in the OPEN state' SignalR WebSockets Angular client with AutoReconnect() #17613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
paramitamandal opened this issue Dec 5, 2019 · 28 comments · Fixed by #18511
Assignees
Labels
area-signalr Includes: SignalR clients and servers bug This issue describes a behavior which is not expected - a bug.

Comments

@paramitamandal
Copy link

paramitamandal commented Dec 5, 2019

My application was first built on dotnetcore 2.2, then it was upgraded to dotnetcore 3.0 and it was working properly. But WithAutoReconnect() from Angular web app and all the default settings, gives following log:

[2019-12-05T12:10:14.538Z] Information: Reconnect attempt number 3 will start in 10000 ms.
zone.js:3372 OPTIONS http://127.0.0.1:9938/host/negotiate?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b net::ERR_CONNECTION_REFUSED
Utils.js:208 [2019-12-05T12:10:24.540Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:208 [2019-12-05T12:10:24.540Z] Debug: Sending negotiation request: http://127.0.0.1:9938/host/negotiate?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b.
Utils.js:208 [2019-12-05T12:10:24.683Z] Debug: Selecting transport 'WebSockets'.
Utils.js:208 [2019-12-05T12:10:24.683Z] Trace: (WebSockets transport) Connecting.
Utils.js:204 [2019-12-05T12:10:24.723Z] Information: WebSocket connected to ws://127.0.0.1:9938/host?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b&id=KOrhcPrFdlHaeTzcJxZjlA.
Utils.js:208 [2019-12-05T12:10:24.724Z] Debug: The HttpConnection connected successfully.
Utils.js:208 [2019-12-05T12:10:24.724Z] Debug: Sending handshake request.
Utils.js:208 [2019-12-05T12:10:24.743Z] Debug: Hub handshake failed with error 'WebSocket is not in the OPEN state' during start(). Stopping HubConnection.
Utils.js:208 [2019-12-05T12:10:24.744Z] Trace: (WebSockets transport) socket closed.
Utils.js:208 [2019-12-05T12:10:24.744Z] Debug: HttpConnection.stopConnection(undefined) called while in state Disconnecting.
Utils.js:198 [2019-12-05T12:10:24.744Z] Error: Connection disconnected with error 'WebSocket is not in the OPEN state'.
Utils.js:208 [2019-12-05T12:10:24.745Z] Debug: HubConnection.connectionClosed(WebSocket is not in the OPEN state) called while in state Reconnecting.
Utils.js:204 [2019-12-05T12:10:24.745Z] Information: Reconnect attempt failed because of error 'WebSocket is not in the OPEN state'.
Utils.js:204 [2019-12-05T12:10:24.745Z] Information: Reconnect attempt number 4 will start in 30000 ms.
Utils.js:208 [2019-12-05T12:10:54.747Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:208 [2019-12-05T12:10:54.748Z] Debug: Sending negotiation request: http://127.0.0.1:9938/host/negotiate?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b.
Utils.js:208 [2019-12-05T12:10:54.773Z] Debug: Selecting transport 'WebSockets'.
Utils.js:208 [2019-12-05T12:10:54.773Z] Trace: (WebSockets transport) Connecting.
Utils.js:204 [2019-12-05T12:10:54.780Z] Information: WebSocket connected to ws://127.0.0.1:9938/host?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b&id=JEu_f9AcM9fx4OE3SMWEVg.
Utils.js:208 [2019-12-05T12:10:54.780Z] Debug: The HttpConnection connected successfully.
Utils.js:208 [2019-12-05T12:10:54.781Z] Debug: Sending handshake request.
Utils.js:208 [2019-12-05T12:10:54.781Z] Trace: (WebSockets transport) sending data. String data of length 32.
Utils.js:204 [2019-12-05T12:10:54.782Z] Information: Using HubProtocol 'json'.
Utils.js:208 [2019-12-05T12:10:54.829Z] Trace: (WebSockets transport) data received. String data of length 3.
Utils.js:208 [2019-12-05T12:10:54.830Z] Debug: Server handshake complete.
Utils.js:204 [2019-12-05T12:10:54.831Z] Information: HubConnection reconnected successfully.

The server is disconnected and client tries to reconnect. When server goes up and connected; then the following steps occurred:

  1. 3rd retry attempted and though "The HttpConnection connected successfully." but "Hub handshake failed with error 'WebSocket is not in the OPEN state' during start(). Stopping HubConnection" and (WebSockets transport) socket closed. HttpConnection.stopConnection(undefined) called while in state Disconnecting.
    And as result "Error: Connection disconnected with error 'WebSocket is not in the OPEN state'." is logged in the console.
  2. After 30sec 4th retry attempted and It will automatically reconnected.
    These steps are consistency reproducible.

Please help me to get rid of this error in between 3rd and 4th retry attempt.

@javiercn javiercn added the area-signalr Includes: SignalR clients and servers label Dec 5, 2019
@halter73
Copy link
Member

halter73 commented Dec 5, 2019

I think we need better logging in the WebSocketTransport.ts's WebSocket.onclose handler to see exactly what happened here.

I don't see any log of the WebSocket CloseEvent details, though the logs that exist certainly imply the WebSocket closed shortly after it was open during the hub handshake. Given that the websocket closed during the hub handshake, it's not surprising that it took one more reconnect attempt to successfully reestablish the connection, though it would be nice to know why the WebSocket closed to begin whit.

@paramitamandal can you reliably reproduce the handshake error? If so, is there any trick to make the repro reliable? If you could, putting the following log statement at the top of WebSocketTransport.ts's WebSocket.onclose handler might help diagnose the issue.

https://github.com/aspnet/AspNetCore/blob/f5492be35bc5831e3cd7f394e252690c43bd1c7f/src/SignalR/clients/ts/signalr/src/WebSocketTransport.ts#L110

this.logger.log(LogLevel.Trace, `WebSocket closed with status code: ${event.code} (${event.reason}).`);

@paramitamandal
Copy link
Author

ok. thanks @halter73. This error occurred each and every time when client tries to reconnect with the Server after the server goes down. The first attempt always failed due to the handshaking error, and then in 2nd attempt connection established. I will put a log in webSocket.onclose() and let you know the feedback.

@analogrelay
Copy link
Contributor

We've seen some weird behavior with Angular's use of zone.js here, since it replaces the WebSocket class in JS with a different version. We should do an end-to-end angular test to see what we can find.

@analogrelay
Copy link
Contributor

@bradygaster to prove out the end-to-end

@analogrelay analogrelay added this to the 5.0.0-preview1 milestone Dec 11, 2019
@paramitamandal
Copy link
Author

I have already put some log as suggested by @halter73 but that did not add any other error log other than the previous one.
We are getting this error logged everytime as the first attempt of reconnection. @anurse May be due to angular as you said. I will update you if I find anything other than this log for non angular app during reconnection.

@paramitamandal
Copy link
Author

@anurse there is no luck in non angular project also. same error everywhere.

@analogrelay
Copy link
Contributor

Can you tell us more about your hosting environment?

  • What OS are you on?
  • What server are you using (IIS/Kestrel/HttpSysServer)?
  • Do you have any proxies in between the client and the server (Nginx/Apache/HAproxy/etc.)?

Also, can you capture a network trace? We have documentation on how to do this.

It looks like something is causing the socket to be closed. Unfortunately the browser is somewhat unhelpful in these cases and doesn't give us a detailed error message, just "WebSocket is not in the OPEN state".

@imdevjyoti
Copy link

We are using Win 10, 1809 x64. aprart from this we are using Win7, win 8.1, and server2016 and also other versions of Windows 10.
We are using "Kestrel" and No proxies have been implemented.

@imdevjyoti
Copy link

imdevjyoti commented Dec 16, 2019

The network trace: IEnwtrace1 - Copy.txt

@analogrelay
Copy link
Contributor

Hrm, looks like the WebSockets are very short-lived, but there's no real indication why it's being terminated in that trace (Browser dev tools traces have very limited WebSockets support unfortunately). Are you able to try using Fiddler or a tool like Wireshark? (If you're using Wireshark, it's best if you can do the trace over HTTP, and not HTTPS since decrypting the traffic with Wireshark is very difficult).

Can you also collect server-side trace logs? Looking at the client-side logs and the trace you provided, it appears there isn't anything interrupting the connection there.

@analogrelay analogrelay added the Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. label Dec 16, 2019
@imdevjyoti
Copy link

@analogrelay
Copy link
Contributor

If you're using HTTPS you need to configure Fiddler to decrypt HTTPS traffic. All I see in that trace is encrypte TLS traffic and I can't decrypt it :(.

Let me know if you have any more questions on how to assemble the necessary information.

@paramitamandal
Copy link
Author

Here is the traffic captured by fiddler. Please let me know if this helps to investigate further or not.
errorReconnectingIE.txt

@ghost ghost added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Dec 18, 2019
@analogrelay
Copy link
Contributor

analogrelay commented Dec 18, 2019

How about server-side logs? Have you been able to collect those?

@analogrelay
Copy link
Contributor

I'm not able to find anything in the traces. Is it possible for you to share a runnable sample application that reproduces the problem?

@halter73 halter73 added Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. and removed Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. labels Dec 18, 2019
@halter73
Copy link
Member

I took a look at the fiddler trace too. The following request seems to be the reconnect attempt that failed. The trace shows the request only lasted a second and transferred a few bytes. The last WebSocket request happened 30 seconds later and seems to have stayed open until the end of the capture.

GET https://127.0.0.1:9926/host?id=MWpPeHaWrlIvInzIuzFb4w HTTP/1.1
Origin: http://172.26.95.122:8080
Sec-WebSocket-Key: bUO3pf60pngV4wFz//IFiA==
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: 127.0.0.1:9926
Cache-Control: no-cache


HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Date: Wed, 18 Dec 2019 09:57:37 GMT
Server: Kestrel
Upgrade: websocket
Vary: Origin
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: http://172.26.95.122:8080
Sec-WebSocket-Accept: S82skX5AsL8JE/y2xJrswfNf3FE=
EndTime: 15:27:38.009
ReceivedBytes: 4
SentBytes: 6

I agree with Andrew. It would be super helpful to get a runnable sample that demonstrates the issue. Server-side trace logs might also be helpful.

@viktorgullmark
Copy link

viktorgullmark commented Dec 19, 2019

Just wanted to pinch in: We're experiencing the same behaviour in a react app using @microsoft/signalr 3.0.1. Can share the code later on!

@paramitamandal
Copy link
Author

@halter73 @anurse Can you please let me know the specific requirement about the runnable sample application? We just created a connection between dotnetcore web app and an angular web app with @microsoft/signalr 3.0.1 and experiencing the error everytime while reconnecting.

@ghost ghost added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Dec 19, 2019
@analogrelay
Copy link
Contributor

@paramitamandal Ideally a git repo we can clone and run (with some steps you provide) to reproduce the issue. We can certainly try to reproduce this ourselves, but there may be a subtle difference that doesn't become clear until we're comparing code. If it's easy for you to set up a repro project then we don't lose a bunch of time to getting the correct scenario identified.

@halter73 halter73 removed the Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. label Dec 19, 2019
@halter73 halter73 added the Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. label Dec 19, 2019
@paramitamandal
Copy link
Author

will give u the repo link soon.

@ghost ghost added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Dec 20, 2019
@analogrelay analogrelay added Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. and removed Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. labels Dec 20, 2019
@dlimkin
Copy link

dlimkin commented Dec 22, 2019

@halter73 @anurse You can use my repo to reproduce the issue
https://github.com/dlimkin/AspNetCore-issues-17613

@ghost
Copy link

ghost commented Dec 26, 2019

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment. If it is closed, feel free to comment when you are able to provide the additional information and we will re-investigate.

See our Issue Management Policies for more information.

@paramitamandal
Copy link
Author

@halter73 @anurse did u able to look into code which @dlimkin provided?

@ghost ghost added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. Status: No Recent Activity labels Dec 27, 2019
@paramitamandal
Copy link
Author

@halter73 @anurse Any luck??

@analogrelay analogrelay removed this from the 5.0.0-preview1 milestone Jan 6, 2020
@analogrelay
Copy link
Contributor

@paramitamandal We've been a bit delayed due to the holidays but will take a look soon.

@analogrelay analogrelay added this to the 5.0.0-preview1 milestone Jan 6, 2020
@paramitamandal
Copy link
Author

@anurse no issue..

@yentheo
Copy link

yentheo commented Jan 16, 2020

I have the same issue as described above, the first retry fails immediately with the message 'WebSocket is not in the OPEN state'.
I changed my code to the following:

    // build connection with retry policy
    const connection = new HubConnectionBuilder()
      .withUrl(url, options)
      .withAutomaticReconnect(new InlineRetryPolicy(context => {
        // the first retry fails with this message, but it is unclear why, if I retry immediatly after this, the connection succeeds
        if (context.retryReason.message === 'WebSocket is not in the OPEN state') {
          return 0;
        } else {
          observer.next({ state: HubConnectionState.Disconnected });
          // when callback is called, start emitting state changes with the remaining seconds untill retry
          RETRY_STREAM.pipe(map(x => ({ state: HubConnectionState.Reconnecting, seconds: RETRY_IN_SECONDS - x - 1 }))).subscribe(x => observer.next(x));
          // return the amount of milliseconds the signalr library has to wait
          return RETRY_IN_SECONDS * 1000;
        }
      }))
      .build();

So when that is the fail message, I reconnect directly, otherwise I do my normal retry logic, I don't like hacks/fixes like this, but I don't really know where the issue lies

@BrennanConroy BrennanConroy added bug This issue describes a behavior which is not expected - a bug. and removed Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. labels Jan 21, 2020
@BrennanConroy
Copy link
Member

Thanks, we can reproduce the issue and will look into the problem.

@ghost ghost locked as resolved and limited conversation to collaborators Feb 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-signalr Includes: SignalR clients and servers bug This issue describes a behavior which is not expected - a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants