-
Notifications
You must be signed in to change notification settings - Fork 10.3k
WebSocket does not honor ShutDownTimeout once SIGTERM/Ctrl+C is issued #26482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@davidfowl curious how signalr handles this. Since upgrading to .net 3.1 this is affecting our service availability. |
I'm guessing we didn't plumb the call to public static IHostBuilder CreateHostBuilder(string[] args) =>
Host.CreateDefaultBuilder(args)
.ConfigureServices(services =>
{
services.Configure<HostOptions>(o => o.ShutdownTimeout = TimeSpan.FromSeconds(30));
})
.ConfigureWebHostDefaults(webBuilder =>
{
webBuilder.UseStartup<Startup>();
}); |
We hit a different exception doing On net5.0, as soon as SIGTERM is issued, the immediate WebSocket read fails with a cancellation exception.
On netcore3.1, all reads after SIGTERM keep failing with a slightly different cancelled exception.
|
I can reproduce. |
It looks like we introduced this WebSocket shutdown bug when we pipeified our request bodies. For normal HTTP/1.1 Content-Length and chunked request bodies, we have logic that ignores canceled ReadResults from the connection-level PipeReader and re-reads internally, if aspnetcore/src/Servers/Kestrel/Core/src/Internal/Http/Http1ContentLengthMessageBody.cs Lines 103 to 108 in acf1dc4
This is important, because the connection-level PipeReader will return canceled ReadResults at the start of server shutdown. This is because Kestrel's graceful shutdown logic calls CancelPendingRead() on very connection-level PipeReader to wake up idle connections' request processing loops so they can exit gracefully if they are between requests. If the connection is in the middle of the request, which is the case with an active WebSocket from Kestrel's perspective, the canceled ReadResult should just be ignored like it is for normal request bodies. Kestrel will still close the connection gracefully if the middleware exits before the ShutdownTimeout. Unfortunately, the Http1UpgradeMessageBody that backs WebSockets does not have the same reread-if-the-ReadResult-was-canceled-by-something-other-than-the-application logic the normal request bodies do. aspnetcore/src/Servers/Kestrel/Core/src/Internal/Http/Http1UpgradeMessageBody.cs Lines 26 to 30 in acf1dc4
This in turn causes HttpRequestStream to throw an OperationCanceledException: aspnetcore/src/Servers/Kestrel/Core/src/Internal/Http/HttpRequestStream.cs Lines 120 to 123 in acf1dc4
But since PipeReader.CancelPendingRead() only causes the current or next call to PipeReader.ReadAsync() to return a canceled ReadResult and not any subsequent reads, this only gets thrown once. It is unusual to try to continue reading after observing an exception, but this is still a bug. It might make sense for HttpRequestStream.ReadAsync() to throw an even scarier exception since it should never observe a canceled ReadResult unless someone called RequestBodyPipe.CancelPendingRead() while also reading from the request Body Stream which would be very odd. |
@Pilchie This is something we want to patch in 3.1 (and 5.0 and 6.0) |
I put it in the |
The impact to customers is that WebSocket.ReadAsync() throws an OperationCanceledException as soon as Kestrel starts shutting down rather than throwing after HostOptions.ShutdownTimeout (which defaults to 5 seconds) expires like a normal HTTP request body does. We think this does not cause a problem for SignalR because SignalR registers with the ApplicationStopping token and immediately closes its connections once the token fires. For some apps, closing the WebSocket immediately after ApplicationStopping fires or after observing the OperationCanceledException might be OK. aspnetcore/src/SignalR/common/Http.Connections/src/Internal/HttpConnectionManager.cs Line 43 in 96c082f
For apps that want to still read from the WebSocket after shutdown has initiated, swallowing the first OperationCanceledException is the only option. This exception is transient, so the next read should succeed if more data arrives before the ShutdownTimeout expires as weird as that is. |
The fix is going to be to not throw from WebSocket.ReadAsync() until the ShutdownTimeout expires at which point the underlying socket will be closed and WebSocket.ReadAsync() will repeatedly throw. You'll still be able to use ApplicationStopping to start gracefully closing the connection when Kestrel starts shutting down. |
Just a note that we'll need to move fairly quickly if we want this for November servicing. |
@halter73 Now this can be closed, right? |
Yep. |
Thanks every one for fixing this. We will verify once its available. |
Description
Once a SIGTERM is issued, the host should be allowed the opportunity to gracefully drain and shutdown all existing http and WebSocket connections.
This used to work properly on netcore2.2, however we observed that since netcore3.1 this functionality is broken for WebSockets. This also remains broken on net5.0.
To Reproduce
Fully baked samples for netcore2.2, netcore3.1 and net5.0 can be found here.
https://github.com/RaviPidaparthi/PlayGround/tree/master/WebsocketGracefulShutdown
To repro
For netcore2.2 service, the the WebSocket will continue processing/reading messages until 30sec after shutdown is initiated.
For netcore3.1 and net5.0 services, the WebSocket read operation throws with below error as soon as shutdown is initiated.
Code snippets
Configure shutdown timeout
Standup a Websocket endpoint
Create a client
Exceptions
Further technical details
ASP.NET Core version=net5.0
Visual Studio Version 16.8.0 Preview 3.2
The text was updated successfully, but these errors were encountered: