Skip to content

Prevent WebSockets from throwing during graceful shutdown #26914

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 17, 2020

Conversation

halter73
Copy link
Member

@halter73 halter73 commented Oct 14, 2020

This stops WebSockets from throwing as soon as Kestrel shutdown starts instead of after the ShutdownTimeout. This copies logic we already have in other places like Http1ContentLengthMessageBody.

Kestrel will call Transport.Input.CancelPendingRead() during shutdown so idle connections can wake up and shutdown gracefully. We manually call CancelPendingRead() to simulate this and ensure the Stream returned by UpgradeAsync doesn't throw in this case.

Addresses #26482

readResult = await _context.Input.ReadAsync(cancellationToken);
}

return readResult;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub Is this acceptable ValueTask usage?

I think we're okay because this always either awaits the ValueTask or calls GetAwaiter().GetResult(). And it only calls GetAwaiter().GetResult() if we've verified ValueTask.IsCompletedSuccessfully. I know we don't pool ValueTasks yet, but I don't want to introduce more bad code like what you highlighted in #16876.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should turn on the analyzer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine. And the analyzer shouldn't warn about it; if it does, we should fix the analysis.

@halter73 halter73 changed the title Fix canceled ReadResult handling in Http1UpgradeMessageBody Prevent WebSockets from throwing during graceful shutdown Oct 15, 2020
// can wake up and shutdown gracefully. We manually call CancelPendingRead() to simulate this and
// ensure the Stream returned by UpgradeAsync doesn't throw in this case.
// https://github.com/dotnet/aspnetcore/issues/26482
var connectionTransportFeature = context.Features.Get<IConnectionTransportFeature>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever

// Ignore the canceled readResult unless it was canceled by the user.
do
{
if (!_context.Input.TryRead(out readResult))
Copy link
Member

@davidfowl davidfowl Oct 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this end up in a tight loop until cancellation is consumed by the request processing loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the PipeReader repeatedly returns a canceled ReadResult. Same goes ReadAsync. It should still be safe, because PipeReaders are supposed to only return at most 1 canceled ReadResult per call to CancelPendingRead(). And this logic isn't entirely new, even in 3.1. Http1ContentLengthMessageBody already has this behavior.

One difference you'll notice though between this and Http1ContentLengthMessageBody is that it doesn't call AdvanceTo before reading again after a canceled ReadResult (Remember #25799). This isn't necessary for the DefaultPipeReader and simplifies the logic quite a bit, but I could easily seeing this breaking custom ones. That said, if it breaks because of this, it would wind up just throwing an InvalidOperationException instead of the OperationCanceledException it throws today.

I understand being nervous though. Given severity of the issue (which is basically throwing from WebSockets 5 seconds earlier than we should during shutdown), do you still think this is worth backporting to 3.1?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should still back port it.

@jkotalik
Copy link
Contributor

Should you make a PR to 5.0 as well?

@halter73 halter73 added the Servicing-consider Shiproom approval is required for the issue label Oct 16, 2020
@ghost
Copy link

ghost commented Oct 16, 2020

Hello human! Please make sure you've included the Shiproom Template in a comment or (preferably) the PR description. Also, make sure this PR is not marked as a draft and is ready-to-merge.

@halter73
Copy link
Member Author

halter73 commented Oct 16, 2020

Description

This prevents WebSockets from throwing during graceful shutdown.

Kestrel calls Transport.Input.CancelPendingRead() during shutdown so idle connections can wake up and close gracefully. This PR ignores canceled ReadResults in Streams returned by UpgradeAsync, so they don't throw early during shutdown.

Customer Impact

Without this fix, WebSocket read operations throw an OperationCanceledException at the beginning graceful shutdown (which lasts 5 seconds by default). WebSocket read operations should only throw if the graceful shutdown period expires to give customers the opportunity to use the WebSocket during the graceful shutdown period.

This was reported by a customer on GitHub. #26482

Regression?

No.

Risk

Low. It might seem a bit scary because we're adding a loop with a non-trivial exit condition, but we already have loops exactly like this in 3.1 in the more commonly used Http1ContentLengthMessageBody. See #26914 (comment).

@Pilchie Pilchie added Servicing-approved Shiproom has approved the issue and removed Servicing-consider Shiproom approval is required for the issue labels Oct 16, 2020
@Pilchie
Copy link
Member

Pilchie commented Oct 16, 2020

This was approved for 3.1.10 over email. Tagging @wtgodbe for merge.

@Tratcher
Copy link
Member

@halter73 This issue still needs to be fixed in 5.0 and master, right?

@Pilchie what should we do for 5.0?

@Pilchie
Copy link
Member

Pilchie commented Oct 16, 2020

@wtgodbe the merge bot is still running right? In that case, I think we should just merge it, and treat this as implicitly approved for 5.0.

@halter73
Copy link
Member Author

The merge bot was still running when I was on build ops a few days ago. @dougbu just mentioned in ASP.NET Build Team that the merge bot isn't adding much value, but I think it should work for this change.

@wtgodbe wtgodbe merged commit 374f717 into release/3.1 Oct 17, 2020
@wtgodbe wtgodbe deleted the halter73/26482 branch October 17, 2020 00:06
@wtgodbe
Copy link
Member

wtgodbe commented Oct 17, 2020

Yeah, the bot is still merging 3.1 -> 5.0

@amcasey amcasey added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-kestrel Servicing-approved Shiproom has approved the issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants