Skip to content

Change how ANCM recycles app #52807

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 22, 2024
Merged

Change how ANCM recycles app #52807

merged 11 commits into from
Apr 22, 2024

Conversation

BrennanConroy
Copy link
Member

@BrennanConroy BrennanConroy commented Dec 13, 2023

Fix for #41340

Issue summary:

When an app is recycling (config change, new deployment, regularly scheduled recycle, resource utilization rule recycle, etc.) IIS will notify our native module and we will initiate a shutdown of the managed application. While we're shutting down we reject any new requests we get with a 503. IIS has a feature (enabled by default) called overlapped recycle, where it will start a new instance of the application while the old one is shutting down, and it will queue then send incoming requests to the new app. In ASP.NET this worked well and 503's would not occur during overlapped recycles. In ASP.NET Core this didn't work and incoming requests weren't being sent to the new app even though it was started.

Fix summary:

I discovered that if you blocked the OnGlobalStopListening method, which is how IIS tells us it will stop sending us requests, then we will continue to get new requests until we complete the method. And since we were calling shutdown inline in response to that method being called we were blocking until the managed app shutdown and during shutdown we would reject incoming requests with 503. So the fix was to run shutdown on a thread and exit OnGlobalStopListening ASAP. This lets IIS stop sending us requests and queue and send them to the new app while we shut down the managed application.

The fix also includes a default delay of 1 second before we call shutdown in order to reduce any races between IIS actually stopping sending us requests after OnGlobalStopListening completes and us starting shut down which is when we would start sending 503's. The delay is configurable via shutdownDelay in the web config, or ANCM_shutdownDelay environment variable (both in milliseconds). I'm sort of leaning towards delayShutdown as a name instead, we can bike shed on that 😃

There is a second fix that was found while trying to fix the original issue. By listening to GL_APPLICATION_STOP we can now properly detect "Apps with preload + always running that don't receive a request before recycle/shutdown" and IISExpress app shutdown. Issue #28089 and can't find the one about IISExpress 🤷

@ghost ghost added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Dec 13, 2023
@BrennanConroy BrennanConroy added the feature-iis Includes: IIS, ANCM label Dec 13, 2023
Copy link
Member

@amcasey amcasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your explanation makes sense and your code seems to match your explanation. I had a bunch of questions and suggestions though.

void StartShutdown()
{
// Shutdown has already been started
if (m_shutdown.joinable())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this stay true once the thread completes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be true if the thread is still running or if the thread completed but hasn't had .join() called on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume some external factor preverts StartShutdown from being called again after the thread has been joined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, both places StartShutdown is called check g_fInShutdown which is set from the thread in StartShutdown. I think putting a check for g_fInShutdown in StartShutdown would be a good idea though even if it's redundant.

void ShimOptions::SetShutdownDelay(const std::wstring& shutdownDelay)
{
auto millsecondsValue = std::stoi(shutdownDelay);
if (millsecondsValue < 0)
Copy link
Member

@captainsafia captainsafia Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert but does a delay of 0 in the sleep_for below behave differently than a positive non-zero delay?

Edit: Actually, it looks like 0 might mean fallback to the old behavior?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah 0 is fallback. Also 0 in sleep_for will just return and not sleep.

Copy link
Member

@amcasey amcasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I lost track of this one. Thanks for the changes!

if (fServerInitiated)
// Ignore fServerInitiated for IISExpress
// Recycle doesn't do anything in IISExpress, we need to explicitly shutdown
if (m_pHttpServer.IsCommandLineLaunch())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if this is the canonical way of checking whether we're under IISExpress, consider making it a helper function somewhere.

Copy link
Member Author

@BrennanConroy BrennanConroy Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't confidently say it's the check for IISExpress. But that seems to be what we do elsewhere in the project.

@BrennanConroy BrennanConroy merged commit 42bc1a1 into main Apr 22, 2024
26 checks passed
@BrennanConroy BrennanConroy deleted the brecon/recycle branch April 22, 2024 18:24
@dotnet-policy-service dotnet-policy-service bot added this to the 9.0-preview4 milestone Apr 22, 2024
@BrennanConroy
Copy link
Member Author

/backport to release/8.0

Copy link
Contributor

Started backporting to release/8.0: https://github.com/dotnet/aspnetcore/actions/runs/8789089434

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-iis Includes: IIS, ANCM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants