-
Notifications
You must be signed in to change notification settings - Fork 10.3k
Make the key ring cache expire shortly after an immediately-activated key is generated #54517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… key is generated If a key is required and none is available, one will be generated and immediately activated. Immediately-activated keys are problematic because they may be used before they have propagated to other app instances. What's more, in a scenario where one instance needs to generate an immediately-activated key, it's likely that other instances will need to do the same. A given instance has no control over when other instances pick up its immediately-activated keys, but it can do its best to pick up theirs. Therefore, every time it generates an immediately-activated key, it assumes other instances may have as well and pre-emptively schedules a refresh of its own key ring cache. This problem is already partially (and complementarily) addressed by `KeyRingProvider.InAutoRefreshWindow`, which allows missing keys to be re-fetched from the backing repository for two minutes after app-startup. This will cover the case where the short expiration period is too short to catch all updates but doesn't help with cases where an immediately-activated key is required after initial startup. Sadly, this still does not completely address the issue: instances could still use keys unknown to their peers during the 15 second refresh window. To fix the issue properly, we'd need to know the order in which the keys had been persisted to the repository. Part of dotnet#52678, which is part of dotnet#36157
src/DataProtection/DataProtection/src/KeyManagement/KeyManagementOptions.cs
Outdated
Show resolved
Hide resolved
...tection/test/Microsoft.AspNetCore.DataProtection.Tests/KeyManagement/KeyRingProviderTests.cs
Outdated
Show resolved
Hide resolved
@adityamandaleeka I'm tempted to add an appcontext switch for this. Thoughts? |
// (in which case, other instances may have done the same) | ||
? (generatedKey.ActivationDate < now + KeyManagementOptions.KeyPropagationWindow) // No clock skew on a key we generated | ||
// Or we selected a key that has yet to propagate (presumably, from another instance) | ||
: (defaultKey.CreationDate > now - KeyManagementOptions.KeyPropagationWindow); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this happen in a loop? What if this is still the default after the next refresh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it can. 😢
Idea: what if, instead of refreshing the key ring, we entered/extended an auto-refresh window. We'd have no guarantee of an unknown key triggering a refresh within the window, so probably not ideal. |
Sounds good to me. Under a "new data protection" switch that includes other stuff I assume? Feels like a separate switch for this might be overkill. |
We don't want to refresh every 15 seconds until the propagation window elapses. Storing the flag on the CacheableKeyRing simplifies thread-safe state management, but requires changes to some public test hooks. This version of the change should be source- and binary-compatible with previous versions of .net.
...to parallel IInternalXmlKeyManager.
/// of its own key ring cache. This property controls how long after an immediately- | ||
/// activated key is generated the key ring cache will be refreshed. | ||
/// </summary> | ||
internal static TimeSpan ShortKeyRingRefreshPeriod { get; set; } = TimeSpan.FromSeconds(15); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any guidance here on what the minimum refresh period should be or why 15 seconds is a sane default?
// of downtime may discover that it has a valid, but soon-to-be-expired key. The replacement will not | ||
// be immediately-activated, but may be activated before it has propagated. | ||
var useShortRefreshPeriod = allowShortRefreshPeriod && | ||
(generatedKey is not null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Move the second clause of this Boolean to a well-defined method (e.g. GeneratedKeyWithinPropogationWindow
or something).
@@ -241,7 +262,7 @@ internal IKeyRing GetCurrentKeyRingCore(DateTime utcNow, bool forceRefresh = fal | |||
|
|||
try | |||
{ | |||
newCacheableKeyRing = CacheableKeyRingProvider.GetCacheableKeyRing(utcNow); | |||
newCacheableKeyRing = CacheableKeyRingProvider.GetCacheableKeyRing(utcNow, allowShortRefreshPeriod: existingCacheableKeyRing?.HasShortRefreshPeriod != true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does existingCacheableKeyRing?.HasShortRefreshPeriod != true
functionally mean here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a nullable bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AppContext switch here?
In the absence of races (e.g. as in the deterministic simulator), this provides essentially no benefit and slightly increases the amount of work done (e.g. network calls). Since races tend to occur on startup and we already have a startup mitigation (the auto-reload period), this change probably does not justify the complexity it adds. |
@captainsafia Your feedback is still welcome, but this is now low priority. |
If a key is required and none is available, one will be generated and immediately activated. Immediately-activated keys are problematic because they may be used before they have propagated to other app instances. What's more, in a scenario where one instance needs to generate an immediately-activated key, it's likely that other instances will need to do the same. A given instance has no control over when other instances pick up its immediately-activated keys, but it can do its best to pick up theirs. Therefore, every time it generates an immediately-activated key, it assumes other instances may have as well and pre-emptively schedules a refresh of its own key ring cache.
This problem is already partially (and complementarily) addressed by
KeyRingProvider.InAutoRefreshWindow
, which allows missing keys to be re-fetched from the backing repository for two minutes after app-startup. This will cover the case where the short expiration period is too short to catch all updates but doesn't help with cases where an immediately-activated key is required after initial startup.Sadly, this still does not completely address the issue: instances could still use keys unknown to their peers during the 15 second refresh window. To fix the issue properly, we'd need to know the order in which the keys had been persisted to the repository.
Part of #52678, which is part of #36157
Should benefit from #54490