Make the key ring cache expire shortly after an immediately-activated key is generated #54517

amcasey · 2024-03-12T20:46:19Z

If a key is required and none is available, one will be generated and immediately activated. Immediately-activated keys are problematic because they may be used before they have propagated to other app instances. What's more, in a scenario where one instance needs to generate an immediately-activated key, it's likely that other instances will need to do the same. A given instance has no control over when other instances pick up its immediately-activated keys, but it can do its best to pick up theirs. Therefore, every time it generates an immediately-activated key, it assumes other instances may have as well and pre-emptively schedules a refresh of its own key ring cache.

This problem is already partially (and complementarily) addressed by `KeyRingProvider.InAutoRefreshWindow`, which allows missing keys to be re-fetched from the backing repository for two minutes after app startup. That covers the case where the short expiration period is too short to catch all updates, but it doesn't help when an immediately-activated key is required after initial startup.

Sadly, this still does not completely address the issue: instances could still use keys unknown to their peers during the 15-second refresh window. To fix the issue properly, we'd need to know the order in which the keys had been persisted to the repository.
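A minimal sketch of the scheduling idea described above, not the actual change: the cache expiration is chosen from one of two periods depending on whether this instance just generated an immediately-activated key. `RefreshPolicySketch` and `GetCacheExpiration` are hypothetical names, and the 24-hour regular period is an assumption made for illustration.

```csharp
using System;

var now = DateTimeOffset.UtcNow;

// An immediately-activated key was just generated: refresh soon.
Console.WriteLine(RefreshPolicySketch.GetCacheExpiration(now, generatedImmediatelyActivatedKey: true) - now);  // 00:00:15

// Otherwise, keep the regular (assumed) refresh period.
Console.WriteLine(RefreshPolicySketch.GetCacheExpiration(now, generatedImmediatelyActivatedKey: false) - now); // 1.00:00:00

// Hypothetical stand-in for illustration; not a DataProtection type.
static class RefreshPolicySketch
{
    // 15 seconds matches the short refresh window mentioned in the description.
    public static readonly TimeSpan ShortKeyRingRefreshPeriod = TimeSpan.FromSeconds(15);

    // Assumed value for the regular refresh period.
    public static readonly TimeSpan KeyRingRefreshPeriod = TimeSpan.FromHours(24);

    // Returns the time at which the cached key ring should expire.
    public static DateTimeOffset GetCacheExpiration(DateTimeOffset now, bool generatedImmediatelyActivatedKey)
    {
        var period = generatedImmediatelyActivatedKey ? ShortKeyRingRefreshPeriod : KeyRingRefreshPeriod;
        return now + period;
    }
}
```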

Part of #52678, which is part of #36157
Should benefit from #54490

src/DataProtection/DataProtection/src/KeyManagement/KeyManagementOptions.cs

...tection/test/Microsoft.AspNetCore.DataProtection.Tests/KeyManagement/KeyRingProviderTests.cs

amcasey · 2024-03-21T21:58:56Z

@adityamandaleeka I'm tempted to add an appcontext switch for this. Thoughts?

amcasey · 2024-03-21T23:03:07Z

src/DataProtection/DataProtection/src/KeyManagement/KeyRingProvider.cs

+            // (in which case, other instances may have done the same)
+            ? (generatedKey.ActivationDate < now + KeyManagementOptions.KeyPropagationWindow) // No clock skew on a key we generated
+            // Or we selected a key that has yet to propagate (presumably, from another instance)
+            : (defaultKey.CreationDate > now - KeyManagementOptions.KeyPropagationWindow);


Can this happen in a loop? What if this is still the default after the next refresh?

Indeed it can. 😢

amcasey · 2024-03-21T23:07:14Z

Idea: what if, instead of refreshing the key ring, we entered/extended an auto-refresh window.

We'd have no guarantee of an unknown key triggering a refresh within the window, so probably not ideal.

adityamandaleeka · 2024-03-21T23:15:59Z

> @adityamandaleeka I'm tempted to add an appcontext switch for this. Thoughts?

Sounds good to me. Under a "new data protection" switch that includes other stuff I assume? Feels like a separate switch for this might be overkill.
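For reference, a small sketch of how an AppContext switch could gate this behavior. The switch name is hypothetical and was not proposed anywhere in this thread; only `AppContext.TryGetSwitch` and `AppContext.SetSwitch` are real APIs.

```csharp
using System;

// Hypothetical switch name, for illustration only.
const string SwitchName = "Microsoft.AspNetCore.DataProtection.Hypothetical.DisableShortKeyRingRefresh";

// TryGetSwitch returns false when the switch has never been set,
// so the behavior stays enabled by default.
bool disabled = AppContext.TryGetSwitch(SwitchName, out var isEnabled) && isEnabled;
Console.WriteLine(disabled);  // False

// An app (or its runtime configuration) could opt out like this.
AppContext.SetSwitch(SwitchName, true);
disabled = AppContext.TryGetSwitch(SwitchName, out isEnabled) && isEnabled;
Console.WriteLine(disabled);  // True
```

Folding this into a broader "new data protection" switch, as suggested, would only change which name is checked.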

We don't want to refresh every 15 seconds until the propagation window elapses. Storing the flag on the CacheableKeyRing simplifies thread-safe state management, but requires changes to some public test hooks. This version of the change should be source- and binary-compatible with previous versions of .NET.
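A rough sketch of how a flag on the cached entry can break the short-refresh loop. `CacheEntrySketch` and `KeyRingCacheSketch` are simplified, hypothetical stand-ins for `CacheableKeyRing` and the provider, and the 24-hour regular period is an assumption.

```csharp
#nullable enable
using System;

// First refresh: there is no existing cache entry, so a short refresh period is allowed.
var first = KeyRingCacheSketch.Refresh(existing: null, now: DateTimeOffset.UtcNow, keyIsWithinPropagationWindow: true);

// Second refresh: the previous entry already used the short period, so this one must not,
// even though the default key is still within its propagation window.
var second = KeyRingCacheSketch.Refresh(existing: first, now: first.ExpirationTime, keyIsWithinPropagationWindow: true);

Console.WriteLine(first.HasShortRefreshPeriod);   // True
Console.WriteLine(second.HasShortRefreshPeriod);  // False

// Hypothetical, simplified stand-in for CacheableKeyRing.
sealed record CacheEntrySketch(DateTimeOffset ExpirationTime, bool HasShortRefreshPeriod);

static class KeyRingCacheSketch
{
    static readonly TimeSpan ShortPeriod = TimeSpan.FromSeconds(15);
    static readonly TimeSpan NormalPeriod = TimeSpan.FromHours(24); // assumed value

    public static CacheEntrySketch Refresh(CacheEntrySketch? existing, DateTimeOffset now, bool keyIsWithinPropagationWindow)
    {
        // Allow a short period only if the entry being replaced did not already use one.
        var allowShortRefreshPeriod = existing?.HasShortRefreshPeriod != true;
        var useShortRefreshPeriod = allowShortRefreshPeriod && keyIsWithinPropagationWindow;
        var period = useShortRefreshPeriod ? ShortPeriod : NormalPeriod;
        return new CacheEntrySketch(now + period, useShortRefreshPeriod);
    }
}
```

Because the flag travels with the immutable cache entry, no separate mutable state needs to be synchronized across threads.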

...to parallel IInternalXmlKeyManager.

captainsafia · 2024-03-26T22:46:20Z

src/DataProtection/DataProtection/src/KeyManagement/KeyManagementOptions.cs

+    /// of its own key ring cache.  This property controls how long after an immediately-
+    /// activated key is generated the key ring cache will be refreshed.
+    /// </summary>
+    internal static TimeSpan ShortKeyRingRefreshPeriod { get; set; } = TimeSpan.FromSeconds(15);


Any guidance here on what the minimum refresh period should be or why 15 seconds is a sane default?

captainsafia · 2024-03-26T22:50:53Z

src/DataProtection/DataProtection/src/KeyManagement/KeyRingProvider.cs

+        // of downtime may discover that it has a valid, but soon-to-be-expired key. The replacement will not
+        // be immediately-activated, but may be activated before it has propagated.
+        var useShortRefreshPeriod = allowShortRefreshPeriod &&
+            (generatedKey is not null


Nit: Move the second clause of this Boolean to a well-defined method (e.g. `GeneratedKeyWithinPropagationWindow` or something).
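One possible shape for such a helper, based on my reading of the diff excerpts in this thread. `KeySketch`, `PropagationSketch`, and `IsWithinPropagationWindow` are hypothetical names, and the two-day propagation window is an assumed value.

```csharp
#nullable enable
using System;

var now = DateTimeOffset.UtcNow;
var window = TimeSpan.FromDays(2); // assumed propagation window

var fresh = new KeySketch(CreationDate: now, ActivationDate: now);
var old = new KeySketch(CreationDate: now.AddDays(-30), ActivationDate: now.AddDays(-28));

// We generated a key that activates before it can have propagated.
Console.WriteLine(PropagationSketch.IsWithinPropagationWindow(fresh, fresh, now, window)); // True

// No key was generated and the default key was created long ago.
Console.WriteLine(PropagationSketch.IsWithinPropagationWindow(null, old, now, window));    // False

// Hypothetical, simplified stand-in for a key.
sealed record KeySketch(DateTimeOffset CreationDate, DateTimeOffset ActivationDate);

static class PropagationSketch
{
    public static bool IsWithinPropagationWindow(
        KeySketch? generatedKey, KeySketch defaultKey, DateTimeOffset now, TimeSpan propagationWindow)
        => generatedKey is not null
            // We generated a key (in which case, other instances may have done the same).
            ? generatedKey.ActivationDate < now + propagationWindow
            // Or we selected a key that has yet to propagate (presumably, from another instance).
            : defaultKey.CreationDate > now - propagationWindow;
}
```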

captainsafia · 2024-03-26T22:52:12Z

src/DataProtection/DataProtection/src/KeyManagement/KeyRingProvider.cs

@@ -241,7 +262,7 @@ internal IKeyRing GetCurrentKeyRingCore(DateTime utcNow, bool forceRefresh = fal

                try
                {
-                    newCacheableKeyRing = CacheableKeyRingProvider.GetCacheableKeyRing(utcNow);
+                    newCacheableKeyRing = CacheableKeyRingProvider.GetCacheableKeyRing(utcNow, allowShortRefreshPeriod: existingCacheableKeyRing?.HasShortRefreshPeriod != true);


What does `existingCacheableKeyRing?.HasShortRefreshPeriod != true` functionally mean here?

It's a nullable bool
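To spell that out with a small standalone example (the variable names are illustrative only): with a `bool?`, the comparison `!= true` is true for both `null` and `false`, so a short refresh period is allowed unless the existing key ring already used one.

```csharp
using System;

bool? noExistingRing = null;     // existingCacheableKeyRing is null
bool? shortPeriodUsed = true;    // existing ring already had a short refresh period
bool? normalPeriodUsed = false;  // existing ring had a normal refresh period

Console.WriteLine(noExistingRing != true);    // True
Console.WriteLine(shortPeriodUsed != true);   // False
Console.WriteLine(normalPeriodUsed != true);  // True
```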

AppContext switch here?

amcasey · 2024-04-12T19:26:38Z

In the absence of races (e.g. as in the deterministic simulator), this provides essentially no benefit and slightly increases the amount of work done (e.g. network calls). Since races tend to occur on startup and we already have a startup mitigation (the auto-reload period), this change probably does not justify the complexity it adds.

amcasey · 2024-04-12T19:27:33Z

@captainsafia Your feedback is still welcome, but this is now low priority.

ghost added the area-dataprotection label (Includes: DataProtection) Mar 12, 2024

adityamandaleeka reviewed Mar 19, 2024

src/DataProtection/DataProtection/src/KeyManagement/KeyManagementOptions.cs

adityamandaleeka approved these changes Mar 19, 2024

amcasey commented Mar 19, 2024

...tection/test/Microsoft.AspNetCore.DataProtection.Tests/KeyManagement/KeyRingProviderTests.cs

amcasey requested a review from captainsafia March 21, 2024 21:58

amcasey commented Mar 21, 2024

amcasey added 2 commits March 21, 2024 16:13

Move ShortKeyRingRefreshPeriod near KeyRingRefreshPeriod

26411c1

Fix typo

69ab2a8

amcasey added 4 commits March 21, 2024 16:21

Update test to assert we don't get stuck in a short-refresh loop

5a6af04

Decouple ICacheableKeyRingProvider2 from ICacheableKeyRingProvider

be322f0

Rename ICacheableKeyRingProvider2 to IInternalCacheableKeyRingProvider

5daf76f

...to parallel IInternalXmlKeyManager.

amcasey mentioned this pull request Mar 25, 2024

Make types in Microsoft.AspNetCore.DataProtection.KeyManagement.Internal internal #54712

Closed

Merge branch 'main' into StartupRace

e7bf0ba

captainsafia reviewed Mar 26, 2024

dotnet-policy-service bot added the pending-ci-rerun label (when assigned to a PR, indicates that the CI checks should be rerun) Apr 8, 2024

amcasey marked this pull request as draft April 12, 2024 19:26

amcasey closed this May 2, 2024

dotnet-policy-service bot added this to the 9.0-preview5 milestone May 2, 2024
