Meta Issue - Data Protection keyring sync/creation race investigation #36157
Comments
Add Data Protection Race condition.
Thanks for contacting us. We're moving this issue to the …
The milestone should probably be updated again
… key is generated

If a key is required and none is available, one will be generated and immediately activated. Immediately-activated keys are problematic because they may be used before they have propagated to other app instances. What's more, in a scenario where one instance needs to generate an immediately-activated key, it's likely that other instances will need to do the same. A given instance has no control over when other instances pick up its immediately-activated keys, but it can do its best to pick up theirs. Therefore, every time it generates an immediately-activated key, it assumes other instances may have as well and pre-emptively schedules a refresh of its own key ring cache.

This problem is already partially (and complementarily) addressed by `KeyRingProvider.InAutoRefreshWindow`, which allows missing keys to be re-fetched from the backing repository for two minutes after app startup. This will cover the case where the short expiration period is too short to catch all updates but doesn't help with cases where an immediately-activated key is required after initial startup.

Sadly, this still does not completely address the issue: instances could still use keys unknown to their peers during the 15-second refresh window. To fix the issue properly, we'd need to know the order in which the keys had been persisted to the repository.

Part of dotnet#52678, which is part of dotnet#36157
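The pre-emptive refresh described in that commit message can be illustrated with a small, language-neutral sketch. It is not the ASP.NET Core code: `KeyRingCache`, the list-backed repository, the 24-hour normal lifetime, and the choice to return the first known key are all assumptions made for illustration. Only the 15-second short window comes from the text above.

```python
import itertools
import time

NORMAL_REFRESH_SECONDS = 24 * 60 * 60  # assumed value for the regular cache lifetime
SHORT_REFRESH_SECONDS = 15             # the short window mentioned in the commit message


class KeyRingCache:
    """Toy per-instance cache over a shared key repository (a plain list here)."""

    def __init__(self, name, repository, clock=time.monotonic):
        self._name = name
        self._repository = repository
        self._clock = clock
        self._ids = itertools.count()
        self._known = []
        self._next_refresh = float("-inf")  # force a refresh on first use

    def refresh(self):
        """Re-read every key from the repository and restart the normal timer."""
        self._known = list(self._repository)
        self._next_refresh = self._clock() + NORMAL_REFRESH_SECONDS

    def knows(self, key_id):
        return key_id in self._known

    def get_current_key(self):
        if self._clock() >= self._next_refresh:
            self.refresh()
        if not self._known:
            # No usable key: generate one and activate it immediately.
            key_id = f"{self._name}-{next(self._ids)}"
            self._repository.append(key_id)
            self._known.append(key_id)
            # Peers may have just done the same, so look again soon.
            self._next_refresh = self._clock() + SHORT_REFRESH_SECONDS
        # Default-key selection is out of scope; return the first known key.
        return self._known[0]
```

If two instances both refresh against an empty repository and then each generate a key, neither knows the other's key until the short window elapses. That is the residual gap the commit message acknowledges.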
This code is trying to ensure that the selected key can be decrypted (i.e. is usable). It may fail if, for example, Azure KeyVault is unreachable due to connectivity issues. If it fails, there's a log message and then an immediately-activated key will be generated.

An immediately-activated key can cause problems for sessions making requests to multiple app instances and those problems won't obviously be connected to the (almost silent) failure in `CanCreateAuthenticatedEncryptor`. Rather than effectively swallowing such errors, we should allow some retries.

Part of dotnet#52678, which is part of dotnet#36157
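As a rough illustration of "allow some retries", the sketch below wraps a usability check in a bounded retry loop. The function name, the attempt count, and the delay are hypothetical; the values actually used in ASP.NET Core are not stated in this thread.

```python
import time


def can_create_encryptor(create_encryptor, max_attempts=3, delay_seconds=0.2,
                         sleep=time.sleep, log=print):
    """Return True if create_encryptor() succeeds within max_attempts tries.

    max_attempts and delay_seconds are illustrative assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            create_encryptor()
            return True
        except Exception as exc:
            # Log every failure so the cause is visible, not almost silent.
            log(f"attempt {attempt}/{max_attempts} failed: {exc!r}")
            if attempt < max_attempts:
                sleep(delay_seconds)
    return False
```

Only after every attempt has failed would the caller fall back to generating an immediately-activated key, so a transient outage is less likely to trigger one.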
…54711)

* Allow retries in DefaultKeyResolver.CanCreateAuthenticatedEncryptor

  This code is trying to ensure that the selected key can be decrypted (i.e. is usable). It may fail if, for example, Azure KeyVault is unreachable due to connectivity issues. If it fails, there's a log message and then an immediately-activated key will be generated.

  An immediately-activated key can cause problems for sessions making requests to multiple app instances and those problems won't obviously be connected to the (almost silent) failure in CanCreateAuthenticatedEncryptor. Rather than effectively swallowing such errors, we should allow some retries.

  Part of #36157

* Roll our own Lazy that allows resets

  Retries against the actual `Key` type weren't working because the exception was getting cached in the key's lazy descriptor. Implement our own simple lazy and expose a method for clearing the cached value and exception.
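The second bullet describes a lazy holder whose cached exception can be cleared so that a retry re-runs the factory. A minimal sketch of that idea follows; `ResettableLazy` and its members are hypothetical names, not the type added in the PR.

```python
import threading


class ResettableLazy:
    """Caches the factory's value or exception until reset() is called."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._has_result = False
        self._value = None
        self._error = None

    @property
    def value(self):
        with self._lock:
            if not self._has_result:
                try:
                    self._value = self._factory()
                except Exception as exc:
                    # Cache the failure too, mirroring the behaviour that
                    # made retries ineffective before a reset existed.
                    self._error = exc
                self._has_result = True
            if self._error is not None:
                raise self._error
            return self._value

    def reset(self):
        """Drop the cached value and exception so the next access retries."""
        with self._lock:
            self._has_result = False
            self._value = None
            self._error = None
```

Without `reset()`, a second access re-raises the cached exception without calling the factory again. This matches the behaviour the commit message says was defeating the retries.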
This should be substantially better in 9.0 Preview 4. If you continue to see key-not-found errors in Preview 4, please tag me.
I'm going to close this on the (admittedly optimistic) assumption that things will be better in 9.0. If that's not the case, we can open a new issue.
Changed my mind - I think it's better to have a sink for related requests so I'm going to close the child issues instead and point people here.
Meta-issue to look at the ongoing problems around data protection key sync/race on new keyring.