Skip to content

Meta Issue - Data Protection keyring sync/creation race investigation #36157

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
blowdart opened this issue Sep 3, 2021 · 6 comments
Open
Labels
area-dataprotection Includes: DataProtection

Comments

@blowdart blowdart added area-dataprotection Includes: DataProtection severity-major This label is used by an internal tool labels Sep 3, 2021
@blowdart blowdart added this to the .NET 7 Planning milestone Sep 3, 2021
@Tanmaybhujade
Copy link

Add Data Protection Race condition.

@ghost
Copy link

ghost commented Sep 9, 2022

Thanks for contacting us.

We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s).
If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@wtgodbe wtgodbe removed their assignment Nov 28, 2022
@captainsafia captainsafia added the Needs: Design This issue requires design work before implementating. label Dec 13, 2022
@filipw
Copy link
Contributor

filipw commented Dec 1, 2023

The milestone should probably be updated again

@amcasey amcasey self-assigned this Jan 16, 2024
@amcasey amcasey modified the milestones: .NET 9 Planning, 9.0.0 Jan 26, 2024
amcasey added a commit to amcasey/aspnetcore that referenced this issue Mar 12, 2024
… key is generated

If a key is required and none is available, one will be generated and immediately activated.  Immediately-activated keys are problematic because they may be used before they have propagated to other app instances.  What's more, in a scenario where one instance needs to generate an immediately-activated key, it's likely that other instances will need to do the same.  A given instance has no control over when other instances pick up its immediately-activated keys, but it can do its best to pick up theirs.  Therefore, every time it generates an immediately-activated key, it assumes other instances may have as well and pre-emptively schedules a refresh of its own key ring cache.

This problem is already partially (and complementarily) addressed by `KeyRingProvider.InAutoRefreshWindow`, which allows missing keys to be re-fetched from the backing repository for two minutes after app-startup.  This will cover the case where the short expiration period is too short to catch all updates but doesn't help with cases where an immediately-activated key is required after initial startup.

Sadly, this still does not completely address the issue: instances could still use keys unknown to their peers during the 15 second refresh window.  To fix the issue properly, we'd need to know the order in which the keys had been persisted to the repository.

Part of dotnet#52678, which is part of dotnet#36157
amcasey added a commit to amcasey/aspnetcore that referenced this issue Mar 23, 2024
This code is trying to ensure that the selected key can be decrypted (i.e. is usable).  It may fail if, for example, Azure KeyVault is unreachable due to connectivity issues.  If it fails, there's a log message and then an immediately-activated key will be generated.  An immediately-activated key can cause problems for sessions making requests to multiple app instances and those problems won't obviously be connected to the (almost silent) failure in CanCreateAuthenticatedEncryptor.  Rather than effectively swallowing such errors, we should allow some retries.

Part of dotnet#52678, which is part of dotnet#36157
amcasey added a commit to amcasey/aspnetcore that referenced this issue Apr 12, 2024
This code is trying to ensure that the selected key can be decrypted (i.e. is usable).  It may fail if, for example, Azure KeyVault is unreachable due to connectivity issues.  If it fails, there's a log message and then an immediately-activated key will be generated.  An immediately-activated key can cause problems for sessions making requests to multiple app instances and those problems won't obviously be connected to the (almost silent) failure in CanCreateAuthenticatedEncryptor.  Rather than effectively swallowing such errors, we should allow some retries.

Part of dotnet#52678, which is part of dotnet#36157
amcasey added a commit that referenced this issue Apr 19, 2024
…54711)

* Allow retries in DefaultKeyResolver.CanCreateAuthenticatedEncryptor

This code is trying to ensure that the selected key can be decrypted (i.e. is usable).  It may fail if, for example, Azure KeyVault is unreachable due to connectivity issues.  If it fails, there's a log message and then an immediately-activated key will be generated.  An immediately-activated key can cause problems for sessions making requests to multiple app instances and those problems won't obviously be connected to the (almost silent) failure in CanCreateAuthenticatedEncryptor.  Rather than effectively swallowing such errors, we should allow some retries.

Part of #36157

* Roll our own Lazy that allows resets

Retries against the actual `Key` type weren't working because the exception was getting cached in the key's lazy descriptor.  Implement our own simple lazy and expose a method for clearing the cached value and exception.
@amcasey
Copy link
Member

amcasey commented Apr 29, 2024

This should be substantially better in 9.0 Preview 4. If you continue to see key-not-found errors in Preview 4, please tag me.

@amcasey
Copy link
Member

amcasey commented Oct 4, 2024

I'm going to close this on the (admittedly optimistic assumption) that things will be better in 9.0. If that's not the case, we can open a new issue.

@amcasey amcasey closed this as completed Oct 4, 2024
@amcasey
Copy link
Member

amcasey commented Oct 4, 2024

Changed my mind - I think it's better to have a sink for related requests so I'm going to close the child issues instead and point people here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-dataprotection Includes: DataProtection
Projects
None yet
Development

No branches or pull requests

6 participants