Rate limit APIs #52079
Tagging subscribers to this area: @tarekgh, @buyaa-n, @krwq
Issue Details
Background and Motivation
Outages caused when system activities exceed the system’s capacity are a leading concern in system design. The ability to handle system activity efficiently, and to gracefully limit the execution of activities before the system is under stress, is fundamental to system resiliency. .NET does not have a standardized means for expressing and managing the resource limiting logic needed to produce a resilient system. This adds complexity to designing and developing resilient software in .NET by introducing an easy vector for competing resource limiting logic and anti-patterns. A standardized interface in .NET for limiting activities will make it easier for developers to build resilient systems for all scales of deployment and workload.
Users will interact with the proposed APIs in order to ensure rate and/or concurrency limits are enforced. This abstraction requires explicit release semantics to accommodate non-self-replenishing (i.e. concurrency) resource limits, similar to how Semaphores operate. The abstraction also accounts for self-replenishing (i.e. rate) resource limits, where no explicit release semantics are needed because the resource is replenished automatically over time. This component encompasses the TryAcquire/AcquireAsync mechanics (i.e. check vs wait behaviours), and default implementations will be provided for selected accounting methods (fixed window, sliding window, token bucket, simple concurrency). The return type is a
Proposed API
public interface IResourceLimiter
{
// An estimated count of resources.
long EstimatedCount { get; }
// Fast synchronous attempt to acquire resources.
// Set requestedCount to 0 to get whether resource limit has been reached.
bool TryAcquire(long requestedCount, out Resource resource);
// Wait until the requested resources are available.
// Set requestedCount to 0 to wait until resource is replenished.
// An exception is thrown if resources cannot be obtained.
ValueTask<Resource> AcquireAsync(long requestedCount, CancellationToken cancellationToken = default);
}
public struct Resource : IDisposable
{
// This represents additional metadata that can be returned as part of a call to TryAcquire/AcquireAsync
// Potential uses could include a RetryAfter value.
public object? State { get; init; }
// Constructor
public Resource(long count, object? state, Action<Resource>? onDispose);
// Return the acquired resources
public void Dispose();
// This static field can be used for rate limiters that do not require release semantics or for failed concurrency limiter acquisition requests.
public static Resource NoopResource = new Resource(0, null, null);
}
// Extension methods
public static class ResourceLimiterExtensions
{
public static bool TryAcquire(this IResourceLimiter limiter, out Resource resource)
{
return limiter.TryAcquire(1, out resource);
}
public static ValueTask<Resource> AcquireAsync(this IResourceLimiter limiter, CancellationToken cancellationToken = default)
{
return limiter.AcquireAsync(1, cancellationToken);
}
}
The struct
Usage Examples
For components enforcing limits, the standard usage pattern will be:
if (limiter.TryAcquire(1, out var resource))
{
// Resource obtained successfully.
using (resource)
{
// Continue with processing
// Resource released when disposed
}
}
else
{
// Limit exceeded, no resources obtained
}
In cases where it is known that the resource limiter is a rate limit with no-op release semantics, the usage can be simplified to:
if (limiter.TryAcquire(1, out _))
{
// Resource obtained successfully.
// Continue with processing
}
else
{
// Limit exceeded, no resources obtained
}
This API will be useful in implementing limits for various BCL types including:
Work is ongoing to prototype the intended usage in these BCL types and default implementations for the Fixed Window, Sliding Window, and Token Bucket algorithms. For example, a rate limit applied to a BoundedChannel:
// Rate limiter added to options
var rateLimiter = new FixedWindowRateLimiter(resourcePerSecond: 5);
var rateLimitedChannel = Channel.CreateBounded<string>(new BoundedChannelOptions(5) { WriteRateLimiter = rateLimiter });
// This channel will now only write 5 times per second
rateLimitedChannel.Writer.TryWrite("New message");
Experiments are ongoing in ASP.NET Core for application in Kestrel server limits and a middleware for enforcing limits on request processing at https://github.com/dotnet/aspnetcore/tree/johluo/rate-limits. We also anticipate adoption for enforcing limits in YARP as well as conversion of existing implementations in ATS and ACR.
Alternative Designs
Major variants considered
Separate abstractions for rate and concurrency limits
A design where rate limits and concurrency limits were expressed by separate abstractions was considered. The design more clearly expresses the intended use pattern where rate limits do not need to return a
However, this design has a drawback for consumers of resource limits, since there are two possible limiter types that can be specified by the user. To alleviate some of the complexity, a wrapper for rate limits was considered. However, the complexity of this design was deemed undesirable and a unified abstraction for rate and concurrency limits was preferred.
Release APIs on IResourceLimiter
Instead of using the
A class instead of struct for Resource
This approach allows for subclassing to include additional metadata instead of an
Partial acquisition and release
Currently, the acquisition and release of resources is all-or-nothing. Additional APIs will be needed to allow a caller to acquire part of the requested resources, for example when 5 resources were requested but the caller is willing to accept a subset if not all 5 are available. Similarly, additional APIs can be added to
These APIs are not included in this proposal since no concrete use cases have been identified so far.
Risks
This is a proposal for new API and main concerns include:
|
Tagging subscribers to this area: @eerhardt, @maryamariyan
|
@JunTaoLuo - Oh, I see it buried in the proposal:
Can you put the proposed namespace in the |
Do we expect any types in |
Will do!
Yes, I expect we'll be adding default implementations to the BCL such as Rate Limiters (Fixed Window, Sliding Window, Token Bucket) and potentially a semaphore based Concurrency Limiter. I've been prototyping these implementations but I don't have anything reviewable yet. I will share more details soon. I expect that as a result of these prototypes, there will be some additional APIs, such as an enum/option to configure between stack/queue when waiting via |
While this does seem like a good idea to have, this API can be misused. And, most systems can already manage their resources very efficiently. |
Tagging subscribers to this area: @carlossanlop
Issue Details
Background and Motivation
Outages caused when system activities exceed the system’s capacity are a leading concern in system design. The ability to handle system activity efficiently, and to gracefully limit the execution of activities before the system is under stress, is fundamental to system resiliency. .NET does not have a standardized means for expressing and managing the resource limiting logic needed to produce a resilient system. This adds complexity to designing and developing resilient software in .NET by introducing an easy vector for competing resource limiting logic and anti-patterns. A standardized interface in .NET for limiting activities will make it easier for developers to build resilient systems for all scales of deployment and workload.
Users will interact with the proposed APIs in order to ensure rate and/or concurrency limits are enforced. This abstraction requires explicit release semantics to accommodate non-self-replenishing (i.e. concurrency) resource limits, similar to how Semaphores operate. The abstraction also accounts for self-replenishing (i.e. rate) resource limits, where no explicit release semantics are needed because the resource is replenished automatically over time. This component encompasses the TryAcquire/AcquireAsync mechanics (i.e. check vs wait behaviours), and default implementations will be provided for selected accounting methods (fixed window, sliding window, token bucket, simple concurrency). The return type is a
Proposed API
namespace System.Threading.ResourceLimit
{
public interface IResourceLimiter
{
// An estimated count of resources. Potential uses include diagnostics.
long EstimatedCount { get; }
// Fast synchronous attempt to acquire resources.
// Set requestedCount to 0 to get whether resource limit has been reached.
bool TryAcquire(long requestedCount, out Resource resource);
// Wait until the requested resources are available.
// Set requestedCount to 0 to wait until resource is replenished.
// An exception is thrown if resources cannot be obtained.
ValueTask<Resource> AcquireAsync(long requestedCount, CancellationToken cancellationToken = default);
}
public struct Resource : IDisposable
{
// This represents additional metadata that can be returned as part of a call to TryAcquire/AcquireAsync
// Potential uses could include a RetryAfter value.
public object? State { get; init; }
// Constructor
public Resource(long count, object? state, Action<Resource>? onDispose);
// Return the acquired resources
public void Dispose();
// This static field can be used for rate limiters that do not require release semantics or for failed concurrency limiter acquisition requests.
public static Resource NoopResource = new Resource(0, null, null);
}
// Extension methods
public static class ResourceLimiterExtensions
{
public static bool TryAcquire(this IResourceLimiter limiter, out Resource resource)
{
return limiter.TryAcquire(1, out resource);
}
public static ValueTask<Resource> AcquireAsync(this IResourceLimiter limiter, CancellationToken cancellationToken = default)
{
return limiter.AcquireAsync(1, cancellationToken);
}
}
}
The struct
Usage Examples
For components enforcing limits, the standard usage pattern will be:
if (limiter.TryAcquire(1, out var resource))
{
// Resource obtained successfully.
using (resource)
{
// Continue with processing
// Resource released when disposed
}
}
else
{
// Limit exceeded, no resources obtained
}
In cases where it is known that the resource limiter is a rate limit with no-op release semantics, the usage can be simplified to:
if (limiter.TryAcquire(1, out _))
{
// Resource obtained successfully.
// Continue with processing
}
else
{
// Limit exceeded, no resources obtained
}
This API will be useful in implementing limits for various BCL types including:
Work is ongoing to prototype the intended usage in these BCL types and default implementations for the Fixed Window, Sliding Window, and Token Bucket algorithms. For example, a rate limit applied to a BoundedChannel:
// Rate limiter added to options
var rateLimiter = new FixedWindowRateLimiter(resourcePerSecond: 5);
var rateLimitedChannel = Channel.CreateBounded<string>(new BoundedChannelOptions(5) { WriteRateLimiter = rateLimiter });
// This channel will now only write 5 times per second
rateLimitedChannel.Writer.TryWrite("New message");
Experiments are ongoing in ASP.NET Core for application in Kestrel server limits and a middleware for enforcing limits on request processing at https://github.com/dotnet/aspnetcore/tree/johluo/rate-limits. We also anticipate adoption for enforcing limits in YARP as well as conversion of existing implementations in ATS and ACR.
Alternative Designs
Major variants considered
Separate abstractions for rate and concurrency limits
A design where rate limits and concurrency limits were expressed by separate abstractions was considered. The design more clearly expresses the intended use pattern where rate limits do not need to return a
However, this design has a drawback for consumers of resource limits, since there are two possible limiter types that can be specified by the user. To alleviate some of the complexity, a wrapper for rate limits was considered. However, the complexity of this design was deemed undesirable and a unified abstraction for rate and concurrency limits was preferred.
Release APIs on IResourceLimiter
Instead of using the
A class instead of struct for Resource
This approach allows for subclassing to include additional metadata instead of an
Partial acquisition and release
Currently, the acquisition and release of resources is all-or-nothing. Additional APIs will be needed to allow a caller to acquire part of the requested resources, for example when 5 resources were requested but the caller is willing to accept a subset if not all 5 are available. Similarly, additional APIs can be added to
These APIs are not included in this proposal since no concrete use cases have been identified so far.
Risks
This is a proposal for new API and main concerns include:
|
This feedback isn't specific enough to comment on. Can you clarify? |
In very poorly designed multithreaded applications, this could cause problems when thread A is expecting a value from thread B and can't get it because thread B is being resource limited. As this is very machine-specific and testing for this would be pretty hard, I think this might be an issue in extremely remote cases. Anyways, I absolutely don't think this alone should be a valid reason to scrap this whole API idea, I just think it's worthwhile to mention. |
@mangod9 - Can you comment why you moved this issue to the
Basically any resource that could be "limited". So putting it in the |
Moved back to threading. We discussed this with @stephentoub and it's generic enough. |
sure, seems reasonable. |
Do you expect that e.g. HttpClient will depend on these interfaces? From the description so far, this looks like one of those abstractions that meets with the rest of the stack in the app model specific libraries. It is why I have put it into Extensions initially. |
I see, I'll update the proposal and samples to use the template method pattern. |
API Review notes:
Package name: System.Threading.RateLimiting
namespace System.Threading.RateLimiting
{
public abstract class RateLimiter
{
public abstract int GetAvailablePermits();
public RateLimitLease Acquire(int permitCount = 1);
protected abstract RateLimitLease AcquireCore(int permitCount);
public ValueTask<RateLimitLease> WaitAsync(int permitCount = 1, CancellationToken cancellationToken = default);
protected abstract ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken);
}
public abstract class RateLimitLease : IDisposable
{
public abstract bool IsAcquired { get; }
public abstract bool TryGetMetadata(string metadataName, out object? metadata);
public bool TryGetMetadata<T>(MetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata);
public abstract IEnumerable<string> MetadataNames { get; }
public virtual IEnumerable<KeyValuePair<string, object?>> GetAllMetadata();
public void Dispose() { Dispose(true); GC.SuppressFinalize(this); }
protected virtual void Dispose(bool disposing);
}
public static class MetadataName
{
public static MetadataName<TimeSpan> RetryAfter { get; } = Create<TimeSpan>("RETRY_AFTER");
public static MetadataName<string> ReasonPhrase { get; } = Create<string>("REASON_PHRASE");
public static MetadataName<T> Create<T>(string name) => new MetadataName<T>(name);
}
public sealed class MetadataName<T> : IEquatable<MetadataName<T>>
{
public MetadataName(string name);
public string Name { get; }
}
} |
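The public/`Core` method pairs in the reviewed shape (`Acquire`/`AcquireCore`, `WaitAsync`/`WaitAsyncCore`) follow the template method pattern mentioned earlier in the thread. A minimal sketch of that pattern is below. `SketchLimiter` and `CountingLimiter` are hypothetical stand-ins that return a plain `bool` instead of a `RateLimitLease`, and placing argument validation in the public method is an assumption about why the split exists, not something the review notes state.

```csharp
using System;

// Hypothetical stand-in for the abstract limiter; not the reviewed RateLimiter type.
public abstract class SketchLimiter
{
    // Public entry point: shared validation runs once here,
    // then the call is forwarded to the subclass hook.
    public bool Acquire(int permitCount = 1)
    {
        if (permitCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(permitCount));
        }
        return AcquireCore(permitCount);
    }

    // Subclasses implement only the accounting logic.
    protected abstract bool AcquireCore(int permitCount);
}

// Hypothetical subclass with a fixed pool of permits (not thread-safe).
public sealed class CountingLimiter : SketchLimiter
{
    private int _available;

    public CountingLimiter(int permits) => _available = permits;

    protected override bool AcquireCore(int permitCount)
    {
        if (permitCount > _available)
        {
            return false;
        }
        _available -= permitCount;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var limiter = new CountingLimiter(2);
        Console.WriteLine(limiter.Acquire(2)); // True
        Console.WriteLine(limiter.Acquire());  // False
    }
}
```

With this split, every subclass gets the same argument checks without repeating them, and the abstract surface that implementers must write stays small.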
namespace System.Threading.RateLimiting
{
// This specifies the behaviour of `WaitAsync` when PermitLimit has been reached
public enum QueueProcessingOrder
{
ProcessOldest,
ProcessNewest
}
public sealed class ConcurrencyLimiterOptions
{
public ConcurrencyLimiterOptions(int permitLimit, QueueProcessingOrder queueProcessingOrder, int queueLimit);
// Specifies the maximum number of permits for the limiter
public int PermitLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
}
public sealed class TokenBucketRateLimiterOptions
{
public TokenBucketRateLimiterOptions(
int tokenLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit,
TimeSpan replenishmentPeriod,
int tokensPerPeriod,
bool autoReplenishment = true);
// Specifies the maximum number of permits for the limiter
public int TokenLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
// Specifies the period between replenishments
public TimeSpan ReplenishmentPeriod { get; }
// Specifies how many tokens to restore each replenishment
public int TokensPerPeriod { get; }
// Whether to create a timer to trigger replenishment automatically
// This parameter is optional
public bool AutoReplenishment { get; }
}
// Window based rate limiter options
public sealed class FixedWindowRateLimiterOptions
{
public FixedWindowRateLimiterOptions(
int permitLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit,
TimeSpan window,
bool autoRefresh = true);
// Specifies the maximum number of permits for the limiter
public int PermitLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
// Specifies the duration of the window where the rate limit is applied
public TimeSpan Window { get; }
public bool AutoRefresh { get; }
}
public sealed class SlidingWindowRateLimiterOptions
{
public SlidingWindowRateLimiterOptions(
int permitLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit,
TimeSpan window,
int segmentsPerWindow,
bool autoRefresh = true);
// Specifies the maximum number of permits for the limiter
public int PermitLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
// Specifies the duration of the window where the rate limit is applied
public TimeSpan Window { get; }
// Specifies the number of segments the Window should be divided into
public int SegmentsPerWindow { get; }
public bool AutoRefresh { get; }
}
// Limiter implementations
public sealed class ConcurrencyLimiter : RateLimiter
{
public ConcurrencyLimiter(ConcurrencyLimiterOptions options);
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public sealed class TokenBucketRateLimiter : RateLimiter
{
public TokenBucketRateLimiter(TokenBucketRateLimiterOptions options);
// Attempts to replenish the bucket; returns true if enough time had elapsed and it replenishes; otherwise, false.
public bool TryReplenish();
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public sealed class FixedWindowRateLimiter : RateLimiter
{
public FixedWindowRateLimiter(FixedWindowRateLimiterOptions options);
public bool TryRefresh();
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public sealed class SlidingWindowRateLimiter : RateLimiter
{
public SlidingWindowRateLimiter(SlidingWindowRateLimiterOptions options);
public bool TryRefresh();
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
} |
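To make the token bucket options above more concrete, here is a minimal sketch of the accounting they describe: a bucket capped at `tokenLimit` that gains `tokensPerPeriod` tokens once `replenishmentPeriod` has elapsed. `SketchTokenBucket` is a hypothetical class, not the proposed `TokenBucketRateLimiter`. It has no queueing, is not thread-safe, and takes the current time as an explicit argument (an assumption made here so the example is deterministic).

```csharp
using System;

// Hypothetical illustration of token-bucket accounting; not the proposed implementation.
public sealed class SketchTokenBucket
{
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _replenishmentPeriod;
    private int _tokens;
    private DateTime _lastReplenishment;

    public SketchTokenBucket(int tokenLimit, TimeSpan replenishmentPeriod, int tokensPerPeriod, DateTime now)
    {
        _tokenLimit = tokenLimit;
        _replenishmentPeriod = replenishmentPeriod;
        _tokensPerPeriod = tokensPerPeriod;
        _tokens = tokenLimit; // start with a full bucket
        _lastReplenishment = now;
    }

    public int AvailableTokens => _tokens;

    // All-or-nothing acquisition: either every requested token is taken or none is.
    public bool TryAcquire(int permitCount)
    {
        if (permitCount > _tokens)
        {
            return false;
        }
        _tokens -= permitCount;
        return true;
    }

    // Returns true only if a full period has elapsed since the last replenishment.
    public bool TryReplenish(DateTime now)
    {
        if (now - _lastReplenishment < _replenishmentPeriod)
        {
            return false;
        }
        _tokens = Math.Min(_tokenLimit, _tokens + _tokensPerPeriod);
        _lastReplenishment = now;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var start = new DateTime(2021, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var bucket = new SketchTokenBucket(5, TimeSpan.FromSeconds(1), 2, start);

        Console.WriteLine(bucket.TryAcquire(5));                            // True
        Console.WriteLine(bucket.TryAcquire(1));                            // False
        Console.WriteLine(bucket.TryReplenish(start.AddMilliseconds(500))); // False
        Console.WriteLine(bucket.TryReplenish(start.AddSeconds(1)));        // True
        Console.WriteLine(bucket.AvailableTokens);                          // 2
    }
}
```

In this sketch, calling `TryReplenish` from one shared timer corresponds to the `autoReplenishment: false` configuration, where the limiter does not create a timer of its own.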
What is going to happen to this now? The labels suggest this will make it into .NET 6, has work on this started already / is it assigned to someone? Would a community contribution be helpful here? I'm very excited to get my hands on this 😄 |
@HurricanKai you should hopefully see a WIP draft pull request that will reference this issue. Excited for this too! |
Looking forward to it! |
Moving to 7.0 as these APIs won't be ready for 6. |
In many networking scenarios, rate limits change dynamically based on messages being received, e.g. based on retry-after HTTP headers. It seems to me like a "message" needs to be one of the inputs to the APIs computing the limits. |
Based on the examples, I can see how this could be considered a solution for rate limiting in HTTP applications. Some limitations I can see are:
|
The need to always check the "did I really get the lease I wanted" property (which is currently IsAcquired, I think?) after every call is a real deal breaker for me. That's far too easy to mess up... to the point that I would probably just keep copy/pasting the same wrapper classes I paste into every project I start rather than adopt this. And in many scenarios, you won't realize you've got incorrect code until you try to run your workloads at scale, since the only interaction will be Acquire/Dispose, and whatever the limit was trying to limit isn't limited... and then you have an outage. That seems like a troubling pattern. |
While working through the API reviews above and implementing the
For those interested in trying out the rate limiting APIs and current implementations you can find the package |
cc @stephentoub |
I think this is very exciting work (also the larger work in ASP.NET). If a
Here it says that it's a request and not a lease per-se. I read the discussion of the alternative design (returning bool). I think we can have the best of both worlds:
The current design reminds me of the regex design where you get back a What is |
Added in #61788 |
This is a summary of the Design doc.
Background and Motivation
Outages caused when system activities exceed the system’s capacity are a leading concern in system design. The ability to handle system activity efficiently, and to gracefully limit the execution of activities before the system is under stress, is fundamental to system resiliency. .NET does not have a standardized means for expressing and managing the rate limiting logic needed to produce a resilient system. This adds complexity to designing and developing resilient software in .NET by introducing an easy vector for competing rate limiting logic and anti-patterns. A standardized interface in .NET for limiting activities will make it easier for developers to build resilient systems for all scales of deployment and workload.
Users will interact with the proposed APIs in order to ensure rate and/or concurrency limits are enforced. This abstraction requires explicit release semantics to accommodate non-self-replenishing (i.e. concurrency) limits, similar to how Semaphores operate. The abstraction also accounts for self-replenishing (i.e. rate) limits, where no explicit release semantics are needed because the permits are replenished automatically over time. This component encompasses the Acquire/WaitAsync mechanics (i.e. check vs wait behaviours), and default implementations will be provided for selected accounting methods (fixed window, sliding window, token bucket, simple concurrency). The return type is a RateLimitLease type which indicates whether acquisition is successful and manages the lifecycle of the acquired permits.
Proposed API - Abstractions
The Acquire call represents a fast synchronous check that immediately returns whether there are enough permits available to continue with the operation and atomically acquires them if there are. It returns a RateLimitLease, with the value RateLimitLease.IsAcquired representing whether the acquisition is successful and the lease itself representing the acquired permits, if successful. The user can pass in a permitCount of 0 to check whether the permit limit has been reached without acquiring any permits.
WaitAsync, on the other hand, represents an awaitable request to check whether permits are available. If permits are available, it obtains the permits and returns immediately with a RateLimitLease representing the acquired permits. If the permits are not available, the caller is willing to pause the operation and wait until the necessary permits become available. The user can also pass in a permitCount of 0, which indicates that the user wants to wait until more permits become available.
GetAvailablePermits() is envisioned as a flexible and simple way for the limiter to communicate its status to the user. This count is similar in essence to SemaphoreSlim.CurrentCount. This count can also be used in diagnostics to track the usage of the rate limiter.
The abstract class RateLimitLease is used to facilitate the release semantics of rate limiters. That is, for non-self-replenishing limiters, the permits obtained via Acquire/WaitAsync are returned by disposing the RateLimitLease. This ensures that the user can't release more permits than were obtained.
The RateLimitLease.IsAcquired property is used to express whether the acquisition request was successful. TryGetMetadata() is implemented by subclasses to allow for returning additional metadata as part of the rate limit decision. A curated list of well-known names for commonly used metadata is provided via MetadataName, which keeps a list of MetadataName<T>s, which are wrappers of string and a type parameter indicating the value type. To optimize performance, implementations will need to pool RateLimitLease.
Usage Examples
For components enforcing limits, the standard usage pattern will be:
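A minimal sketch of that pattern, assuming the abstraction has the shape described in the prose of this proposal (`Acquire`, `WaitAsync`, `GetAvailablePermits()`, and a disposable `RateLimitLease` exposing `IsAcquired`). The `RateLimiter` and `RateLimitLease` types below are stand-ins written for this example, and `SemaphoreBackedLimiter` and `LimitedWork.RunAsync` are hypothetical names that are not part of the proposal. The sketch only handles `permitCount == 1`.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-ins reconstructed from the prose of this proposal (assumed shapes).
public abstract class RateLimitLease : IDisposable
{
    public abstract bool IsAcquired { get; }
    public abstract void Dispose();
}

public abstract class RateLimiter
{
    public abstract int GetAvailablePermits();
    public abstract RateLimitLease Acquire(int permitCount);
    public abstract ValueTask<RateLimitLease> WaitAsync(int permitCount, CancellationToken cancellationToken);
}

// Hypothetical concurrency limiter backed by SemaphoreSlim; single permits only.
public sealed class SemaphoreBackedLimiter : RateLimiter
{
    private readonly SemaphoreSlim _semaphore;

    public SemaphoreBackedLimiter(int permitLimit) =>
        _semaphore = new SemaphoreSlim(permitLimit, permitLimit);

    public override int GetAvailablePermits() => _semaphore.CurrentCount;

    public override RateLimitLease Acquire(int permitCount)
    {
        EnsureSinglePermit(permitCount);
        // Wait(0) returns immediately: true if a permit was taken, false otherwise.
        return new Lease(_semaphore.Wait(0) ? _semaphore : null);
    }

    public override async ValueTask<RateLimitLease> WaitAsync(int permitCount, CancellationToken cancellationToken)
    {
        EnsureSinglePermit(permitCount);
        await _semaphore.WaitAsync(cancellationToken).ConfigureAwait(false);
        return new Lease(_semaphore);
    }

    private static void EnsureSinglePermit(int permitCount)
    {
        if (permitCount != 1)
            throw new NotSupportedException("This sketch only handles permitCount == 1.");
    }

    private sealed class Lease : RateLimitLease
    {
        private SemaphoreSlim? _owner; // null when nothing was acquired or already released

        public Lease(SemaphoreSlim? owner)
        {
            _owner = owner;
            IsAcquired = owner != null;
        }

        public override bool IsAcquired { get; }

        // Idempotent: only the first Dispose returns the permit.
        public override void Dispose() => Interlocked.Exchange(ref _owner, null)?.Release();
    }
}

public static class LimitedWork
{
    // Tries the fast synchronous check first, then falls back to waiting.
    // Returns false if the limiter rejected the request.
    public static async Task<bool> RunAsync(
        RateLimiter limiter, Func<Task> work, CancellationToken cancellationToken = default)
    {
        RateLimitLease lease = limiter.Acquire(1);
        if (!lease.IsAcquired)
        {
            lease.Dispose();
            lease = await limiter.WaitAsync(1, cancellationToken).ConfigureAwait(false);
        }

        using (lease) // disposing the lease returns the permit
        {
            if (!lease.IsAcquired) return false;
            await work().ConfigureAwait(false);
            return true;
        }
    }
}
```

Because the permit is returned by disposing the lease, the caller never passes a release count, which is the property the proposal relies on to prevent releasing more permits than were obtained.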
Proposed API - Concrete Implementations
For more details on how these options work, see the Design Doc.
Adoption samples
This API will be used in implementing ASP.NET Core middleware in .NET 6.0 and can be useful in implementing limits for various BCL types in the future including:
Sample implementation in Channels; note that this uses a slightly outdated API.
For more theoretical samples of `RateLimiter` implementations, see the Proof of Concepts in the Design Doc.
We also expect adoption for enforcing limits in YARP as well as conversion of existing implementations in ATS and ACR.
Alternative Designs
Token bucket rate limiter external replenishment
The default implementation will allocate a new `System.Threading.Timer` to trigger permit replenishment. This can be expensive when many limiters are in use, and a better pattern is to trigger the replenishment via a single `Timer`. The current proposal has two APIs to support this: a `public void Replenish()` on the limiter and a `public bool AutoReplenishment { get; set; }` on the options class.
Subclasses can override default behaviour
Instead of exposing the two APIs, we can make the class extensible and allow subclasses to add the `Replenish()` method as well as the external replenishment functionality. However, `AutoReplenishment` would still need to exist so the default implementation knows whether a Timer needs to be created.
Heuristics based replenishment
We can rely on recomputing the permit count, based on how long it has been since the last replenishment, on every invocation of `Acquire`, `WaitAsync` and `GetAvailablePermits`. However, we'll still need to allocate a Timer to process queued `WaitAsync` calls.
Separate abstractions for rate and concurrency limits
A design where rate limits and concurrency limits were expressed by separate abstractions was considered. That design more clearly expresses the intended use pattern, where rate limits do not need to return a `RateLimitLease` and do not possess release semantics. In comparison, in the proposed design the release semantics for rate limits will no-op.
However, the separate design has a drawback for consumers of rate limits, since there are two possible limiter types that can be specified by the user. To alleviate some of the complexity, a wrapper for rate limits was considered. However, the complexity of this design was deemed undesirable and a unified abstraction for rate and concurrency limits was preferred.
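To illustrate the unified abstraction with no-op release for rate limits, here is a rough sketch under simplifying assumptions: one permit per call, no queuing, and a manual `StartNewWindow()` method standing in for timer-driven replenishment. All type names (`LeaseSketch`, `LimiterSketch`, `NoOpLease`, `ConcurrencyLimiterSketch`, `FixedWindowLimiterSketch`) are hypothetical and chosen for this example only.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical, simplified shapes; the names are illustrative, not the proposed API.
public abstract class LeaseSketch : IDisposable
{
    public abstract bool IsAcquired { get; }
    public abstract void Dispose();
}

public abstract class LimiterSketch
{
    protected int _permits; // permits currently available

    protected LimiterSketch(int permitLimit) => _permits = permitLimit;

    public int AvailablePermits => Volatile.Read(ref _permits);

    public abstract LeaseSketch Acquire();

    // Atomically takes one permit if any is left.
    protected bool TryTakeOne()
    {
        while (true)
        {
            int seen = Volatile.Read(ref _permits);
            if (seen <= 0) return false;
            if (Interlocked.CompareExchange(ref _permits, seen - 1, seen) == seen) return true;
        }
    }
}

// Shared lease instances whose Dispose does nothing.
public sealed class NoOpLease : LeaseSketch
{
    public static readonly NoOpLease Acquired = new NoOpLease(true);
    public static readonly NoOpLease Rejected = new NoOpLease(false);

    private NoOpLease(bool isAcquired) => IsAcquired = isAcquired;

    public override bool IsAcquired { get; }
    public override void Dispose() { }
}

// Concurrency limit: the permit comes back only when the lease is disposed.
public sealed class ConcurrencyLimiterSketch : LimiterSketch
{
    public ConcurrencyLimiterSketch(int permitLimit) : base(permitLimit) { }

    public override LeaseSketch Acquire()
    {
        if (TryTakeOne()) return new ReturningLease(this);
        return NoOpLease.Rejected;
    }

    private sealed class ReturningLease : LeaseSketch
    {
        private ConcurrencyLimiterSketch? _owner;

        public ReturningLease(ConcurrencyLimiterSketch owner) => _owner = owner;

        public override bool IsAcquired => true;

        // Idempotent: the permit is returned at most once.
        public override void Dispose()
        {
            ConcurrencyLimiterSketch? owner = Interlocked.Exchange(ref _owner, null);
            if (owner != null) Interlocked.Increment(ref owner._permits);
        }
    }
}

// Rate limit: permits are replenished per window, so disposing a lease is a no-op.
public sealed class FixedWindowLimiterSketch : LimiterSketch
{
    private readonly int _permitLimit;

    public FixedWindowLimiterSketch(int permitLimit) : base(permitLimit) =>
        _permitLimit = permitLimit;

    public override LeaseSketch Acquire() =>
        TryTakeOne() ? NoOpLease.Acquired : NoOpLease.Rejected;

    // Stand-in for the timer callback that would start a new window.
    public void StartNewWindow() => Volatile.Write(ref _permits, _permitLimit);
}
```

A consumer can wrap either limiter's lease in `using` without knowing which kind it holds, which is the consistency the unified abstraction aims for. The cost is that rate limiters hand out a lease whose disposal does nothing.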
A struct instead of class for RateLimitLease
This approach was considered since allocating a new `RateLimitLease` for each acquisition request is considered to be a performance bottleneck. The design evolved to the following:
However, this design became problematic with the consideration of including an `AggregatedRateLimiter<TKey>`, which necessitates the existence of another struct, `RateLimitLease<TKey>`, with a private reference to the `AggregatedRateLimiter<TKey>`. This bifurcation of the return types of `Acquire` and `WaitAsync` between the `AggregatedRateLimiter<TKey>` and `RateLimiter` makes it very difficult to consume aggregated and simple limiters in a consistent manner. Additional complexity in defining an API to store and retrieve additional metadata is also a concern, see below. For this reason, it is better to make `RateLimitLease` a class instead of a struct and require implementations to pool if optimization for performance is required.
Additional concerns that needed to be resolved for a struct `RateLimitLease` are elaborated below:
Permit as reference ID
There was an alternative proposal where the struct only contains a reference ID and additional APIs on the `RateLimiter` instance are used to return permits and obtain additional metadata. This is equivalent to the `RateLimiter` internally tracking outstanding permit leases and allowing permit release via `RateLimiter.Release(RateLimitLease.ID)` or retrieval of additional metadata via `RateLimiter.TryGetMetadata(RateLimitLease.ID, MetadataName)`. This shifts the need to pool data structures for tracking idempotency of `Dispose` and additional metadata to the `RateLimiter` implementation itself. This additional indirection doesn't resolve the bifurcation issue mentioned previously and necessitates additional APIs on the `RateLimiter` that are hard to use and implement; as such, this alternative was not chosen.
RateLimitLease state
The current proposal uses an `object State` to communicate additional information on a rate limit decision. This is the most general way to provide additional information since the `RateLimiter` can add any arbitrary type or collection via `object State`. However, there is a tradeoff between the generality and flexibility of this approach and its usability. For example, we have gotten feedback from ATS that they want a simpler way to specify a set of values such as RetryAfter, error codes, or percentage of permits used. As such, here are several design alternatives.
Interfaces
One option to support access to values is to keep the `object State` but require limiters to set a state that implements different interfaces. For example, there could be an `IRateLimiterRetryAfterHeaderValue` interface that looks like:
Consumers of the `RateLimiter` would then check if the `State` object implements the interface before retrieving the value. It also puts a burden on the implementers of `RateLimiter`s since they should also define a set of interfaces to represent commonly used values.
Property bags
Property bags like `Activity.Baggage` and `Activity.Tags` are very well suited to store the values that were identified by the ATS team. For web workloads where these values are likely to be header and header value pairs, this is a good way to express the `State` field on `RateLimitLease`. Specifically, the type would be either:
Option 1:
IReadOnlyDictionary<string, string?> State
However, there is a drawback here in terms of generality since it would mean that we are opinionated about the type of keys and values as strings. Alternatively we can modify this to be:
Option 2:
IReadOnlyDictionary<string, object?> State
This is slightly more flexible since the value can be any type. However, to use these values, the user would need to know ahead of time what the value type for each key is and downcast the object to that type. Going one step further:
Option 3:
IReadOnlyDictionary<object, object?> State
This gives the most flexibility in the property bag, since we are no longer opinionated about the key type. But the same issue with option 2 remains and it's unclear whether this generality of key type would actually be useful.
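The downcast burden of Option 2 can be made concrete with a small sketch. It contrasts reading an untyped value out of an `IReadOnlyDictionary<string, object?>` with reading it through a typed name wrapper in the spirit of the `MetadataName<T>` described earlier. Everything here is an assumption for illustration: the `MetadataNameSketch<T>` type, the `PropertyBagDemo` helpers, and the `"RETRY_AFTER"` key are hypothetical and do not appear in the proposal.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical wrapper pairing a string key with the expected value type.
public sealed class MetadataNameSketch<T>
{
    public MetadataNameSketch(string name) => Name = name;

    public string Name { get; }
}

public static class PropertyBagDemo
{
    // Hypothetical well-known name; the key string is an assumption.
    public static readonly MetadataNameSketch<TimeSpan> RetryAfter =
        new MetadataNameSketch<TimeSpan>("RETRY_AFTER");

    // Option 2 style: the caller must know both the key and the runtime type.
    public static TimeSpan? ReadUntyped(IReadOnlyDictionary<string, object?> state)
    {
        if (state.TryGetValue("RETRY_AFTER", out object? raw) && raw is TimeSpan delay)
        {
            return delay;
        }
        return null;
    }

    // Typed-name style: the expected type travels with the name, so the cast lives in one place.
    public static bool TryGet<T>(
        IReadOnlyDictionary<string, object?> state,
        MetadataNameSketch<T> name,
        [MaybeNullWhen(false)] out T value)
    {
        if (state.TryGetValue(name.Name, out object? raw) && raw is T typed)
        {
            value = typed;
            return true;
        }
        value = default;
        return false;
    }
}
```

In both styles a limiter that stores the wrong type under the key (for example a string instead of a `TimeSpan`) is only detected at run time. The typed name centralizes the check but does not remove it.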
Feature collection
Another way to represent the `State` would be something like an `IFeatureCollection`. The benefit of this interface is that it is general enough to contain any type of value, while specific implementations can optimize for commonly accessed fields by accessing them directly (e.g. https://github.com/dotnet/aspnetcore/blob/52eff90fbcfca39b7eb58baad597df6a99a542b0/src/Http/Http/src/DefaultHttpContext.cs).
A `bool` returned by `TryAcquire` to indicate success/failure and throw for `WaitAsync` to indicate failure
An earlier iteration proposed the following API instead:
This was proposed since the method name `TryAcquire` seemed to convey the idea that it is a quick synchronous check. However, this also forced the API, by convention, to return `bool` and to return additional information via out parameters. If a limiter wants to communicate a failure for a `WaitAsync`, it would throw an exception. This may occur if the limiter has reached the hard cap. The drawback here is that these failures, which may be frequent depending on the scenario, will necessitate an allocation of an `Exception` type.
Another alternative was identified with `WaitAsync` returning a tuple, i.e. `ValueTask<(bool, RateLimitLease)> WaitAsync(...)`. The consumption pattern would then look like:
Release APIs on RateLimiter
Instead of using `RateLimitLease` to track release of permits, an alternative approach proposes adding a `void Release(int releaseCount)` method on `RateLimiter` and requiring users to call this method explicitly. However, this requires the user to call release with the correct count, which can be error prone, so the `RateLimitLease` approach was preferred.
Partial acquisition and release
Currently, the acquisition and release of permits is all-or-nothing.
Additional APIs will be needed to allow a caller to acquire only part of the requested permits. For example, a caller requests 5 permits but is willing to accept a subset if not all 5 are available.
Similarly, additional APIs can be added to `RateLimitLease` to facilitate the release of a part of the acquired permits. For example, 5 permits are obtained, but as processing continues, each permit can be released individually.
These APIs are not included in this proposal since no concrete use cases have been identified so far.
Risks
This is a proposal for a new API, and the main concerns include: