Disable client-side rate-limiting when AP&F is enabled #111880

@negz

Description

What would you like to be added?

I'd like client-side rate limiting to be disabled in client-go when API server-side Priority and Fairness is enabled.

All client-go-based Kubernetes clients are configured by default to use a token bucket rate limiter in an attempt to avoid overloading the API server. This is a form of open-loop control, in that the clients don't factor in the API server's actual load when deciding whether to rate-limit themselves.
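For illustration, here's a minimal sketch (not any particular tool's code) of where that limiter lives: the `QPS`, `Burst`, and `RateLimiter` fields on `rest.Config`, with the token bucket coming from `k8s.io/client-go/util/flowcontrol`. The values shown simply mirror the current defaults.

```go
package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// Load whatever kubeconfig is current; error handling kept minimal.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	// If neither RateLimiter nor QPS/Burst are set, client-go falls back to a
	// token bucket limiter built from its defaults. Setting the field
	// explicitly makes the open-loop behaviour visible: this client throttles
	// itself to 5 requests per second with bursts of 10, regardless of how
	// loaded the API server actually is.
	cfg.RateLimiter = flowcontrol.NewTokenBucketRateLimiter(5, 10)

	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Printf("client configured: %T\n", clientset)
}
```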

Per the links below, the default rate limit for a particular client is currently 5 qps, bursting to 10 qps. Discovery clients are particularly noisy and have thus had their burst raised, most recently to 300 qps. I believe this number was picked because it's around the number of API groups that need to be discovered (i.e. HTTP requests that need to be made in short order) when @crossplane is installed with all of the "big three" providers (i.e. CRDs representing all AWS, Azure, and GCP APIs).
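As a concrete sketch of the kind of change tools end up making, raising or bypassing those defaults is just a matter of setting fields on the `rest.Config` before building clients; a discovery client built from that config inherits the same limits. The specific numbers below are illustrative, and the negative-QPS trick to disable the limiter is my reading of current client-go behaviour rather than a documented guarantee.

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	// Raise the defaults (5 qps, burst 10) for a chatty client. The discovery
	// client built from this config uses the same limits, which is why
	// discovery-heavy tools keep bumping Burst. Values are illustrative.
	cfg.QPS = 50
	cfg.Burst = 300

	// Assumption: in recent client-go, a negative QPS skips construction of
	// the token bucket limiter entirely, i.e. client-side throttling is off.
	// cfg.QPS = -1

	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		panic(err)
	}
	groups, err := dc.ServerGroups()
	if err != nil {
		panic(err)
	}
	fmt.Printf("discovered %d API groups\n", len(groups.Groups))
}
```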

API Priority and Fairness has been enabled by default (as a beta feature) since Kubernetes v1.20. It allows the API server to prioritize and queue requests, and can let a particular client know when it should limit its request rate by returning a response with HTTP status code 429 "Too Many Requests". REST clients appear to respect this status code and will back off their requests when they encounter it, per https://github.com/kubernetes/client-go/blob/a890e7b/rest/urlbackoff.go.
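To make the contrast with open-loop throttling concrete, here's a minimal standalone sketch of the closed-loop pattern: send requests without self-imposed limits and slow down only when the server answers 429, honouring `Retry-After` if present. This is not client-go's implementation (that lives in `rest/urlbackoff.go` and the surrounding request logic); the URL and retry count are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// getWithServerSideBackoff illustrates closed-loop control: the client sends
// requests at full speed and only slows down when the server answers
// 429 Too Many Requests, waiting as long as Retry-After asks it to.
func getWithServerSideBackoff(url string, maxRetries int) (*http.Response, error) {
	for attempt := 0; ; attempt++ {
		resp, err := http.Get(url)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests || attempt >= maxRetries {
			return resp, nil
		}

		// The server told us to slow down; wait as instructed, or one second
		// by default, then try again.
		delay := time.Second
		if s := resp.Header.Get("Retry-After"); s != "" {
			if secs, err := strconv.Atoi(s); err == nil {
				delay = time.Duration(secs) * time.Second
			}
		}
		resp.Body.Close()
		time.Sleep(delay)
	}
}

func main() {
	resp, err := getWithServerSideBackoff("https://api.example.org/apis", 3)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```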

Disabling client-side rate-limiting appears to be a goal of AP&F per its graduation criteria:

APF allows us to disable client-side rate limiting without causing the apiservers to wedge/crash. Note that there is another level of concern that APF does not attempt to address, which is mismatch between the throughput that various controllers can sustain.

There seems to be a general consensus amongst API machinery maintainers that we should just stop worrying and learn to love AP&F per #105520 (comment).

Why is this needed?

The open-loop nature of client-go's rate limiter means that it's possible (indeed likely) that clients will rate-limit themselves too much or not enough. The former is particularly painful for CLI tools like kubectl, helm, and kpt, where every request delayed by client-side throttling directly delays the command the user is waiting on. Discovery is again particularly painful here: when kubectl's discovery burst was set to 100 qps we saw some commands take more than 5 minutes to complete, with users seeing logs like the one below while waiting for their command to finish.

Waited for 1.033772408s due to client-side throttling, not priority and fairness, request: GET:https://api.example.org/apis/pkg.crossplane.io/v1?timeout=32s

To this end, several tools have bumped or disabled their client-side rate limits, e.g.:

Notably, I'm pretty sure @crossplane will exceed the 300 qps discovery burst fairly soon, which will result in another round of PRs to bump the limit further if we don't remove it entirely.

Labels

kind/cleanup, kind/feature, sig/api-machinery, triage/accepted
