Fix Azure client requests stuck issues on http.StatusTooManyRequests #2151
@feiskyer: GitHub didn't allow me to request PR reviews from the following users: nilo19. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@feiskyer In the Azure cloud provider, we would still retry after hitting a 409. Since it uses exponential backoff, is that safe in the Azure cloud provider code?
Yep, it's safe from the SDK's client side. CA doesn't need backoff retries since it will run again later. The issue comes from go-autorest's internal implementation; we should also add some fixes in the Kubernetes cloud provider. By the way, the error code here is 429, not 409.
/lgtm
Rebased to resolve the conflicts.
@andyzhangx @losipiuk Could you help approve the changes?
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED.
This pull-request has been approved by: losipiuk
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
In the go-autorest SDK (https://github.com/Azure/go-autorest/blob/master/autorest/sender.go#L242), if ARM returns http.StatusTooManyRequests, the sender doesn't increase the retry attempt count, so the Azure clients keep retrying forever until they get a status code other than 429.
This PR therefore explicitly removes http.StatusTooManyRequests from autorest.StatusCodesForRetry. To reduce API throttling issues, it also adds caches for VMSS instances.
Fix #2124.
/cc @andyzhangx @nilo19