-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Track lifecycle and skip terminating already terminating instances #5411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track lifecycle and skip terminating already terminating instances #5411
Conversation
Welcome @alam0rt! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
Looks like this was already lgtmed by @grosser . Assigning aws owners for approval. /assign @jaypipes @gjtempleton |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One query.
Can you squash this down to one or two commits please, and then happy to get this merged. |
Avoid deadlocking by locking mutex twice Instantiate test instance with lifecycle state Make method private
7cdba40
to
21e1f5b
Compare
@gjtempleton update was pushed |
Thanks! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alam0rt, gjtempleton, grosser The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
… instances (kubernetes#5411) Merge pull request kubernetes#5411 from alam0rt/fix-issue-where-terminating-instances-scale-down-asg-to-min Track lifecycle and skip terminating already terminating instances
…nating-instances-scale-down-asg-to-min Track lifecycle and skip terminating already terminating instances
… instances (kubernetes#5411) Merge pull request kubernetes#5411 from alam0rt/fix-issue-where-terminating-instances-scale-down-asg-to-min Track lifecycle and skip terminating already terminating instances
Patch comes from upstream: 43c511d 21e1f5b kubernetes#5411 This has been modified to apply cleanly to 1.22.3
…nating-instances-scale-down-asg-to-min Track lifecycle and skip terminating already terminating instances
Brings in: kubernetes#5411 kubernetes#6166 But the code is so different I manually brought it up to speed.
bring up AWS patch for termination Brings in: kubernetes#5411 kubernetes#6166 But the code is so different I manually brought it up to speed. Also, I added a patch to make tests pass in the latest version of Go.
…nating-instances-scale-down-asg-to-min Track lifecycle and skip terminating already terminating instances
What type of PR is this?
/kind bug
What this PR does / why we need it:
We encountered an issue where the
curSize
cached value would be decremented every time an already terminating instance was signaled for termination again. This caused the desired size of the ASG to dwindle to min size and was made worse when we introduced a change that kept instances in a terminating state for longer than we previously did.The cause of this it seems is because cluster-autoscaler doesn't take any heed of already terminating instances and blindly decrements the
curSize
value,This PR addresses that by caching the lifecycle state of the instance.
Which issue(s) this PR fixes:
Fixes #4095
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: