Allow scaling multiple Azure vmss synchronously #2152
Conversation
@feiskyer: GitHub didn't allow me to request PR reviews from the following users: nilo19. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The PR title should be "Allow scaling multiple Azure vmss simultaneously"; you are actually scaling in an asynchronous way.
upcomingNodes = append(upcomingNodes, nodeTemplate.Node())
// Ensure new nodes have different names because nodeName would be used as a map key.
node := nodeTemplate.Node().DeepCopy()
node.Name = fmt.Sprintf("%s-%d", node.Name, rand.Int63())
will this affect non-Azure nodes as well?
It feels like a serious bug. Thanks for spotting and fixing that.
One note here: could you please use a counter (increased atomically using AddUint64) instead of a random value?
Also, could we mutate the UID too? We should probably extract a helper function for that, e.g. buildNodeForNodeTemplate?
cc: @vivekbagade. Vivek, could you please scan the code and check whether there are any more places where we use node name as a dictionary key?
We could also add a sanity check in filterOutSchedulableUsingPacking to detect the situation when it gets a list of nodes with repeating names.
Also, we should probably use UID (not name) as the map key.
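For concreteness, a minimal sketch of the counter-based variant suggested above. The PR ultimately passes an index instead (see the buildNodeForNodeTemplate diff further down), and everything beyond the proposed helper name is illustrative:

```go
package core

import (
	"fmt"
	"sync/atomic"

	apiv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	schedulernodeinfo "k8s.io/kubernetes/pkg/scheduler/nodeinfo"
)

// upcomingNodeCounter is a hypothetical package-level counter, incremented
// atomically so concurrent scale-ups never produce duplicate suffixes.
var upcomingNodeCounter uint64

// buildNodeForNodeTemplate deep-copies the template node and gives the copy a
// unique name and UID, so template-derived nodes cannot be merged when node
// name (or UID) is used as a map key.
func buildNodeForNodeTemplate(nodeTemplate *schedulernodeinfo.NodeInfo) *apiv1.Node {
	suffix := atomic.AddUint64(&upcomingNodeCounter, 1)
	node := nodeTemplate.Node().DeepCopy()
	node.Name = fmt.Sprintf("%s-%d", node.Name, suffix)
	node.UID = types.UID(fmt.Sprintf("%s-%d", string(node.UID), suffix))
	return node
}
```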
We use node name as a dictionary key in a few places. We need to check whether any of them could be causing issues. Even if they are not, we should probably change this to avoid future problems.
This is because of the CreateNodeNameToInfoMap func, which could potentially mask a few nodes.
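A self-contained illustration of that masking (not the actual CreateNodeNameToInfoMap code, which builds a map of *NodeInfo, but the overwrite behavior is the same):

```go
package main

import (
	"fmt"

	apiv1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Two upcoming nodes built from the same template share a name.
	nodes := []*apiv1.Node{
		{ObjectMeta: metav1.ObjectMeta{Name: "template-node"}},
		{ObjectMeta: metav1.ObjectMeta{Name: "template-node"}},
	}

	// Keying by name silently overwrites the first entry with the second.
	nodeNameToNode := make(map[string]*apiv1.Node)
	for _, node := range nodes {
		nodeNameToNode[node.Name] = node
	}

	fmt.Println(len(nodeNameToNode)) // 1: one upcoming node has been masked
}
```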
Thanks for the suggestion; I'll update the PR.

> cc: @vivekbagade. Vivek, could you please scan the code and check whether there are any more places where we use node name as a dictionary key?
> We could also add a sanity check in filterOutSchedulableUsingPacking to detect the situation when it gets a list of nodes with repeating names.
> Also, we should probably use UID (not name) as the map key.

That looks good. We can do the check and optimization after the code scan.
> Could you please use a counter (increased atomically using AddUint64) instead of a random value.

Actually, an index would be fine here; the node name is only used for this single filterOutSchedulableUsingPacking() step.
@losipiuk Updated, PTAL
// Invalidate the vmss size cache, so that it will be refetched from the API.
scaleSet.mutex.Lock()
defer scaleSet.mutex.Unlock()
scaleSet.lastRefresh = time.Now().Add(-1 * 15 * time.Second)
Can you explain why we are setting lastRefresh to 15 seconds before Now()? Also, why not use Sub()? https://golang.org/pkg/time/#Time.Sub
The 15 seconds comes from L122; I renamed it to vmssSizeRefreshPeriod in PR #2151 but forgot to change it here. Let me change this to vmssSizeRefreshPeriod as well, for clarity.
Sub is a different use case: it takes a time.Time parameter, not a time.Duration.
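For reference, a small sketch of the distinction, with vmssSizeRefreshPeriod standing in for the constant mentioned above:

```go
package main

import (
	"fmt"
	"time"
)

const vmssSizeRefreshPeriod = 15 * time.Second

func main() {
	now := time.Now()

	// Time.Add takes a time.Duration; adding a negative duration moves the
	// timestamp into the past, which makes the cached value look expired.
	lastRefresh := now.Add(-1 * vmssSizeRefreshPeriod)

	// Time.Sub takes another time.Time and returns a time.Duration, so it
	// answers "how long since", not "what time was it".
	elapsed := now.Sub(lastRefresh)

	fmt.Println(elapsed >= vmssSizeRefreshPeriod) // true: next read hits the API
}
```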
/lgtm
LGTM on the Azure part
return found && oldest.Add(unschedulablePodWithGpuTimeBuffer).After(currentTime)
}

func buildNodeForNodeTemplate(nodeTemplate *schedulernodeinfo.NodeInfo, index int) *apiv1.Node {
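A hypothetical usage sketch of this helper (newNodes is an assumed per-group count of upcoming nodes; upcomingNodes and nodeTemplate are from the diff earlier in this conversation):

```go
// Pass a distinct index for every upcoming node so each template copy
// gets a unique name before reaching the packing step.
for i := 0; i < newNodes; i++ { // newNodes: assumed upcoming-node count
	upcomingNodes = append(upcomingNodes, buildNodeForNodeTemplate(nodeTemplate, i))
}
```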
Looks good.
@vivekbagade is doing some testing on this (thanks!).
I will LGTM when we are done with that.
lgtm
@losipiuk My testing is done. Works as expected. LGTM from my side.
/lgtm
Thanks!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: losipiuk. The full list of commands accepted by this bot can be found here. The pull request process is described here.
@losipiuk Thanks
This PR allows scaling multiple Azure vmss simultaneously by performing the vmss capacity updates asynchronously in separate goroutines, as sketched below.
To make this work, the upcoming nodes from getUpcomingNodeInfos() are also given distinct names, so that multiple nodes won't be merged into one node in filterOutSchedulableByPacking().
Partially fixes #2044 (similar node-name issues are tracked in #2094).
Fixes #1984.
/cc @andyzhangx @nilo19
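A minimal sketch of the goroutine pattern described above, with hypothetical names; the real implementation lives in the Azure cloud provider's ScaleSet code:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// scaleSet is a stand-in for the provider's ScaleSet type.
type scaleSet struct {
	name     string
	mutex    sync.Mutex
	capacity int64
}

// setCapacity simulates the slow Azure API call that resizes one vmss.
func (s *scaleSet) setCapacity(size int64) {
	time.Sleep(100 * time.Millisecond) // stand-in for API latency
	s.mutex.Lock()
	defer s.mutex.Unlock()
	s.capacity = size
}

func main() {
	sets := []*scaleSet{{name: "vmss-a"}, {name: "vmss-b"}, {name: "vmss-c"}}

	// Issue each capacity update in its own goroutine so that one slow
	// vmss resize does not block scaling of the others.
	var wg sync.WaitGroup
	for _, s := range sets {
		wg.Add(1)
		go func(s *scaleSet) {
			defer wg.Done()
			s.setCapacity(5)
		}(s)
	}
	wg.Wait()

	for _, s := range sets {
		fmt.Println(s.name, s.capacity)
	}
}
```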