fix: ignore agentpool label when looking for similar node groups with Azure provider #2094
[APPROVALNOTIFIER] This PR is NOT APPROVED. It still needs approval from an approver for each of the affected files.
To allow scaling similar node pools simultaneously, or when using separate node groups per zone and to keep nodes balanced across zones, use the `--balance-similar-node-groups` flag. Add it to the `command` section to enable it:

```yaml
- --balance-similar-node-groups=true
```
@feiskyer do you know if this flag defaults to true?
the default value is false.
The whole point of making node group comparison a 'processor' (i.e., hiding it behind an interface) was that such cloud-provider-specific changes can be easily isolated. Please check this commit, which shows code addressing the exact same problem for GKE: 34a4262#diff-83aae18ad286ee688abea380ae132efa (the commit removes the relevant code, since GKE is now supported from a fork and there was no reason to keep GKE-specific logic in the main repo, but the code still illustrates our preferred pattern).
@CecileRobertMichon Could you update the PR according to 34a4262#diff-83aae18ad286ee688abea380ae132efa?
processors.PodListProcessor = core.NewFilterOutSchedulablePodListProcessor()
if autoscalingOptions.CloudProviderName == azure.ProviderName {
	processors.NodeGroupSetProcessor = &nodegroupset.BalancingNodeGroupSetProcessor{
		Comparator: nodegroupset.IsAzureNodeInfoSimilar}
this overrides the node comparator for Azure
	return true
}

func compareLabels(nodes []*schedulernodeinfo.NodeInfo, ignoredLabels map[string]bool) bool {
refactored this into a function
// somewhat arbitrary, but generally we check if resources provided by both nodes
// are similar enough to likely be the same type of machine and if the set of labels
// is the same (except for a pre-defined set of labels like hostname or zone).
func IsNodeInfoSimilar(n1, n2 *schedulernodeinfo.NodeInfo) bool {
this is a thin wrapper around the existing function that provides the default set of ignored labels for all cloud providers.
@MaciekPytel thanks for the suggestion. I've updated the PR, let me know what you think. I wasn't able to just reuse the GKE implementation (although I did reuse the same pattern for overriding the node comparator) because the requirement here is slightly different: the GKE implementation treats two nodes as "similar" if they have the same node pool label, regardless of their other characteristics. What we're trying to do for Azure is to treat two node groups as "similar" even if their node pool label is _different_, provided that everything else matches. This lets agent pools with the same configuration be considered similar even though their "agentpool" labels differ (since they are in different pools). @feiskyer thoughts?
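For readers following the thread, the label part of that comparison can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from the PR: `labelsSimilar` and the sample label sets are hypothetical, nodes are reduced to plain label maps, and the resource comparison that the real comparator also performs is left out.

```go
package main

import "fmt"

// labelsSimilar is a hypothetical helper: it reports whether two label sets
// are identical once every key listed in ignored has been dropped.
func labelsSimilar(a, b map[string]string, ignored map[string]bool) bool {
	// filter returns a copy of in without the ignored keys.
	filter := func(in map[string]string) map[string]string {
		out := make(map[string]string, len(in))
		for k, v := range in {
			if !ignored[k] {
				out[k] = v
			}
		}
		return out
	}
	fa, fb := filter(a), filter(b)
	if len(fa) != len(fb) {
		return false
	}
	for k, v := range fa {
		if w, ok := fb[k]; !ok || w != v {
			return false
		}
	}
	return true
}

func main() {
	// Two nodes from different agent pools with otherwise matching labels.
	pool1 := map[string]string{"agentpool": "pool1", "kubernetes.io/hostname": "node-a", "size": "large"}
	pool2 := map[string]string{"agentpool": "pool2", "kubernetes.io/hostname": "node-b", "size": "large"}

	// Assumed ignore sets, for illustration only.
	defaultIgnored := map[string]bool{"kubernetes.io/hostname": true}
	azureIgnored := map[string]bool{"kubernetes.io/hostname": true, "agentpool": true}

	fmt.Println(labelsSimilar(pool1, pool2, defaultIgnored)) // false: "agentpool" differs
	fmt.Println(labelsSimilar(pool1, pool2, azureIgnored))   // true: "agentpool" is ignored
}
```

With the default ignore set the two pools never match, which is the behavior the PR description reports; adding "agentpool" to the ignored keys is what lets them be balanced together.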
cluster-autoscaler/processors/nodegroupset/compare_nodegroups.go
/lgtm
@MaciekPytel @losipiuk Could you help to take a look at this PR again?
@MaciekPytel @losipiuk please take a look.
Makes sense, but we cannot explicitly call the Azure cloud provider (CP) from main, as we have build tags that allow building without the Azure code at all. Could you think of a way to extend the CP interface to hide such CP-specific customization behind it? Something like CP.setupProcessors(originalProcessors). This would allow a CP to replace, wrap, or extend the set of processors as needed.
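One way to picture the suggested hook is the sketch below. Everything in it is hypothetical and simplified for illustration: the `CloudProvider` interface, the `SetupProcessors` method, the `Processors` struct, and both providers are stand-ins invented here, not the autoscaler's actual types or the design that was eventually merged.

```go
package main

import "fmt"

// Comparator decides whether two node groups, reduced here to label maps, are similar.
type Comparator func(a, b map[string]string) bool

// Processors is a hypothetical stand-in for the autoscaler's processor set.
type Processors struct {
	Comparator Comparator
}

// CloudProvider is a hypothetical, pared-down interface carrying the proposed hook.
type CloudProvider interface {
	// SetupProcessors may return the defaults unchanged or a customized copy.
	SetupProcessors(defaults *Processors) *Processors
}

// equalExcept reports whether a and b hold the same labels, ignoring the key skip.
func equalExcept(a, b map[string]string, skip string) bool {
	for k, v := range a {
		if w, ok := b[k]; k != skip && (!ok || w != v) {
			return false
		}
	}
	for k := range b {
		if _, ok := a[k]; k != skip && !ok {
			return false
		}
	}
	return true
}

// genericProvider keeps the default processors.
type genericProvider struct{}

func (genericProvider) SetupProcessors(defaults *Processors) *Processors { return defaults }

// azureProvider swaps in a comparator that ignores the "agentpool" label.
type azureProvider struct{}

func (azureProvider) SetupProcessors(defaults *Processors) *Processors {
	custom := *defaults // copy so the shared defaults stay untouched
	custom.Comparator = func(a, b map[string]string) bool { return equalExcept(a, b, "agentpool") }
	return &custom
}

// buildProcessors is what main would call; it never names a concrete provider.
func buildProcessors(cp CloudProvider) *Processors {
	defaults := &Processors{
		Comparator: func(a, b map[string]string) bool { return equalExcept(a, b, "") },
	}
	return cp.SetupProcessors(defaults)
}

func main() {
	p1 := map[string]string{"agentpool": "pool1", "size": "large"}
	p2 := map[string]string{"agentpool": "pool2", "size": "large"}
	fmt.Println(buildProcessors(genericProvider{}).Comparator(p1, p2)) // false
	fmt.Println(buildProcessors(azureProvider{}).Comparator(p1, p2))   // true
}
```

The point of the design is that `buildProcessors` depends only on the interface, so a build that excludes one provider's code would still compile.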
#2171 is addressing #2094 (comment), so let's close this PR in favor of #2171.
/close
@feiskyer: Closed this PR.
Related to #2044
Currently, the cluster-autoscaler never puts two different agent pools in the scale-up plan when used with Azure clusters deployed with aks-engine or AKS. Each node has an "agentpool" label that identifies its node pool. This PR adds the "agentpool" label to the ignored labels of `IsNodeInfoSimilar` in order to allow two similar node groups with different "agentpool" labels to be detected.
Also added documentation on using `--balance-similar-node-groups` in the Azure README.