Closed
Changes from all commits
428 commits
d9c0c74
Add 2 missing CA flags to param docs
gjtempleton Oct 24, 2018
390a8d1
Merge pull request #1347 from gjtempleton/CA-Param-Docs-Cleanup
k8s-ci-robot Oct 25, 2018
c99b4d1
Add factory for ClusterFeedProvider
anjensan Oct 24, 2018
6e3e9ad
Add factory for Recommender
anjensan Oct 23, 2018
8713582
Move creation of PodLister & oom.Observer into separate func
anjensan Oct 24, 2018
c185529
Merge pull request #1344 from anjensan/component-factories
k8s-ci-robot Oct 25, 2018
6f5e6aa
Move node group balancing to processor
MaciekPytel Oct 23, 2018
283300e
Add informational UncappedTarged field to VPA api.
bskiba Oct 25, 2018
3bad45b
Merge pull request #1349 from bskiba/upcappedTarget
k8s-ci-robot Oct 25, 2018
1894590
Fix Fatalf format string
justinsb Oct 25, 2018
01a56a8
Add GKE-specific NodeGroupSet processor
MaciekPytel Oct 25, 2018
3880db9
Merge pull request #1352 from justinsb/fix_fmt_string
k8s-ci-robot Oct 25, 2018
1b50ebd
Update FAQ on overprovisioning to account for k8s 1.11
MrSaints Oct 26, 2018
4b7637b
Merge pull request #1355 from MrSaints/faq-overprovisioning-in-1.11
k8s-ci-robot Oct 26, 2018
41b0287
NodeGroup.Nodes() return Instance struct instead instance name
losipiuk Oct 23, 2018
f980934
Merge pull request #1340 from losipiuk/lukaszos/nodegroup-nodes-retur…
k8s-ci-robot Oct 26, 2018
21f76e1
Recommender capps recommendation according to policy.
bskiba Oct 25, 2018
eab0c09
gce: increase test timeout in TestWaitForOp
frobware Oct 24, 2018
f341d8a
Merge pull request #1341 from MaciekPytel/gke_nodegroup_balancing
k8s-ci-robot Oct 26, 2018
74e46ff
Merge pull request #1354 from bskiba/correctCapping
k8s-ci-robot Oct 26, 2018
4a672d5
Merge pull request #1356 from frobware/fix-test-flake-in-TestWaitForOp
k8s-ci-robot Oct 26, 2018
dd1478e
Modify execution_latency_seconds buckets
bskiba Oct 26, 2018
73e1a12
Merge pull request #1357 from bskiba/fix-metric-buckets
k8s-ci-robot Oct 26, 2018
f1b0850
Fix broken link to VPA Admission Webhook readme
keithlayne Oct 29, 2018
3949ff9
Merge pull request #1360 from keithlayne/patch-1
k8s-ci-robot Oct 29, 2018
e462d44
Extract Backoff interface
losipiuk Oct 26, 2018
cd6566e
Pass on-event oomInfo without creating a new goroutine.
kgolab Oct 30, 2018
c8533f4
Merge pull request #1361 from kgolab/vpa-oom-sync
k8s-ci-robot Oct 30, 2018
4754ecc
add alibaba cloud doc link
ringtail Oct 30, 2018
2ff85f8
Protect against negative totalWeight values
kgolab Oct 30, 2018
3b204e9
Merge pull request #1364 from kgolab/vpa-safe-subtract
k8s-ci-robot Oct 30, 2018
c4348d6
Use real-usage sample (exclude previous OOMs) to estimate memory usag…
kgolab Oct 30, 2018
32bff90
Merge pull request #1362 from AliyunContainerService/feature/add-doc-…
k8s-ci-robot Oct 30, 2018
63f32b9
Merge pull request #1363 from kgolab/vpa-oom-run-away
k8s-ci-robot Oct 30, 2018
55fc1e2
Store NodeGroup in ScaleUpRequest and ScaleDownRequest
losipiuk Oct 30, 2018
0e2c373
Use NodeGroup as key in Backoff
losipiuk Oct 30, 2018
730d00d
CI verify missing negation
multi-io Oct 30, 2018
38b6b33
Merge pull request #1365 from multi-io/verify-all-negation-fix
k8s-ci-robot Oct 31, 2018
3faf483
Vertical Pod Autoscaler version 0.3.0
bskiba Nov 2, 2018
81c2888
Merge pull request #1373 from bskiba/vpa-0.3
k8s-ci-robot Nov 2, 2018
6f1c28e
Fix typos: alredy -> already
mooncak Nov 4, 2018
dd42385
Merge pull request #1374 from mooncak/fix_typos
k8s-ci-robot Nov 4, 2018
e26021a
hack/verify-all.sh -v output corrected
multi-io Nov 4, 2018
7e56847
Add e2e tests for updater with different controllers.
bskiba Nov 2, 2018
f1aa190
Merge pull request #1372 from bskiba/e2e-controllers
k8s-ci-robot Nov 5, 2018
786c61f
Build VPA releases in docker
bskiba Nov 5, 2018
da6f785
Relax list of OWNERS in hack directory
losipiuk Nov 6, 2018
ab79a01
Merge pull request #1379 from losipiuk/lukaszos/relax-list-of-owners-…
k8s-ci-robot Nov 6, 2018
0c49666
Merge pull request #1375 from multi-io/verify-all-output-fix
k8s-ci-robot Nov 6, 2018
bf6ff4b
Clean up estimators
aleksandra-malinowska Jul 19, 2018
d9f804c
Mark BasicEstimator as deprecated
aleksandra-malinowska Jul 20, 2018
5338aa1
Merge pull request #1109 from aleksandra-malinowska/refactor-cp-9
k8s-ci-robot Nov 6, 2018
6febc1d
Fix formatted log messages
aleksandra-malinowska Nov 6, 2018
f7eff81
Merge pull request #1381 from aleksandra-malinowska/fix-logging
k8s-ci-robot Nov 6, 2018
1267ee7
Clean up GkeManager interface
aleksandra-malinowska Aug 14, 2018
08d2bfd
Merge pull request #1336 from aleksandra-malinowska/gce-api-metrics
k8s-ci-robot Nov 6, 2018
7cb0b12
Merge pull request #1149 from aleksandra-malinowska/refactor-cp-22
k8s-ci-robot Nov 6, 2018
0dd14af
Update godeps
losipiuk Nov 6, 2018
2a52c19
Fix usage of k8s quota API
losipiuk Nov 6, 2018
dc8ba48
Merge pull request #1385 from losipiuk/lo/update-godeps-master
k8s-ci-robot Nov 6, 2018
65297df
Update RBAC example to include replicasets in the apps apigroup
zegl Nov 6, 2018
14120a4
AWS: Improved balancing
johanneswuerbach Nov 5, 2018
70f88b7
Correct flag for node group auto-discovery
lachlancooper Nov 7, 2018
b18ef8b
Merge pull request #1388 from lachlancooper/asg-tag-fix
k8s-ci-robot Nov 7, 2018
7008fb5
Merge pull request #1380 from losipiuk/lo/backoff
k8s-ci-robot Nov 7, 2018
5aebe55
Merge pull request #1376 from bskiba/build-in-docker
k8s-ci-robot Nov 7, 2018
85222a3
Add test checking Initial and Off are handled correctly
bskiba Nov 6, 2018
f3dcbe1
Merge pull request #1383 from bskiba/e2e-modes
k8s-ci-robot Nov 8, 2018
ed2cf2a
Merge pull request #1386 from zegl/example-rbac-apps-replicasets
k8s-ci-robot Nov 8, 2018
de3f4e1
Fix typos: checkponits -> checkpoints
mooncak Nov 8, 2018
6a574bc
Merge pull request #1394 from mooncak/fix_bug
k8s-ci-robot Nov 9, 2018
cc46baa
Distinguish between eviction and restart in actuation e2e
bskiba Nov 8, 2018
6f84848
Merge pull request #1393 from bskiba/fix-initial
k8s-ci-robot Nov 9, 2018
8125495
Fix typos: reqest->request, approporiate->appropriate
mooncak Nov 10, 2018
be75d19
Merge pull request #1396 from mooncak/fix_typo_issue
k8s-ci-robot Nov 13, 2018
2e10a71
Clarify down-scale capabilities
nikopen Nov 13, 2018
5962354
Inject Backoff instance to ClusterStateRegistry on creation
losipiuk Nov 13, 2018
5956345
Merge pull request #1400 from losipiuk/lo/pass-backoff
k8s-ci-robot Nov 13, 2018
2b327f4
Merge pull request #1399 from nikopen/patch-1
k8s-ci-robot Nov 13, 2018
a110adf
fix typo: posistive -> positive
SataQiu Nov 15, 2018
d4a6664
Merge pull request #1404 from SataQiu/fix-20181115
k8s-ci-robot Nov 15, 2018
13ac1ee
Changing permissions on "update-gofmt" script
Nov 2, 2018
f1a1121
Merge pull request #1408 from Rajat-0/access
k8s-ci-robot Nov 16, 2018
3b7aa30
Add replicasets in the apps apigroup
feiskyer Nov 19, 2018
1d4f3cd
Upgrade AKS API for v19.1.1 to fix AAD bug
feiskyer Nov 19, 2018
5541e5c
Upgrade Azure compute APIs
feiskyer Nov 19, 2018
14bd6d6
Update vendors
feiskyer Nov 19, 2018
e28f776
Merge pull request #1417 from feiskyer/azure-aad-fix-master
k8s-ci-robot Nov 19, 2018
6a74213
Merge pull request #1418 from feiskyer/fix-replicasets
k8s-ci-robot Nov 19, 2018
1210fc6
Merge pull request #1378 from johanneswuerbach/improve-balancing
k8s-ci-robot Nov 19, 2018
f5b6ff6
Update cluster-autoscaler version to 1.13.0-alpha.1
losipiuk Nov 19, 2018
10bc36d
Merge pull request #1419 from losipiuk/lo/1.13-beta-1
k8s-ci-robot Nov 19, 2018
92f498b
Run feature gates based logic to fix consistency of CA and scheduler
losipiuk Nov 20, 2018
169a5e9
Merge pull request #1434 from losipiuk/lukaszos/run-feature-gates-bas…
k8s-ci-robot Nov 22, 2018
4724c03
Update godeps
losipiuk Nov 26, 2018
be1d337
Use k8s.io/klog instead github.com/golang/glog
losipiuk Nov 26, 2018
3ad98c9
Update go version used from 1.10.2 to 1.11.2 to match one used by k8s
losipiuk Nov 26, 2018
6538f87
Fix gofmt errors
losipiuk Nov 26, 2018
a293569
Merge pull request #1444 from losipiuk/cluster-autoscaler-release-1.1…
losipiuk Nov 26, 2018
68aa322
Cluster Autoscaler release 1.13.0-rc.1
losipiuk Nov 26, 2018
7d42c40
Merge pull request #1448 from losipiuk/lo/1.13rc1
k8s-ci-robot Nov 26, 2018
0bbd2c0
Update AWS EC2 instance type catalog
Nov 19, 2018
d6b164b
Report ASGs using unknown AWS EC2 instance types
Nov 19, 2018
983adba
Test the (*AwsManager).getAsgTemplate method
Nov 20, 2018
f465ada
add flags to ignore daemonsets and mirror pods when calculating resou…
awprice Nov 16, 2018
6af894b
Initialize klog
losipiuk Nov 26, 2018
ab47574
Cluster Autoscaler release 1.13.0-rc.2
losipiuk Nov 26, 2018
6ffe3e7
Merge pull request #1450 from losipiuk/lo/cherry-picks-ca-13
k8s-ci-robot Nov 27, 2018
db609e0
Cluster Autoscaler release 1.13.0
losipiuk Nov 28, 2018
a38922d
Merge pull request #1460 from losipiuk/ca-release-1.13
losipiuk Nov 28, 2018
4f81c04
Update base debian image for Cluster Autoscaler
losipiuk Dec 5, 2018
fd90f4c
Merge pull request #1480 from losipiuk/lo/use-base-debain-0.4.0-1.13
k8s-ci-robot Dec 5, 2018
6cb1f48
Update Cluster Autoscaler version to 1.13.1
losipiuk Dec 7, 2018
6402c46
Merge pull request #1486 from losipiuk/lo/ca-1.13.1
k8s-ci-robot Dec 7, 2018
0020401
Update Debian base image version to 0.4.0 in Makefile
losipiuk Dec 7, 2018
5e62f60
Merge pull request #1487 from losipiuk/lo/debian-image-0.4-makefile-1.13
k8s-ci-robot Dec 7, 2018
a569471
Keep one place where default base image for Cluster Austoscaler is de…
losipiuk Dec 7, 2018
e9e2944
Merge pull request #1489 from losipiuk/lo/base-image-cleanup-1.13
losipiuk Dec 7, 2018
8197ea1
Cherry-pick of #1485: Fix aws flaking Unit Tests in 1.13
Jeffwan Dec 7, 2018
a764da8
Merge pull request #1493 from Jeffwan/aws-flakinng-ut-fix-1.13
k8s-ci-robot Dec 10, 2018
a3b0f6d
Add cache for resource IDs of vmss instances
feiskyer Dec 11, 2018
c8e97b7
Merge pull request #1523 from feiskyer/1.13-fix
k8s-ci-robot Dec 21, 2018
836a7c7
Cherry-pick of #1550: Pass nodeGroup->NodeInfo map to ClusterStateReg…
losipiuk Jan 2, 2019
2e6319d
Cherry-pick of #1643: Account for kernel reserved memory in capacity …
jkaniuk Jan 30, 2019
c7e9815
Cherry-pick of #1643: Cache exemplar ready node for each node group
jkaniuk Feb 1, 2019
c70f0d9
Merge pull request #1695 from jkaniuk/capacity-prediction-1.13
k8s-ci-robot Feb 15, 2019
a9520b8
Fix windows name parsing for Azure VMAS nodes
feiskyer Feb 1, 2019
04a5f1d
Fix error message for long-waiting operations
feiskyer Feb 19, 2019
539e27c
Update overrides for go-autorest
feiskyer Feb 21, 2019
2b8d82a
no_vendor
feiskyer Feb 26, 2019
9d6e632
Update godeps
feiskyer Feb 26, 2019
de41d6f
no_vendor
feiskyer Feb 26, 2019
87c934e
Update godeps
feiskyer Feb 26, 2019
449c654
Merge pull request #1731 from feiskyer/cluster-autoscaler-release-1.13
k8s-ci-robot Feb 26, 2019
fba8523
Add GetInstanceID interface for cloudprovider
feiskyer Feb 27, 2019
b2023c2
Implement GetInstanceID for Azure and make instanceID to lower cases
feiskyer Feb 27, 2019
5953c8e
Implement GetInstanceID for other cloud providers
feiskyer Feb 27, 2019
c85fad7
Use cloudProvider.GetInstanceID() to get unregistered nodes
feiskyer Feb 27, 2019
fe369e0
Convert virtualMachineRE to lower cases
feiskyer Mar 6, 2019
d516464
Merge pull request #1755 from feiskyer/cluster-autoscaler-release-1.13
k8s-ci-robot Mar 6, 2019
837a8fd
Update debian-base image to 0.4.1
losipiuk Mar 6, 2019
6778c6e
Merge pull request #1762 from losipiuk/lo/ca-1.13-debian-base-0.4.1
k8s-ci-robot Mar 6, 2019
b517199
Cluster Autoscaler 1.13.2
losipiuk Mar 7, 2019
dc15278
Merge pull request #1766 from losipiuk/lo/ca-1.13.2
k8s-ci-robot Mar 7, 2019
8e34d4c
update image version to 1.13.2
ismailyenigul Mar 13, 2019
25812b7
Update cluster-autoscaler-multi-asg.yaml
ismailyenigul Mar 13, 2019
5bd0931
Update cluster-autoscaler-one-asg.yaml
ismailyenigul Mar 13, 2019
584b859
update image version to 1.13.2
ismailyenigul Mar 13, 2019
e9a81cf
Merge pull request #1789 from ismailyenigul/cluster-autoscaler-releas…
k8s-ci-robot Mar 13, 2019
fa9e90a
Use debian-base-amd64:v1.0.0
losipiuk Mar 25, 2019
d54edf1
Merge pull request #1831 from losipiuk/lo/cluster-autoscaler-release-…
k8s-ci-robot Mar 26, 2019
24a7afa
Rebase to upstream d54edf1888
smarterclayton Mar 29, 2019
241bfe4
UPSTREAM: <carry>: openshift: Add dockerfile for cluster autoscaler.
Apr 18, 2018
f225c0b
UPSTREAM: <carry>: openshift: Add openshift/release Makefile and hack…
ingvagabund Apr 20, 2018
a20965b
UPSTREAM: <carry>: openshift: Fix the spec and hack scripts so the pa…
ingvagabund Apr 24, 2018
8508469
UPSTREAM: <carry>: openshift: Bump embedded tools
smarterclayton Jun 7, 2018
348862b
UPSTREAM: <carry>: openshift: Fix spec file to be consistent
smarterclayton Jun 9, 2018
d82dea5
UPSTREAM: <carry>: openshift: cluster-autoscaler.spec: bump golang_ve…
frobware Oct 25, 2018
238ec00
UPSTREAM: <carry>: openshift: cluster-autoscaler.spec: set golang_ver…
frobware Oct 26, 2018
ea9aaf0
UPSTREAM: <carry>: openshift: vendor sigs.k8s.io/cluster-api
frobware Oct 25, 2018
9618d9f
UPSTREAM: <carry>: openshift: initial cluster-api provider implementa…
frobware Oct 25, 2018
60f8dad
UPSTREAM: <carry>: openshift: Add a RHEL7 dockerfile and standarize f…
smarterclayton Nov 11, 2018
0385c0e
UPSTREAM: <carry>: openshift: cluster-api switch to annotations
frobware Nov 16, 2018
db3cae4
UPSTREAM: <carry>: openshift: vendor sigs.k8s.io/cluster-api/pkg/clie…
frobware Nov 25, 2018
149a60e
UPSTREAM: <carry>: openshift: Switch to informers for observing machi…
frobware Oct 30, 2018
db0361a
UPSTREAM: <carry>: openshift: Move min/max constants to clusterapi_ut…
frobware Nov 27, 2018
5cee8b1
UPSTREAM: <carry>: openshift: Additional logging in provider.NodeGrou…
frobware Nov 27, 2018
79f3b79
UPSTREAM: <carry>: openshift: Rename clusterController to machineCont…
frobware Nov 27, 2018
9001b91
UPSTREAM: <carry>: openshift: Use NamespaceAll in lieu of ""
frobware Nov 27, 2018
cfb3332
UPSTREAM: <carry>: openshift: Rename field provider.clusterapi to clu…
frobware Nov 27, 2018
98adfe0
UPSTREAM: <carry>: openshift: gce: increase test timeout in TestWaitF…
frobware Nov 27, 2018
cacefdb
UPSTREAM: <carry>: openshift: Use 'machine' for machine parameter names
frobware Nov 27, 2018
d8ee980
UPSTREAM: <carry>: openshift: Decouple nodegroup via dependency injec…
frobware Nov 27, 2018
e75c6b8
UPSTREAM: <carry>: openshift: handle nil nodeGroup in calculateScaleD…
frobware Dec 1, 2018
1d4fe22
UPSTREAM: <carry>: openshift: Use node.Spec.ProviderID instead of nod…
frobware Nov 30, 2018
30d4bdb
UPSTREAM: <carry>: openshift: fix calculation of max cluster size
frobware Dec 18, 2018
40e503c
UPSTREAM: <carry>: openshift: utils: add unit tests for clusterapi_ut…
frobware Dec 8, 2018
3b77fa7
UPSTREAM: <carry>: openshift: Remove obsolete usage of "machine" anno…
frobware Jan 7, 2019
976f127
UPSTREAM: <carry>: openshift: vendor sigs.k8s.io/cluster-api/pkg/clie…
frobware Jan 10, 2019
d7666c0
UPSTREAM: <carry>: openshift: Add unit test for findMachine()
frobware Jan 10, 2019
97257b9
UPSTREAM: <carry>: openshift: Add unit test for findNodeByNodeName()
frobware Jan 10, 2019
008bc6c
UPSTREAM: <carry>: openshift: Add unit test for findMachineOwner()
frobware Jan 10, 2019
16794c7
UPSTREAM: <carry>: openshift: Add unit test for findMachineByNodeProv…
frobware Jan 10, 2019
8ee33bb
UPSTREAM: <carry>: openshift: Add unit test for MachinesInMachineSet()
frobware Jan 10, 2019
a74456e
UPSTREAM: <carry>: openshift: Remove machineController.MachineSets() …
frobware Jan 10, 2019
d0fc2bd
UPSTREAM: <carry>: openshift: Rename ProviderConfig to ProviderSpec
ingvagabund Jan 11, 2019
5e0f093
UPSTREAM: <carry>: openshift: tests: copy autoscaler e2e test
paulfantom Jan 18, 2019
f0ac98a
UPSTREAM: <carry>: openshift: tests: add dependencies
paulfantom Jan 23, 2019
e880421
UPSTREAM: <carry>: openshift: assign ownership to cloud team
paulfantom Jan 24, 2019
f877a78
UPSTREAM: <carry>: openshift: cloudprovider: pivot to machine.openshi…
frobware Jan 29, 2019
fd106cb
UPSTREAM: <carry>: openshift: test/openshift/e2e: pivot to machine.op…
frobware Feb 4, 2019
5bd3dff
UPSTREAM: <carry>: openshift: test/openshift/e2e: vendor github.com/o…
frobware Feb 4, 2019
79275e3
UPSTREAM: <carry>: openshift: test/openshift/e2e: fix formatting dire…
frobware Feb 4, 2019
95f27ca
UPSTREAM: <carry>: openshift: test/openshift/e2e: correct APIVersion …
frobware Feb 7, 2019
3524b3f
UPSTREAM: <carry>: openshift: test/openshift/e2e: really correct API …
frobware Feb 7, 2019
82206fa
UPSTREAM: <carry>: openshift: Scope err variables in calls to PollImm…
frobware Feb 8, 2019
2635d08
UPSTREAM: <carry>: openshift: Rework test to also scale down
frobware Feb 8, 2019
c999764
UPSTREAM: <carry>: openshift: Cap MaxReplicas to 2
frobware Feb 8, 2019
eaac1dd
UPSTREAM: <carry>: openshift: test/openshift/e2e: switch namespace to…
frobware Feb 8, 2019
77cc805
UPSTREAM: <carry>: openshift: Add newWorkLoad() helper function
frobware Feb 11, 2019
02e7379
UPSTREAM: <carry>: openshift: Setup signal handler for test cleanup
frobware Feb 11, 2019
77dfcd9
UPSTREAM: <carry>: openshift: vendor: sigs.k8s.io/controller-runtime/…
frobware Feb 11, 2019
bdc7694
UPSTREAM: <carry>: openshift: Add MachineDeployment informers to cont…
frobware Feb 9, 2019
8db3b9c
UPSTREAM: <carry>: openshift: Revert to whitebox testing
frobware Feb 9, 2019
bf75df6
UPSTREAM: <carry>: openshift: Add interface ScalableResource
frobware Feb 9, 2019
7177332
UPSTREAM: <carry>: openshift: MachineSet implementation of ScalableRe…
frobware Feb 9, 2019
b653560
UPSTREAM: <carry>: openshift: MachineDeployment implementation of Sca…
frobware Feb 9, 2019
a6ddbf3
UPSTREAM: <carry>: openshift: add unit test for utils
frobware Feb 9, 2019
a925940
UPSTREAM: <carry>: openshift: add unit test for controller
frobware Feb 9, 2019
252b8b5
UPSTREAM: <carry>: openshift: add unit test for provider
frobware Feb 9, 2019
c0f8bc9
UPSTREAM: <carry>: openshift: add unit test for nodegroup
frobware Feb 9, 2019
3d46d33
UPSTREAM: <carry>: openshift: openshiftmachineapi: add feature gate f…
frobware Feb 11, 2019
fdd32f7
UPSTREAM: <carry>: openshift: address review comments
frobware Feb 13, 2019
cc28758
UPSTREAM: <carry>: openshift: log nodegroup discovery at level 4
frobware Feb 14, 2019
eebf088
UPSTREAM: <carry>: openshift: test/openshift/e2e: bump dependencies
frobware Feb 14, 2019
858cae9
UPSTREAM: <carry>: openshift: validate node membership in DeleteNodes()
frobware Feb 12, 2019
d49603d
UPSTREAM: <carry>: openshift: test/openshift/e2e: don't modify replic…
frobware Feb 14, 2019
047a5bc
UPSTREAM: <carry>: openshift: create utility functions
frobware Feb 22, 2019
fb5c757
UPSTREAM: <carry>: openshift: return no nodegroup when scaling bounds…
frobware Feb 22, 2019
8f587cc
UPSTREAM: <carry>: openshift: Remove old test functions
frobware Feb 22, 2019
2691377
UPSTREAM: <carry>: openshift: openshiftmachineapi: remove unused fields
frobware Feb 23, 2019
2c4ad1d
UPSTREAM: <carry>: openshift: test/openshift: Revendor for cluster-ap…
enxebre Feb 28, 2019
3fff425
UPSTREAM: <carry>: openshift: test/openshift: Run e2e ginkgo suite
enxebre Feb 28, 2019
bfa8399
UPSTREAM: <carry>: openshift: fix max cluster size calculation on sca…
frobware Mar 7, 2019
a0e62c6
UPSTREAM: <carry>: openshift: test/openshift/e2e: add Autoscaler focu…
frobware Mar 11, 2019
5583e92
UPSTREAM: <carry>: openshift: test/openshift: go get dep if it doesn'…
frobware Mar 12, 2019
f0ed959
UPSTREAM: <carry>: openshift: test/openshift/Makefile: add rule to bu…
frobware Mar 12, 2019
6328132
UPSTREAM: <carry>: openshift: bump cluster-api-actuator-pkg
frobware Mar 12, 2019
2ba7b37
UPSTREAM: <carry>: openshift: remove TODO
frobware Mar 13, 2019
bfe94f7
UPSTREAM: <carry>: openshift: Rework TestNodeGroupNewNodeGroup
frobware Mar 14, 2019
047f3ed
UPSTREAM: <carry>: openshift: Rework TestNodeGroupResize
frobware Mar 15, 2019
6130938
UPSTREAM: <carry>: openshift: Rework TestNodeGroupDeleteNodes
frobware Mar 16, 2019
040f1fb
UPSTREAM: <carry>: openshift: Rework TestControllerNodeGroups
frobware Mar 17, 2019
05bdca0
UPSTREAM: <carry>: openshift: Rework TestControllerFindMachineByID
frobware Mar 17, 2019
8ed019f
UPSTREAM: <carry>: openshift: Rework utils test funcs
frobware Mar 17, 2019
e5639f7
UPSTREAM: <carry>: openshift: Rework TestControllerNodeGroupForNodeLo…
frobware Mar 17, 2019
3938685
UPSTREAM: <carry>: openshift: remove t.Helper() in test helpers
frobware Mar 20, 2019
73bf7dc
UPSTREAM: <carry>: openshift: remove unused makeMachineOwner()
frobware Mar 20, 2019
f12e696
UPSTREAM: <carry>: openshift: force test namespace ToLower()
frobware Mar 20, 2019
9918b49
UPSTREAM: <carry>: openshift: add spec to clusterTestConfig
frobware Mar 20, 2019
cccae89
UPSTREAM: <carry>: openshift: simplify TestProviderConstructorProperties
frobware Mar 20, 2019
6a11603
UPSTREAM: <carry>: openshift: remove parallel tests
frobware Mar 20, 2019
691b11a
UPSTREAM: <carry>: openshift: move all test utility functions
frobware Mar 20, 2019
4f16b03
UPSTREAM: <carry>: openshift: check for explicit errors in resize tests
frobware Mar 20, 2019
f627421
UPSTREAM: <carry>: openshift: create git history verification script
paulfantom Mar 21, 2019
03109e7
UPSTREAM: <carry>: openshift: add unit test TestNodeGroupIncreaseSize
frobware Mar 26, 2019
3df1ca4
UPSTREAM: <carry>: openshift: add unit test TestNodeGroupDecreaseTarg…
frobware Mar 26, 2019
3a74d1f
UPSTREAM: <carry>: openshift: cloudprovider updates for 1.13
frobware Mar 27, 2019
2cf67fd
UPSTREAM: <carry>: openshift: cloudprovider builder updates for 1.13
frobware Mar 27, 2019
3ffad39
UPSTREAM: <carry>: openshift: expose KubeConfigPath in CA options
frobware Mar 27, 2019
1 change: 0 additions & 1 deletion .release

This file was deleted.

2 changes: 1 addition & 1 deletion .travis.yml
@@ -6,7 +6,7 @@ services:
language: go

go:
-- 1.10.2
+- 1.11.2

before_install:
- sudo apt-get install libseccomp-dev -qq
14 changes: 1 addition & 13 deletions Makefile
@@ -114,23 +114,11 @@ build-rpms:
.PHONY: build-rpms

# Build images from the official RPMs
#
#
# Args:
#
# Example:
# make build-images
build-images: build-rpms
	hack/build-images.sh
.PHONY: build-images

.PHONY: lint
lint: ## Go lint your code
	hack/go-lint.sh -min_confidence 0.9 ./cluster-autoscaler/cloudprovider/openshiftmachineapi/...

.PHONY: fmt
fmt: ## Go fmt your code
	hack/go-fmt.sh ./cluster-autoscaler/cloudprovider/openshiftmachineapi

.PHONY: vet
vet: ## Go vet your code
	hack/go-vet.sh ./cluster-autoscaler/cloudprovider/openshiftmachineapi
2 changes: 1 addition & 1 deletion addon-resizer/Godeps/Godeps.json

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion addon-resizer/README.md
@@ -116,7 +116,7 @@ name of your addon, for example:
```
ADDON_NAME=heapster
```

-Currently Addon Resizer is used to scale addons: `heapster`, `metrics-server`.
+Currently Addon Resizer is used to scale addons: `heapster`, `metrics-server`, `kube-state-metrics`.

### Overview

2 changes: 1 addition & 1 deletion builder/Dockerfile
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

-FROM golang:1.10.2
+FROM golang:1.11.2
LABEL maintainer="Marcin Wielgus <[email protected]>"

ENV GOPATH /gopath/
4 changes: 2 additions & 2 deletions cluster-autoscaler/Dockerfile
@@ -12,12 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.

-ARG BASEIMAGE=k8s.gcr.io/debian-base-amd64:0.3.2
+ARG BASEIMAGE=k8s.gcr.io/debian-base-amd64:v1.0.0
FROM $BASEIMAGE
LABEL maintainer="Marcin Wielgus <[email protected]>"

ENV DEBIAN_FRONTEND noninteractive
-RUN clean-install ca-certificates
+RUN clean-install ca-certificates tzdata

ADD cluster-autoscaler cluster-autoscaler
ADD run.sh run.sh
95 changes: 84 additions & 11 deletions cluster-autoscaler/FAQ.md
@@ -41,6 +41,7 @@ this document:
* [How fast is HPA when combined with CA?](#how-fast-is-hpa-when-combined-with-ca)
* [Where can I find the designs of the upcoming features?](#where-can-i-find-the-designs-of-the-upcoming-features)
* [What are Expanders?](#what-are-expanders)
+* [What are the parameters to CA?](#what-are-the-parameters-to-ca)
* [Troubleshooting](#troubleshooting)
* [I have a couple of nodes with low utilization, but they are not scaled down. Why?](#i-have-a-couple-of-nodes-with-low-utilization-but-they-are-not-scaled-down-why)
* [How to set PDBs to enable CA to move kube-system pods?](#how-to-set-pdbs-to-enable-ca-to-move-kube-system-pods)
@@ -309,14 +310,14 @@ to scale up the cluster.

The size of overprovisioned resources can be controlled by changing the size of pause pods and the
number of replicas. This way you can configure static size of overprovisioning resources (i.e. 2
-additional cores). If we want to configure dynamic size (i.e. 20% of recources in the cluster)
+additional cores). If we want to configure dynamic size (i.e. 20% of resources in the cluster)
then we need to use [Horizontal Cluster Proportional Autoscaler](https://github.com/kubernetes-incubator/cluster-proportional-autoscaler)
which will change number of pause pods depending on the size of the cluster. It will increase the
number of replicas when cluster grows and decrease the number of replicas if cluster shrinks.

Configuration of dynamic overprovisioning:

-1. Enable priority preemption in your cluster. It can be done by exporting following env
+1. (For 1.10, and below) Enable priority preemption in your cluster. It can be done by exporting following env
variables before executing kube-up (more details [here](https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/)):
```sh
export KUBE_RUNTIME_CONFIG=scheduling.k8s.io/v1alpha1=true
export ENABLE_POD_PRIORITY=true
```
2. Define priority class for overprovisioning pods. Priority -1 will be reserved for
overprovisioning pods as it is the lowest priority that triggers scaling clusters. Other pods need
to use priority 0 or higher in order to be able to preempt overprovisioning pods. You can use
-following definitions:
+following definitions.

**For 1.10, and below:**

```yaml
apiVersion: scheduling.k8s.io/v1alpha1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1
globalDefault: false
description: "Priority class used by overprovisioning."
```

**For 1.11:**

```yaml
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1
globalDefault: false
description: "Priority class used by overprovisioning."
```

3. Change pod priority cutoff in CA to -10 so pause pods are taken into account during scale down
and scale up. Set flag ```expendable-pods-priority-cutoff``` to -10. If you already use priority
preemption then pods with priorities between -10 and -1 won't be best effort anymore.
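
As a concrete illustration of the setup above, the overprovisioning workload itself is typically a Deployment of pause pods that requests the capacity you want to keep in reserve. The following is a minimal sketch, not taken from this repository: the replica count, image, and CPU request are illustrative, and it assumes the `overprovisioning` PriorityClass defined in step 2.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2                 # illustrative; cluster-proportional-autoscaler can manage this dynamically
  selector:
    matchLabels:
      run: overprovisioning
  template:
    metadata:
      labels:
        run: overprovisioning
    spec:
      priorityClassName: overprovisioning   # the PriorityClass from step 2
      containers:
      - name: reserve-resources
        image: k8s.gcr.io/pause
        resources:
          requests:
            cpu: "1"          # each replica holds one core of headroom
```

When a real pod with priority 0 or higher becomes unschedulable, the scheduler preempts these pause pods, and CA then scales the cluster up so the evicted pause pods fit again.
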
@@ -442,14 +457,12 @@ Autoscaler expects requested nodes to appear within 15 minutes
(configured by `--max-node-provision-time` flag.) After this time, if they are
still unregistered, it stops considering them in simulations and may attempt to scale up a
different group if the pods are still pending. It will also attempt to remove
-any nodes left unregistered after 15 minutes (configured by
-`--unregistered-node-removal-time` flag.) For this reason, we strongly
-recommend to set those flags to the same value.
+any nodes left unregistered after this time.

### How does scale-down work?

Every 10 seconds (configurable by `--scan-interval` flag), if no scale-up is
-needed, Cluster Autoscaler checks which nodes are unneeded. A node is considered for removal when:
+needed, Cluster Autoscaler checks which nodes are unneeded. A node is considered for removal when **all** below conditions hold:

* The sum of cpu and memory requests of all pods running on this node is smaller
than 50% of the node's allocatable. (Before 1.1.0, node capacity was used
@@ -515,7 +528,13 @@ then this node group may be excluded from future scale-ups.

### How fast is Cluster Autoscaler?

-By default, scale-up is considered up to 10 seconds after pod is marked as unschedulable, and scale-down 10 minutes after a node becomes unneeded. There are multiple flags which can be used to configure them. Assuming default settings, [SLOs described here apply](#what-are-the-service-level-objectives-for-cluster-autoscaler).
+By default, scale-up is considered up to 10 seconds after pod is marked as unschedulable, and scale-down 10 minutes after a node becomes unneeded.
+There are multiple flags which can be used to configure these thresholds. For example, in some environments, you may wish to give the k8s scheduler
+a bit more time to schedule a pod than the CA's scan-interval. One way to do this is by setting `--new-pod-scale-up-delay`, which causes the CA to
+ignore unschedulable pods until they are a certain "age", regardless of the scan-interval. If k8s has not scheduled them by the end of that delay,
+then they may be considered by the CA for a possible scale-up.
+
+Assuming default settings, [SLOs described here apply](#what-are-the-service-level-objectives-for-cluster-autoscaler).
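
As a sketch of how these knobs are wired up (a fragment only; the image tag and values are illustrative, and the flag names come from the paragraph above and the parameters table below), the timing flags are passed as arguments to the cluster-autoscaler container:

```yaml
# Fragment of a Cluster Autoscaler Deployment spec; image tag and values are illustrative.
containers:
- name: cluster-autoscaler
  image: k8s.gcr.io/cluster-autoscaler:v1.13.2
  command:
  - ./cluster-autoscaler
  - --scan-interval=10s             # how often the cluster is re-evaluated
  - --new-pod-scale-up-delay=30s    # ignore unschedulable pods younger than 30s
  - --scale-down-unneeded-time=10m  # how long a node must be unneeded before removal
```
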

### How fast is HPA when combined with CA?

@@ -585,6 +604,58 @@ would match the cluster size. This expander is described in more details

************

### What are the parameters to CA?

The following startup parameters are supported for cluster autoscaler:

| Parameter | Description | Default |
| --- | --- | --- |
| `cluster-name` | Autoscaled cluster name, if available | ""
| `address` | The address to expose prometheus metrics | :8085
| `kubernetes` | Kubernetes master location. Leave blank for default | ""
| `kubeconfig` | Path to kubeconfig file with authorization and master location information | ""
| `cloud-config` | The path to the cloud provider configuration file. Empty string for no configuration file | ""
| `namespace` | Namespace in which cluster-autoscaler runs | "kube-system"
| `scale-down-enabled` | Should CA scale down the cluster | true
| `scale-down-delay-after-add` | How long after scale up that scale down evaluation resumes | 10 minutes
| `scale-down-delay-after-delete` | How long after node deletion that scale down evaluation resumes, defaults to scan-interval | scan-interval
| `scale-down-delay-after-failure` | How long after scale down failure that scale down evaluation resumes | 3 minutes
| `scale-down-unneeded-time` | How long a node should be unneeded before it is eligible for scale down | 10 minutes
| `scale-down-unready-time` | How long an unready node should be unneeded before it is eligible for scale down | 20 minutes
| `scale-down-utilization-threshold` | Node utilization level, defined as sum of requested resources divided by capacity, below which a node can be considered for scale down | 0.5
| `scale-down-non-empty-candidates-count` | Maximum number of non empty nodes considered in one iteration as candidates for scale down with drain<br>Lower value means better CA responsiveness but possible slower scale down latency<br>Higher value can affect CA performance with big clusters (hundreds of nodes)<br>Set to non positive value to turn this heuristic off - CA will not limit the number of nodes it considers. | 30
| `scale-down-candidates-pool-ratio` | A ratio of nodes that are considered as additional non empty candidates for<br>scale down when some candidates from previous iteration are no longer valid<br>Lower value means better CA responsiveness but possible slower scale down latency<br>Higher value can affect CA performance with big clusters (hundreds of nodes)<br>Set to 1.0 to turn this heuristic off - CA will take all nodes as additional candidates. | 0.1
| `scale-down-candidates-pool-min-count` | Minimum number of nodes that are considered as additional non empty candidates<br>for scale down when some candidates from previous iteration are no longer valid.<br>When calculating the pool size for additional candidates we take<br>`max(#nodes * scale-down-candidates-pool-ratio, scale-down-candidates-pool-min-count)` | 50
| `scan-interval` | How often cluster is reevaluated for scale up or down | 10 seconds
| `max-nodes-total` | Maximum number of nodes in all node groups. Cluster autoscaler will not grow the cluster beyond this number. | 0
| `cores-total` | Minimum and maximum number of cores in cluster, in the format <min>:<max>. Cluster autoscaler will not scale the cluster beyond these numbers. | 320000
| `memory-total` | Minimum and maximum number of gigabytes of memory in cluster, in the format <min>:<max>. Cluster autoscaler will not scale the cluster beyond these numbers. | 6400000
| `gpu-total` | Minimum and maximum number of different GPUs in cluster, in the format <gpu_type>:<min>:<max>. Cluster autoscaler will not scale the cluster beyond these numbers. Can be passed multiple times. CURRENTLY THIS FLAG ONLY WORKS ON GKE. | ""
| `cloud-provider` | Cloud provider type. | gce
| `max-empty-bulk-delete` | Maximum number of empty nodes that can be deleted at the same time. | 10
| `max-graceful-termination-sec` | Maximum number of seconds CA waits for pod termination when trying to scale down a node. | 600
| `max-total-unready-percentage` | Maximum percentage of unready nodes in the cluster. After this is exceeded, CA halts operations | 45
| `ok-total-unready-count` | Number of allowed unready nodes, irrespective of max-total-unready-percentage | 3
| `max-node-provision-time` | Maximum time CA waits for node to be provisioned | 15 minutes
| `nodes` | sets min,max size and other configuration data for a node group in a format accepted by cloud provider. Can be used multiple times. Format: <min>:<max>:<other...> | ""
| `node-group-auto-discovery` | One or more definition(s) of node group auto-discovery.<br>A definition is expressed `<name of discoverer>:[<key>[=<value>]]`<br>The `aws` and `gce` cloud providers are currently supported. AWS matches by ASG tags, e.g. `asg:tag=tagKey,anotherTagKey`<br>GCE matches by IG name prefix, and requires you to specify min and max nodes per IG, e.g. `mig:namePrefix=pfx,min=0,max=10`<br>Can be used multiple times | ""
| `estimator` | Type of resource estimator to be used in scale up | binpacking
| `expander` | Type of node group expander to be used in scale up. | random
| `write-status-configmap` | Should CA write status information to a configmap | true
| `max-inactivity` | Maximum time from last recorded autoscaler activity before automatic restart | 10 minutes
| `max-failing-time` | Maximum time from last recorded successful autoscaler run before automatic restart | 15 minutes
| `balance-similar-node-groups` | Detect similar node groups and balance the number of nodes between them | false
| `node-autoprovisioning-enabled` | Should CA autoprovision node groups when needed | false
| `max-autoprovisioned-node-group-count` | The maximum number of autoprovisioned groups in the cluster | 15
| `unremovable-node-recheck-timeout` | The timeout before we check again a node that couldn't be removed before | 5 minutes
| `expendable-pods-priority-cutoff` | Pods with priority below cutoff will be expendable. They can be killed without any consideration during scale down and they don't cause scale up. Pods with null priority (PodPriority disabled) are non expendable | 0
| `regional` | Cluster is regional | false
| `leader-elect` | Start a leader election client and gain leadership before executing the main loop.<br>Enable this when running replicated components for high availability | true
| `leader-elect-lease-duration` | The duration that non-leader candidates will wait after observing a leadership<br>renewal until attempting to acquire leadership of a led but unrenewed leader slot.<br>This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate.<br>This is only applicable if leader election is enabled | 15 seconds
| `leader-elect-renew-deadline` | The interval between attempts by the acting master to renew a leadership slot before it stops leading.<br>This must be less than or equal to the lease duration.<br>This is only applicable if leader election is enabled | 10 seconds
| `leader-elect-retry-period` | The duration the clients should wait between attempting acquisition and renewal of a leadership.<br>This is only applicable if leader election is enabled | 2 seconds
| `leader-elect-resource-lock` | The type of resource object that is used for locking during leader election.<br>Supported options are `endpoints` (default) and `configmaps` | "endpoints"
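
To make the `<min>:<max>` formats above concrete, here is a sketch of how the resource-limit and node-group flags compose; the bounds, group name, and discovery tag are hypothetical:

```yaml
# Illustrative args fragment for the cluster-autoscaler container.
- --cloud-provider=aws
- --cores-total=8:320         # cluster-wide CPU bounds, <min>:<max>
- --memory-total=32:2048      # cluster-wide memory bounds (GB), <min>:<max>
- --nodes=1:10:my-asg         # per node group, <min>:<max>:<name>
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled
```
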

# Troubleshooting:

### I have a couple of nodes with low utilization, but they are not scaled down. Why?
@@ -655,10 +726,12 @@ Events:
```
Warning FailedScheduling .. default-scheduler No nodes are available that match all of the following predicates:: Insufficient cpu (4), NoVolumeZoneConflict (2)
```

-This limitation will go away with
+This limitation was solved with
[volume topological scheduling](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/volume-topology-scheduling.md)
-support in Kubernetes. Currently, we advice to set CA upper limits in a way to
-allow for some slack capacity.
+introduced as beta in Kubernetes 1.11 and planned for GA in 1.13.
+To allow CA to take advantage of topological scheduling, use separate node groups per zone.
+This way CA knows exactly which node group will create nodes in the required zone rather than relying on the cloud provider choosing a zone for a new node in a multi-zone node group.
+When using separate node groups per zone, the `--balance-similar-node-groups` flag will keep nodes balanced across zones for workloads that don't require topological scheduling.
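
A sketch of what separate per-zone node groups look like in practice (the group names and bounds are hypothetical):

```yaml
# One node group per zone, plus balancing, as illustrative cluster-autoscaler args.
- --balance-similar-node-groups=true
- --nodes=1:10:workers-us-east-1a
- --nodes=1:10:workers-us-east-1b
- --nodes=1:10:workers-us-east-1c
```
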

### CA doesn’t work, but it used to work yesterday. Why?
