Merge upstream/cluster-autoscaler-release 1.13 #76

frobware · 2019-03-25T15:30:56Z

This PR was created by first taking upstream/cluster-autoscaler-release-1.13 as the base, then applying the UPSTREAM: <carry> patches on top. The set of patches applied was taken from:

$ git log --no-merges --format=oneline  upstream/cluster-autoscaler-release-1.13..openshift/master

To create the merge commit I have used the following steps:

$ git remote update
$ git checkout upstream/cluster-autoscaler-release-1.13
$ git checkout -b merge
$ git checkout openshift/master
$ echo 'merge upstream/cluster-autoscaler-release-1.13' | git commit-tree merge^{tree} -p HEAD -p merge
deadbeef12345678
$ git checkout deadbeef12345678
$ git cherry-pick ...all-the-things...

For details on the merge^{tree} syntax please read this documentation.

The result of git commit-tree in this PR is 810bb14.
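As a hedged illustration of the steps above, the following sketch replays them in a throwaway repository. The branch names `upstream` and `downstream`, the file names, and the commit messages are stand-ins invented for the example, not the real refs; only the shape of the `git commit-tree` invocation mirrors the command used for this PR.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is a stand-in, not a real ref.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

# Common ancestor, then rename the branch so it plays the role of openshift/master.
echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch -m downstream
git branch upstream

# A carry commit that exists only downstream.
echo carry > carry.txt
git add carry.txt
git commit -q -m "UPSTREAM: <carry>: openshift: example carry"
carry=$(git rev-parse HEAD)

# Upstream moves on independently (stand-in for the release branch).
git checkout -q upstream
echo upstream-change > file.txt
git commit -q -a -m "upstream change"

# Build a merge commit whose tree is exactly upstream's tree and whose
# parents are downstream (first) and upstream (second).
git checkout -q downstream
merge=$(echo 'merge upstream' | git commit-tree 'upstream^{tree}' -p HEAD -p upstream)

# Re-apply the carry commit on top of the synthetic merge (detached HEAD).
git checkout -q "$merge"
git cherry-pick "$carry" >/dev/null

git log --graph --format='%h %s' HEAD
```

Because the merge commit reuses upstream's tree verbatim, nothing from downstream survives in it until the carry commits are cherry-picked back on top, which is why the carries must be reapplied explicitly.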

I then applied all the carry commits on top. Additional changes were required to the openshiftmachineapi cloud provider to make it build again. These changes are captured in commits d4ecf32, 07edb0b and d175e89, which should be the focus of any review, as all the other carry commits come from master.

This PR also revendors openshift/cluster-api based on openshift/cluster-api#20.

As we have not been 100% consistent with the naming of our carry commits, I have added the UPSTREAM: <carry>: openshift prefix to those commits that did not follow this convention. This will make it easier to identify carry commits for the next rebase.
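One possible way to spot commits that still lack the prefix is sketched below in a scratch repository. The branch names, file names, and commit subjects are made up for the example, and `--invert-grep` assumes Git 2.4 or newer.

```shell
#!/bin/sh
set -eu

# Scratch repository; branch names and commits are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch -m downstream
git branch upstream

# Two downstream-only commits: one follows the convention, one does not.
echo a > a.txt
git add a.txt
git commit -q -m "UPSTREAM: <carry>: openshift: add a"
echo b > b.txt
git add b.txt
git commit -q -m "add b without the prefix"

prefix='^UPSTREAM: <carry>: openshift'

# Downstream-only commits whose message matches the prefix.
tagged=$(git log --no-merges --format='%s' --grep="$prefix" upstream..downstream)

# Downstream-only commits whose message does not match: rename candidates.
untagged=$(git log --no-merges --format='%s' --invert-grep --grep="$prefix" upstream..downstream)

printf 'tagged:\n%s\nuntagged:\n%s\n' "$tagged" "$untagged"
```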

On local hardware I have not seen this test fail using the current 50ms timeout. On AWS/CI I see it fail occasionally; I built this test, copied it to an AWS node and ran it repeatedly for ~1 hour. The min was 5ms and the max was 268ms, so I am bumping the timeout to 500ms. Signed-off-by: Andrew McDermott <[email protected]>

frobware · 2019-03-28T08:44:44Z

> Old version had "Failed to create %q cloud provider" in the logging string. Might be helpful in this file as well.

I don't see this in (openshift/master) machineapi_provider.go, but I do see it here:

$ ag "Failed to create %q cloud provider" 
cloudprovider/builder/cloud_provider_builder.go
230:            glog.Fatalf("Failed to create %q cloud provider: %v", name, err)

And it is true that it no longer exists in upstream/cluster-autoscaler-release-1.13.

aim@spicy:~/go-projects/autoscaler-merge/src/k8s.io/autoscaler
$ ag "Failed to create %q cloud provider"

As cloud_provider_builder.go is upstream code I'm inclined to leave it as is.
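A hedged alternative to switching checkouts: `git grep` accepts a tree-ish, so both trees can be searched from one working copy. The sketch below uses a scratch repository in which the branches `old` and `new` and the file contents are stand-ins for the two trees compared above.

```shell
#!/bin/sh
set -eu

# Scratch repository; "old" and "new" stand in for the two trees being compared.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

mkdir -p cloudprovider/builder
printf 'glog.Fatalf("Failed to create %%q cloud provider: %%v", name, err)\n' \
    > cloudprovider/builder/cloud_provider_builder.go
git add cloudprovider
git commit -q -m "builder with the log string"
git branch -m old

# Hypothetical newer tree in which the log string no longer exists.
git checkout -q -b new
printf '// log string removed in this stand-in tree\n' \
    > cloudprovider/builder/cloud_provider_builder.go
git commit -q -a -m "builder without the log string"

needle='Failed to create %q cloud provider'

# -F treats the needle as a fixed string rather than a regular expression.
# git grep exits non-zero when nothing matches, hence the "|| true".
in_old=$(git grep -n -F -e "$needle" old -- cloudprovider || true)
in_new=$(git grep -n -F -e "$needle" new -- cloudprovider || true)

echo "old: ${in_old:-<no match>}"
echo "new: ${in_new:-<no match>}"
```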

frobware · 2019-03-28T09:18:14Z

/hold cancel

frobware · 2019-03-28T09:53:27Z

/cc @derekwaynecarr @smarterclayton

frobware · 2019-03-28T10:26:17Z

/hold

Waiting for feature freeze exception to be granted.

/cc @enxebre

smarterclayton · 2019-03-28T13:41:02Z

I will take a look at the commit history and see whether this matches what I get if I recreate it. Want to make sure we end up with a documentable process

frobware · 2019-03-28T14:20:37Z

> I will take a look at the commit history and see whether this matches what I get if I recreate it. Want to make sure we end up with a documentable process

I will try your approach too, as I think it uses less plumbing than git commit-tree.

smarterclayton · 2019-03-29T20:26:10Z

Ok, so I tried my steps with #78 and grabbed these. Basically:

- ran my steps using d54edf1 as the base instead of origin/master
- after I was done, used git cherry-pick a02da6e...d175e89 to get yours
- had to resolve cluster-autoscaler.spec not existing, everything else applied

What's different between mine and yours: 3ffad39..d175e89

Looks like the Makefile, vendor dir, hack, and a few others. Were those things you explicitly had cherry-picks for? If not, this highlights that those should also be their own cherry-picked commits.
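A comparison of this kind can be summarised by top-level path, as in the minimal sketch below. It runs in a scratch repository where the branches `mine` and `yours` and all file names are hypothetical stand-ins for the two rebase results.

```shell
#!/bin/sh
set -eu

# Scratch repository; "mine" and "yours" stand in for the two rebase results.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

echo base > README
git add README
git commit -q -m "base"
git branch -m mine

# The other result carries extra changes in a few areas.
git checkout -q -b yours
mkdir -p vendor/example hack
echo 'all:' > Makefile
echo lib > vendor/example/lib.go
echo script > hack/build.sh
git add Makefile vendor hack
git commit -q -m "extra changes present in only one result"

# With git diff, A..B compares the trees of the two endpoints directly.
changed=$(git diff --name-only mine..yours)

# Collapse the file list to top-level entries to see which areas differ.
areas=$(printf '%s\n' "$changed" | cut -d/ -f1 | LC_ALL=C sort -u)

printf '%s\n' "$areas"
```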

frobware · 2019-03-29T22:43:51Z

> Looks like the Makefile, vendor dir, hack, and a few others. Were those things you explicitly had cherry-picks for? If not, this highlights that those should also be their own cherry-picked commits.

I explicitly did not cherry-pick anything that was in cluster-autoscaler/vendor, knowing full well that I was going to revendor to pick up the latest openshift/cluster-api#20. There were three new commits that I made to make things compile again for 1.13 (cloudprovider API changes): d4ecf32, 07edb0b and d175e89.

enxebre · 2019-04-03T18:27:59Z

/lgtm

frobware · 2019-04-03T18:35:01Z

/hold cancel

Exception was granted as long as it merges by Friday 5th April.

frobware · 2019-04-03T20:19:42Z

/test e2e-aws-operator

frobware · 2019-04-03T20:19:49Z

/refresh

frobware · 2019-04-03T21:22:10Z

level=warning msg="Found override for ReleaseImage. Please be warned, this is not advised"
level=info msg="Consuming \"Install Config\" from target directory"
level=info msg="Creating infrastructure resources..."
level=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.ci-op-fmdrx7ht-1227b.origin-ci-int-aws.dev.rhcloud.com:6443..."
level=fatal msg="waiting for Kubernetes API: context deadline exceeded"

/retest

frobware · 2019-04-04T04:22:33Z

/retest

frobware · 2019-04-04T08:43:29Z

/retest

anjensan and others added 30 commits October 25, 2018 13:03

Move creation of PodLister & oom.Observer into separate func

8713582

Merge pull request kubernetes#1344 from anjensan/component-factories

c185529

Refactor - add factories for Recommender & ClusterFeedProvider

Move node group balancing to processor

6f5e6aa

The goal is to allow customization of this logic for different use-case and cloudproviders.

Add informational UncappedTarged field to VPA api.

283300e

Merge pull request kubernetes#1349 from bskiba/upcappedTarget

3bad45b

Add informational UncappedTarged field to VPA api.

Fix Fatalf format string

1894590

The error argument was omitted.

Add GKE-specific NodeGroupSet processor

01a56a8

Also refactor Balancing processor a bit to make it easily extensible.

Merge pull request kubernetes#1352 from justinsb/fix_fmt_string

3880db9

Fix Fatalf format string

Update FAQ on overprovisioning to account for k8s 1.11

1b50ebd

In k8s 1.11, pod priority, and preemption is enabled by default. The API is under `v1beta`, and they do not need to be enabled.

Merge pull request kubernetes#1355 from MrSaints/faq-overprovisioning…

4b7637b

…-in-1.11 Update FAQ on overprovisioning to account for k8s 1.11

NodeGroup.Nodes() return Instance struct instead instance name

41b0287

This is preparatory work for handling resource related (stockout/quota-exceeded) error conditions in CA.

Merge pull request kubernetes#1340 from losipiuk/lukaszos/nodegroup-n…

f980934

…odes-return-instance-struct-instead-instance-name-2063d NodeGroup.Nodes() return Instance struct instead instance name

Recommender capps recommendation according to policy.

21f76e1

Merge pull request kubernetes#1341 from MaciekPytel/gke_nodegroup_bal…

f341d8a

…ancing Move nodegroup balancing to processor, add GKE-specific implementation

Merge pull request kubernetes#1354 from bskiba/correctCapping

74e46ff

Recommender capps recommendation according to policy.

Merge pull request kubernetes#1356 from frobware/fix-test-flake-in-Te…

4a672d5

…stWaitForOp gce: increase test timeout in TestWaitForOp

Modify execution_latency_seconds buckets

dd1478e

Merge pull request kubernetes#1357 from bskiba/fix-metric-buckets

73e1a12

Modify execution_latency_seconds buckets

Fix broken link to VPA Admission Webhook readme

f1b0850

Merge pull request kubernetes#1360 from keithlayne/patch-1

3949ff9

Fix broken link to VPA Admission Webhook readme

Extract Backoff interface

e462d44

Pass on-event oomInfo without creating a new goroutine.

cd6566e

Merge pull request kubernetes#1361 from kgolab/vpa-oom-sync

c8533f4

Pass on-event oomInfo without creating a new goroutine.

add alibaba cloud doc link

4754ecc

Protect against negative totalWeight values

2ff85f8

Merge pull request kubernetes#1364 from kgolab/vpa-safe-subtract

3b204e9

Protect against negative totalWeight values

Use real-usage sample (exclude previous OOMs) to estimate memory usag…

c4348d6

…e after OOM.

Merge pull request kubernetes#1362 from AliyunContainerService/featur…

32bff90

…e/add-doc-link add alibaba cloud doc link

Merge pull request kubernetes#1363 from kgolab/vpa-oom-run-away

63f32b9

Use real-usage sample to estimate memory usage after OOM

UPSTREAM: <carry>: openshift: fix lint issue machineapi_provider

43d15a9

Add comment to pass lint. Conflicts: cluster-autoscaler/cloudprovider/openshiftmachineapi/machineapi_provider.go

openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 28, 2019

openshift-ci-robot requested review from derekwaynecarr and smarterclayton March 28, 2019 09:53

openshift-ci-robot requested a review from enxebre March 28, 2019 10:26

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 28, 2019

frobware mentioned this pull request Apr 3, 2019

[WIP] openshift-4.1-cluster-autoscaler-1.13 #74

Closed

openshift-ci-robot assigned enxebre Apr 3, 2019

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 3, 2019

openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 3, 2019

openshift-merge-robot merged commit 6ed090c into openshift:master Apr 4, 2019

frobware mentioned this pull request Apr 4, 2019

UPSTREAM: <carry>: openshift: clean up test setup #80

Merged

danwinship mentioned this pull request Apr 9, 2019

Handling of vendor/.../OWNERS with rebases is annoying kubernetes/test-infra#12136

Closed

frobware deleted the merge-upstream-cluster-autoscaler-release-1.13 branch April 12, 2019 15:21

frobware mentioned this pull request Jun 19, 2019

Rebase to upstream/cluster-autoscaler-release-1.14 #107

Merged
