Skip to content

Conversation

frobware
Copy link

@frobware frobware commented Mar 25, 2019

The PR was created by first taking upstream/cluster-autoscaler-release-1.13 as the base then applying UPSTREAM: <carry> patches on top. The set of patches applied was taken from:

$ git log --no-merges --format=oneline  upstream/cluster-autoscaler-release-1.13..openshift/master  

To create the merge commit I have used the following steps:

$ git remote update
$ git checkout upstream/cluster-autoscaler-release-1.13
$ git checkout -b merge
$ git checkout openshift/master
$ echo 'merge upstream/cluster-autoscaler-release-1.13' | git commit-tree merge^{tree} -p HEAD -p merge
deadbeef12345678
$ git checkout deadbeef12345678
$ git cherry-pick ...all-the-things...

For details on the merge^{tree} syntax please read this documentation.

The result of git commit-tree in this PR is 810bb14

I then applied all the carry commits on top. There were additional changes required to the openshiftmachine cloud provider to make it build again. These changes are captured in the following commits: d4ecf32, 07edb0b and d175e89 and should be the focus of any review as all the other carry commits come from master.

This PR also revendors openshift/cluster-api based on openshift/cluster-api#20.

As we have not been 100% consistent with the naming of our carry commits I have added the UPSTREAM: <carry>: openshift prefix to those commits that did not have this convention. This will make it easier to identify carry commits for the next rebase.

anjensan and others added 30 commits October 25, 2018 13:03
Refactor - add factories for Recommender & ClusterFeedProvider
The goal is to allow customization of this logic
for different use-case and cloudproviders.
Add informational UncappedTarged field to VPA api.
The error argument was omitted.
Also refactor Balancing processor a bit to make it easily extensible.
In k8s 1.11, pod priority, and preemption is enabled by default.
The API is under `v1beta`, and they do not need to be enabled.
…-in-1.11

Update FAQ on overprovisioning to account for k8s 1.11
This is preparatory work for handling resource related
(stockout/quota-exceeded) error conditions in CA.
…odes-return-instance-struct-instead-instance-name-2063d

NodeGroup.Nodes() return Instance struct instead instance name
On local hardware I have not seen this test fail using the current
50ms timeout. On AWS/CI I see this fail occasionally; I built and
copied this test to an AWS node and run it repeatedly for ~1 hour. The
min was 5ms and the max was 268ms, so bumping the timeout to 500ms.

Signed-off-by: Andrew McDermott <[email protected]>
…ancing

Move nodegroup balancing to processor, add GKE-specific implementation
Recommender capps recommendation according to policy.
…stWaitForOp

gce: increase test timeout in TestWaitForOp
Modify execution_latency_seconds buckets
Fix broken link to VPA Admission Webhook readme
Pass on-event oomInfo without creating a new goroutine.
Protect against negative totalWeight values
…e/add-doc-link

add alibaba cloud doc link
Use real-usage sample to estimate memory usage after OOM
Add comment to pass lint.

Conflicts:
	cluster-autoscaler/cloudprovider/openshiftmachineapi/machineapi_provider.go
@frobware
Copy link
Author

frobware commented Mar 28, 2019

Old version had '"Failed to create %q cloud provider" in the logging string. Might be helpful in this file as well.

I don't see this in (openshift/master) machineapi_provider.go, but I do see it here:

$ ag "Failed to create %q cloud provider" 
cloudprovider/builder/cloud_provider_builder.go
230:            glog.Fatalf("Failed to create %q cloud provider: %v", name, err)

And it is true that it no longer exists in upstream/cluster-autoscaler-release-1.13.

aim@spicy:~/go-projects/autoscaler-merge/src/k8s.io/autoscaler
$ ag "Failed to create %q cloud provider"

As cloud_provider_builder.go is upstream code I'm inclined to leave it as is.

@frobware
Copy link
Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 28, 2019
@frobware
Copy link
Author

/cc @derekwaynecarr @smarterclayton

@frobware
Copy link
Author

/hold

Waiting for feature freeze exception to be granted.

/cc @enxebre

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 28, 2019
@smarterclayton
Copy link

I will take a look at the commit history and see whether this matches what I get if I recreate it. Want to make sure we end up with a documentable process

@frobware
Copy link
Author

I will take a look at the commit history and see whether this matches what I get if I recreate it. Want to make sure we end up with a documentable process

I will also try your approach too as I think it uses less plumbing compared to git commit-tree.

@smarterclayton
Copy link

Ok, so tried my steps with #78 and grabbed these. Basically:

  1. ran my steps using d54edf1 as the base instead of origin/master
  2. after I was done, used git cherry-pick a02da6e...d175e89 to get yours
  3. had to resolve cluster-autoscaler.spec not existing, everything else applied

What's different between mine and yours: 3ffad39..d175e89

Looks like makefile, vendor dir, hack, a few others. Were those things you explicitly had cherry-picks for? If not, this highlighted that those should also be their own cherrypicked commits.

@frobware
Copy link
Author

frobware commented Mar 29, 2019

Looks like makefile, vendor dir, hack, a few others. Were those things you explicitly had cherry-picks for? If not, this highlighted that those should also be their own cherrypicked commits.

I explicitly did not cherry-pick anything that was in cluster-autoscaler/vendor knowing full well that I was going to revendor to pick up latest openshift/cluster-api#20. There were three new commits that I made to make things compile again for 1.13 (cloudprovider API changes): d4ecf32, 07edb0b and d175e89

@enxebre
Copy link
Member

enxebre commented Apr 3, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 3, 2019
@frobware
Copy link
Author

frobware commented Apr 3, 2019

/hold cancel

Exception was granted as long as it merges by Friday 5th April.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 3, 2019
@frobware
Copy link
Author

frobware commented Apr 3, 2019

/test e2e-aws-operator

@frobware
Copy link
Author

frobware commented Apr 3, 2019

/refresh

@frobware
Copy link
Author

frobware commented Apr 3, 2019

level=warning msg="Found override for ReleaseImage. Please be warned, this is not advised"
level=info msg="Consuming \"Install Config\" from target directory"
level=info msg="Creating infrastructure resources..."
level=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.ci-op-fmdrx7ht-1227b.origin-ci-int-aws.dev.rhcloud.com:6443..."
level=fatal msg="waiting for Kubernetes API: context deadline exceeded"

/retest

@frobware
Copy link
Author

frobware commented Apr 4, 2019

/retest

1 similar comment
@frobware
Copy link
Author

frobware commented Apr 4, 2019

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.