maxcao13
Member

@maxcao13 maxcao13 commented Oct 1, 2025

This commit rebases the autoscaler on top of the Kubernetes/Autoscaler 1.34.0 release. There are several commits that we carry on top of the upstream autoscaler and the rebase process allows us to preserve those. Here is a description of the process I used to create this PR.

(Inspired by the commit description for the 1.18 rebase, PR #139.)

Process

First, we need to identify the carry commits that we currently have. This is done against our previous rebase to catch new changes. Once they are identified, we drop the commits that have merged upstream and carry only the unique ones (see below for the carried and dropped commits).

Identify the carry commits (run from the openshift/main branch). These are the commits that begin with UPSTREAM:, up to the merge commit of the previous rebase (merge upstream/cluster-autoscaler-release-1.33 into main):

$ git log -n 30 --oneline --no-merges
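
As a rough sketch of the filtering step, the log output can be narrowed to the lines whose subject starts with UPSTREAM:. The function name list_carries is hypothetical and not part of the repository; it only assumes input in `git log --oneline` form ("<sha> <subject>").

```shell
# Hypothetical helper: keep only commits whose subject starts with "UPSTREAM:".
# Reads `git log --oneline` style lines ("<sha> <subject>") on stdin.
list_carries() {
  # `|| true` keeps the exit status at 0 when no line matches.
  grep -E '^[0-9a-f]+ UPSTREAM: ' || true
}

# Example usage (assumes the checkout is at the tip of openshift/main):
#   git log -n 30 --oneline --no-merges | list_carries
```

The list still has to be cut off by hand at the previous rebase's merge commit, since `--no-merges` hides that commit from the output.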

After identifying the carry commits, the next step is to create the new merge commit (with git commit-tree) that will be used for the rebase, and then cherry-pick the carry commits into the new branch. The following commands cover these steps:

$ git remote update # make sure we update our refs
$ git checkout cluster-autoscaler-1.34.0
$ git checkout -b merge-tmp # create a temporary branch for our merge commit
$ git checkout openshift/main # we want to be at the tip of the openshift main branch when we run the next command
$ echo 'merge upstream/cluster-autoscaler-1.34.0' | git commit-tree merge-tmp^{tree} -p HEAD -p merge-tmp -F - # create a new merge commit for our history
deadbeef12345678 # id of new merge commit
$ git branch merge-1.34 deadbeef12345678 # create a new branch for the cherry-pick work
$ git checkout merge-1.34
$ git cherry-pick <carry commits> # cherry pick the needed commits into the new branch

With the merge-1.34 branch in place, I cherry-picked the carry commits that applied, resolved merge conflicts, and finally tested the resulting tree against the unit test and end-to-end suites.
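
To illustrate what the commit-tree step does, here is a minimal sketch in a throwaway repository. The branch names ours, upstream, and merge-demo are stand-ins chosen for this example (for openshift/main, the 1.34.0 tag, and merge-1.34 respectively); nothing here touches the real repository. The point it shows is that the synthetic merge commit takes its tree entirely from the upstream side while keeping both histories as parents.

```shell
set -eu

# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "ours" stands in for openshift/main.
git symbolic-ref HEAD refs/heads/ours
echo base > file.txt
git add file.txt
git commit -q -m 'base'

# "upstream" stands in for the upstream release tag.
git checkout -q -b upstream
echo upstream > file.txt
git commit -q -am 'upstream change'

# A downstream-only commit on "ours".
git checkout -q ours
echo carry > carry.txt
git add carry.txt
git commit -q -m 'UPSTREAM: <carry>: demo'

# Build a merge commit whose tree is exactly upstream's tree,
# with the downstream tip and the upstream tip as its two parents.
merge=$(echo 'merge upstream' | git commit-tree 'upstream^{tree}' -p HEAD -p upstream -F -)
git branch merge-demo "$merge"

merge_tree=$(git rev-parse "$merge^{tree}")
upstream_tree=$(git rev-parse 'upstream^{tree}')
echo "merge tree:    $merge_tree"
echo "upstream tree: $upstream_tree"
```

Because the merge commit's tree matches upstream, the downstream-only carry.txt is absent from it. That is why the carry commits have to be cherry-picked back on top afterwards.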

Carried Commits

These commits are for features which have not yet been accepted upstream, are integral to our CI platform, or are specific to the releases we create for OpenShift.

0abcccedf UPSTREAM: <carry>: Update to prefer upstream annotations if present
8e00fc00f UPSTREAM: <carry>: Fix unstructured taint parsing in Cluster API provider
c3ac09e29 UPSTREAM: <carry>: add machine api label and taint functionality
63602f348 UPSTREAM: <carry>: Have VPA ignore phantom containers named "POD"
4c5e260a7 UPSTREAM: <carry>: Handle old Machine API specific machine delete annotation
e0f1bb728 UPSTREAM: <carry>: Rename FailureMessage to ErrorMessage
f449d81b8 UPSTREAM: <carry>: vendor deps for OpenShift releases
dd1b86512 UPSTREAM: <carry>: configure repository for OpenShift releases
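
If the list above is in `git log` order (newest first, which appears to match the commit timeline on this PR), the SHAs need to be reversed before they are passed to git cherry-pick. A small sketch, with carry_shas_oldest_first as a hypothetical helper name:

```shell
# Hypothetical helper: read "<sha> <subject>" lines (newest first) on stdin
# and print only the SHAs, oldest first, ready for `git cherry-pick`.
carry_shas_oldest_first() {
  awk '{ shas[NR] = $1 } END { for (i = NR; i >= 1; i--) print shas[i] }'
}

# Example usage (carries.txt is an assumed file holding the list above):
#   git cherry-pick $(carry_shas_oldest_first < carries.txt)
```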

Squashed Commits

These commits were squashed into their topically related carried commits to help reduce the length of our history.

9b4150aa0 UPSTREAM: <carry>: Remove OWNERS automation preamble

Dropped Commits

These commits were dropped, either because they have merged upstream or because the carry is no longer needed.

8f15f9147 UPSTREAM: 8396: Fix balancer & CA kwok build / govet errors
4d42a1d4d UPSTREAM: <carry>: revert capacity annotations

Of special note in this rebase is this dropped commit:

4d42a1d4d UPSTREAM: <carry>: revert capacity annotations

Due to the scale-from-zero changes being accepted upstream, we can now drop our carried patch. However, the upstream implementation differs slightly from ours (mainly around annotation names), so we will need to carry this patch until we can fix all the providers to properly use the new annotations. This patch can be dropped once the epic contained in https://issues.redhat.com/browse/OCPCLOUD-2136 is completed.
Update: The note above is from one of the previous rebases. The epic is now completed, so we are now dropping this carry.
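
The subjects in the lists above follow two visible patterns: "UPSTREAM: <carry>:" for downstream-only patches and "UPSTREAM: <number>:" for picks of an upstream PR (for example 8396). A sketch of sorting subjects by that prefix is below; classify_subject and its output labels are hypothetical, and reading a numeric prefix as an upstream PR number is an assumption based on the commits listed here.

```shell
# Hypothetical helper: classify a commit subject by its UPSTREAM: prefix.
classify_subject() {
  case "$1" in
    'UPSTREAM: <carry>:'*)    echo carry ;;          # downstream-only patch
    'UPSTREAM: '[0-9]*':'*)   echo upstream-pick ;;  # assumed: pick of an upstream PR
    *)                        echo other ;;
  esac
}

classify_subject 'UPSTREAM: 8396: Fix balancer & CA kwok build / govet errors'
classify_subject 'UPSTREAM: <carry>: revert capacity annotations'
```

An upstream pick is a natural candidate to drop once the new upstream release contains the change. A carry needs a case-by-case decision, as with the capacity annotations patch above.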

k8s-ci-robot and others added 30 commits June 12, 2025 12:36
Remove redundant warning in the autoscaler Cluster API documentation
fix: add missing 'admission-controller-service' resource to 'hack/vpa-process-yamls.sh print'
…-groups

Add created node group to considered node groups during scale-up
Update deployment.yaml to add volumeattachments permission
Fix(VPA): updater in-place metrics initialization
fix(VPA): Do not update webhook CA when registerWebhook is disabled
Signed-off-by: Yuriy Losev <[email protected]>
[VPA] Use factory start to fill caches instead of separate informers
…t-success

OCI provider: Avoid interpreting HTTP 404 as success on delete
…-cloud-endpoint-reloving

fix bug 8168 GetEndpoint resolving fail
…e-terminate-by-default

feat: cordon node before terminate by default
this change adds debug logs at level 5 to aid in triaging failed node
balancing. It adds logs to help determine why two node groups are not
considered as similar. These logs can be quite noisy so the logging
level has been set to 5 by default.
AEP-7862: Decouple Startup CPU Boost from VPA modes - updates
* add h4d pricing

* fix go fmt

* revert gofmt on other files
cluster-autoscaler: add logging for failed node balancing
./hack/update-deps.sh v1.34.0-alpha.1 v1.34.0-alpha.1 https://github.com/kubernetes/kubernetes.git
hack/update-codegen.sh
BigDarkClown and others added 15 commits September 25, 2025 13:45
Fix capacity buffers injector order in pod list processor
This change carries files and modifications that are used by OpenShift
release infrastructure and related files.

* spec file
* dockerfiles
  * vertical-pod-autoscaler/Dockerfile.rhel
  * vertical-pod-autoscaler/Dockerfile.openshift
  * images/cluster-autoscaler/Dockerfile
  * images/cluster-autoscaler/Dockerfile.rhel
* hack scripts (ci and build related)
* Makefile
* JUnit tools
* update gitignore
* update/remove OWNERS files
* ci-operator config yaml
* remove gitignore file from vertical-pod-autoscaler (allow vendor
  addition)
* add Snyk file to exclude vendor directories and problematic cloud
  providers on scan
Add vendor folders
  * cluster-autoscaler
  * balancer
  * vertical-pod-autoscaler
  * vertical-pod-autoscaler/e2e
…otation

The delete annotation upstream has a different format, but is now
inferred dynamically from the API group. If we update this in MAO to use
the new format, we can drop this old key
This change re-adds the machine api support for labels and taints on
node groups. The code was removed upstream as it is openshift specific,
see this pull request[0].

It also adds in the functionality of the upstream override annotation
for labels and taints[1] to support
https://issues.redhat.com/browse/MIXEDARCH-259

[0]: kubernetes#5249
[1]: kubernetes#5382
…ider

This change corrects the behavior for parsing taints from the
unstructured scalable resource. This is required on OpenShift as our
implementation is slightly different from the upstream.
Also:
* Add unit tests for upstream annotations
* Update unit tests using upstream annotations new values
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 1, 2025
@openshift-ci-robot

openshift-ci-robot commented Oct 1, 2025

@maxcao13: This pull request references AUTOSCALE-335 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from jkyros and joelsmith October 1, 2025 07:42

openshift-ci bot commented Oct 1, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jkyros for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Oct 1, 2025

openshift-ci bot commented Oct 1, 2025

The following users are mentioned in OWNERS file(s) but are untrusted for the following reasons. One way to make the user trusted is to add them as members of the openshift org. You can then trigger verification by writing /verify-owners in a comment.

  • sig-autoscaling-vpa-reviewers
    • User is not a member of the org. User is not a collaborator. Satisfy at least one of these conditions to make the user trusted.
  • sig-autoscaling-leads
    • User is not a member of the org. User is not a collaborator. Satisfy at least one of these conditions to make the user trusted.


openshift-ci bot commented Oct 1, 2025

@maxcao13: The following tests failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-periodic-pre | 09260d1 | link | false | /test e2e-aws-periodic-pre |
| ci/prow/okd-scos-e2e-aws-ovn | 09260d1 | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/govet | 09260d1 | link | true | /test govet |
| ci/prow/e2e-azure-operator | 09260d1 | link | false | /test e2e-azure-operator |
| ci/prow/images | 09260d1 | link | true | /test images |
| ci/prow/okd-scos-images | 09260d1 | link | true | /test okd-scos-images |
| ci/prow/e2e-hypershift | 09260d1 | link | true | /test e2e-hypershift |
| ci/prow/e2e-gcp-operator | 09260d1 | link | false | /test e2e-gcp-operator |
| ci/prow/git-history | 09260d1 | link | false | /test git-history |
| ci/prow/e2e-aws | 09260d1 | link | true | /test e2e-aws |
| ci/prow/security | 09260d1 | link | true | /test security |
| ci/prow/unit | 09260d1 | link | true | /test unit |
| ci/prow/e2e-aws-operator | 09260d1 | link | true | /test e2e-aws-operator |
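
When only some of the failures block a merge, it can help to pull out the rerun commands for the required jobs. The sketch below assumes whitespace-separated rows of the form "<test name> <commit> <details> <required> /test <job>", as copied from a report like the one above. required_reruns is a hypothetical helper, not a Prow feature.

```shell
# Hypothetical helper: print the rerun command for each required failing job.
# Expects rows like: "ci/prow/govet 09260d1 link true /test govet"
required_reruns() {
  awk '$4 == "true" { print $5, $6 }'
}

# Example with two rows from the report:
printf '%s\n' \
  'ci/prow/govet 09260d1 link true /test govet' \
  'ci/prow/git-history 09260d1 link false /test git-history' \
  | required_reruns
```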

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@maxcao13
Member Author

maxcao13 commented Oct 1, 2025

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 1, 2025
@maxcao13 maxcao13 changed the title from "AUTOSCALE-335: AUTOSCALE-336: rebase on upstream 1.34.0 release" to "WIP: AUTOSCALE-335: AUTOSCALE-336: rebase on upstream 1.34.0 release" Oct 1, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 1, 2025
@maxcao13
Member Author

maxcao13 commented Oct 2, 2025

Closing in favour of #386

@maxcao13 maxcao13 closed this Oct 2, 2025
@maxcao13 maxcao13 deleted the merge-1.34 branch October 2, 2025 21:41