joelsmith commented Oct 2, 2025

Started out with this command:

rebasebot --source https://github.com/kubernetes/autoscaler:cluster-autoscaler-release-1.34 \
          --dest openshift/kubernetes-autoscaler:main \
          --rebase joelsmith/autoscaler:rebase-bot-main \
          --tag-policy=strict \
          --github-user-token <(yaml2json ~/.config/hub | jq -r '."github.com"[0].oauth_token') \
          --dry-run

After it was done, I removed the cherry-picks and manually cherry-picked the set of patches it identified so that I could manually resolve merge conflicts.

I squashed "Remove OWNERS automation preamble" into "configure repository for OpenShift releases".

Most of the cherry-picks required minimal or no changes, but "Fix unstructured taint parsing in Cluster API provider" required substantial changes because upstream PR kubernetes#8536 refactored much of the Cluster API provider's test framework.

vflaux and others added 30 commits June 20, 2025 16:04
fix(VPA): Do not update webhook CA when registerWebhook is disabled
Signed-off-by: Yuriy Losev <[email protected]>
[VPA] Use factory start to fill caches instead of separate informers
…t-success

OCI provider: Avoid interpreting HTTP 404 as success on delete
…-cloud-endpoint-reloving

fix bug 8168 GetEndpoint resolving fail
…e-terminate-by-default

feat: cordon node before terminate by default
this change adds debug logs at level 5 to aid in triaging failed node
balancing. It adds logs to help determine why two node groups are not
considered as similar. These logs can be quite noisy so the logging
level has been set to 5 by default.
AEP-7862: Decouple Startup CPU Boost from VPA modes - updates
* add h4d pricing

* fix go fmt

* revert gofmt on other files
cluster-autoscaler: add logging for failed node balancing
./hack/update-deps.sh v1.34.0-alpha.1 v1.34.0-alpha.1 https://github.com/kubernetes/kubernetes.git
hack/update-codegen.sh
As discussed in the sig-autoscaling meeting on 2025-06-30, this
is to try to follow a similar pattern to the KEP process by getting a
tech lead's buy-in before merging an AEP.
…s-approvers-for-aeps

Give sig-autoscaling-leads approval of the AEP directory
…ode-groups-from-balancing

Filter out non-existing node-groups before scale-up balancing
k8s-ci-robot and others added 10 commits September 26, 2025 06:12
Fix capacity buffers injector order in pod list processor
…test-in-docker`

`make test-in-docker` was changed to disable the printf analyzer, but
`make test-unit` wasn't for some reason. The current master isn't
compatible with the printf analyzer, so `make test-unit` fails on master
without this change.
…erry-pick-8552-to-cluster-autoscaler-release-1.34

[cluster-autoscaler-release-1.34] Allow atomic scale down of partially healthy node groups
TestNodeLoadFromExistingTaints creates a currentTime variable set to time.Now(),
and a bunch of test objects with time values offset from that variable. This is
all standard practice, but then the test iterates over test cases, calls t.Parallel(),
and overwrites currentTime with time.Now() again. This makes go test -race fail,
because multiple goroutines are writing currentTime at once. It also
doesn't seem to make sense in the context of the test, because the other
test objects are still offset from the original value.

Removing the second write to currentTime seems to be the correct fix here. Also
renamed one import because it collided with a local variable name used throughout
this test file.
…erry-pick-8584-to-cluster-autoscaler-release-1.34

[cluster-autoscaler-release-1.34] Change `make test-unit` to have the same go test parameters as `make test-in-docker`
…erry-pick-8588-to-cluster-autoscaler-release-1.34

[cluster-autoscaler-release-1.34] Fix a race condition in TestNodeLoadFromExistingTaints
The DRA scheduler plugin is enabled by default since 1.34. We have to
hack it to be disabled if the CA DRA logic is disabled via the flag.
Without this, the DRA scheduler plugin is enabled but not set up
properly, and panics.
…erry-pick-8598-to-cluster-autoscaler-release-1.34

[cluster-autoscaler-release-1.34] Fix DRA enablement logic
openshift-ci-robot added the jira/valid-reference label Oct 2, 2025

openshift-ci-robot commented Oct 2, 2025

@joelsmith: This pull request references AUTOSCALE-335 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci bot requested review from elmiko and maxcao13 October 2, 2025 21:38

openshift-ci bot commented Oct 2, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign maxcao13 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

joelsmith and others added 7 commits October 2, 2025 15:46
This change carries files and modifications that are used by OpenShift
release infrastructure and related files.

* spec file
* dockerfiles
  * vertical-pod-autoscaler/Dockerfile.rhel
  * vertical-pod-autoscaler/Dockerfile.openshift
  * images/cluster-autoscaler/Dockerfile
  * images/cluster-autoscaler/Dockerfile.rhel
* hack scripts (ci and build related)
* Makefile
* JUnit tools
* update gitignore
* update/remove OWNERS files
* ci-operator config yaml
* remove gitignore file from vertical-pod-autoscaler (allow vendor
  addition)
* add Snyk file to exclude vendor directories and problematic cloud
  providers on scan
Add vendor folders
  * cluster-autoscaler
  * balancer
  * vertical-pod-autoscaler
  * vertical-pod-autoscaler/e2e

for i in cluster-autoscaler balancer vertical-pod-autoscaler vertical-pod-autoscaler/e2e; do pushd $i; go mod tidy; go mod vendor; popd; done
…otation

The upstream delete annotation has a different format, but it is now
inferred dynamically from the API group. If we update MAO to use
the new format, we can drop this old key.
This change re-adds the Machine API support for labels and taints on
node groups. The code was removed upstream because it is OpenShift-specific;
see this pull request[0].

It also adds in the functionality of the upstream override annotation
for labels and taints[1] to support
https://issues.redhat.com/browse/MIXEDARCH-259

[0]: kubernetes#5249
[1]: kubernetes#5382
the upstream annotations for the scale-from-zero capacity resources are
slightly different from the openshift implementation. the largest
difference is the addition of a gpu type annotation. openshift does not
yet utilize this annotation and thus this patch should be carried until
the machineset controllers for the various providers on openshift have
been modified to use the new annotations.

another important change is the modification of the memory annotation.
previously in openshift we expected this value to be a count of memory
in Mebibytes. the conversion function and tests have been modified to
allow continued openshift operation.

this change can be dropped when the annotations in openshift have been
updated. the progress for this effort can be followed at
https://issues.redhat.com/browse/OCPCLOUD-944
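
As a rough illustration of the memory handling described above, the following sketch converts an annotation value holding a whole number of mebibytes into bytes. It is an assumption-laden example, not the carried patch: `memoryMiBToBytes` is a hypothetical name, the real provider code is not shown in this PR conversation, and it likely works with Kubernetes resource quantities rather than a bare int64.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// bytesPerMiB is the number of bytes in one mebibyte (2^20).
const bytesPerMiB int64 = 1024 * 1024

// memoryMiBToBytes parses an annotation value that is expected to be a
// plain integer count of MiB and returns the equivalent number of bytes.
// Hypothetical helper for illustration only.
func memoryMiBToBytes(value string) (int64, error) {
	mib, err := strconv.ParseInt(value, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("memory annotation %q is not an integer MiB count: %w", value, err)
	}
	if mib < 0 {
		return 0, fmt.Errorf("memory annotation %q must not be negative", value)
	}
	// Reject values whose byte count would overflow int64.
	if mib > math.MaxInt64/bytesPerMiB {
		return 0, fmt.Errorf("memory annotation %q is too large", value)
	}
	return mib * bytesPerMiB, nil
}
```

Under these assumptions, a value of "16384" is read as 16384 MiB, while a suffixed quantity string such as "16Gi" is rejected instead of being silently misread.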
JoelSpeed and others added 2 commits October 2, 2025 23:11
…ider

This change corrects the behavior for parsing taints from the
unstructured scalable resource. This is required on OpenShift as our
implementation is slightly different from the upstream.
Also:
* Add unit tests for upstream annotations
* Update unit tests using upstream annotations new values
joelsmith changed the title from "AUTOSCALE-335: AUTOSCALE-336: 1.34.0 upstream rebase" to "AUTOSCALE-335,AUTOSCALE-336: 1.34.0 upstream rebase" Oct 3, 2025

openshift-ci-robot commented Oct 3, 2025

@joelsmith: This pull request references AUTOSCALE-335 which is a valid jira issue.

This pull request references AUTOSCALE-336 which is a valid jira issue.

openshift-ci bot commented Oct 3, 2025

@joelsmith: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                     Commit   Details  Required  Rerun command
ci/prow/okd-scos-e2e-aws-ovn  d67a63f  link     false     /test okd-scos-e2e-aws-ovn
ci/prow/unit                  d67a63f  link     true      /test unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
