@upodroid upodroid commented Oct 2, 2025

  • Bump addon manager image to v9.1.8
  • protect against race between deletion and adding finalizers
  • chore(kubelet): migrate pluginmanager to contextual logging
  • kubectl/logs: Add LogOptions.RunLogsContext
  • chore(kubelet): migrate config to contextual logging.
  • chore(kubelet): migrate prober to contextual logging.
  • Replace deprecated WaitForServiceEndpointsNum
  • ServiceCIDR ValidationAdmissionPolicy for backwards compatible behavior
  • Replace usage of deprecated ErrWaitTimeout with recommended method across all Pkgs
  • Add jefftree to OWNERS
  • Add newline to fix owners fmt
  • Move kubelet config code to kubeletconfig
  • Move ContainerRuntimeOptions flags to cmd/kubelet/app/options
  • PSI test: add a CPU limit of 500m to cpu-stress-pod
  • adds a list of available HTTP endpoints for the kube-controller-manager component under the /statusz page
  • Remove rbd image and storage class
  • kubectl: include container fieldPath in event messages
  • feat: Add discovery check to SVM to ensure migration doesn't get stuck
  • Add doc.go and ARCHITECTURE.md to client-go
  • update gofmt
  • Apply feedback
  • kube-proxy: list available endpoints in /statusz
  • util/sets: simplify List() by using slices.Sort
  • util/sets: benchmark List()
  • client-go leader-election: structured, contextual logging
  • kubelet: fix error message for EnableNodeLogQuery
  • Update MAP storage version to use v1beta1.
  • update openapi spec
  • Fix the spelling error of grpc in the log
  • Add HirazawaUi as a reviewer for sig-node
  • Evict terminated pods on disk pressure
  • Test terminated pods are evicted on disk pressure
  • TerminatedPodsEvictionOnDiskPressure e2e node test
  • Add ImageGCTerminatedPodsEviction e2e node test
  • Remove terminated pods eviction code
  • ImageGCTerminatedPodsContainersCleanup e2e node test
  • Remove sleepAfterExecuting param from diskConsumingPod
  • separate resource-quota and limit-ranger resize tests
  • Add unit tests to isResourceUpdatable
  • deflake e2e test: Services should implement NodePort and HealthCheckNodePort correctly when ExternalTrafficPolicy changes
  • remove v1beta3 flowcontrol from rest storage
  • refactor(event): simplify conditional logic in event handling for both v1 and eventsv1 APIs
  • return an error in case nil selectors are passed to matcher functions
  • Update pod resize test to accept new cpu.weight conversion.
  • Replace deprecated strings.Title with cases.Title
  • Update volume/iscsi base image from fedora:38 to fedora:42
  • ci: remove httpd usage while using agnhost instead
  • add paths section to kubelet statusz endpoint
  • Move interfaces: Handle and Plugin and related types from kubernetes/kubernetes to staging repo kube-scheduler
  • Fix race in movePodsToActiveOrBackoffQueue
  • Update Nautilus test agnhost images from 2.33 to 2.56 - Update VERSION to 1.8 - Addresses issue kubernetes/kubernetes#131874 for updating outdated base images
  • fix kubectl exec command in cmd test
  • Omit value type from validation rule failures
  • apply integration test: fix ordering test flake
  • Disable estimating resource size for resources with watch cache disabled
  • Add Kubelet stress test for pod cleanup when rejection due to VolumeAttachmentLimitExceeded (kubernetes/kubernetes#133357)
  • Enforce that all resources set resourcePrefix
  • Move actuated resources state to kuberuntime.Manager
  • hpa: prevent integer overflow in external metrics sum
  • Lock AllowOverwriteTerminationGracePeriodSeconds
  • kubelet/metrics: misc optimization
  • kubelet/metrics: fix multiple Register call
  • Switch to resourceVersion controller
  • Revert trapping TERM for podWithCommand
  • Resolve confusing use of TooManyRequests error for eviction (kubernetes/kubernetes#133097)
  • Update client-go compatibility matrix to include releases up to 1.34
  • fix: Only warn for unrecognized formats on type=string
  • fix typo for sattsfied
  • Add k8s-long-name, k8s-short-name format validation tags
  • Fix incorrect description of feature PodObservedGenerationTracking
  • kubelet: migrate utils to contextual logging
  • kubelet: migrate module logs to contextual logging
  • Remove redundant experimental prefix in wait command
  • chore(kubelet): migrate watchdog to contextual logging
  • chore(kubelet): migrate container to contextual logging
  • Change WaitForNamedCacheSync to WaitForNamedCacheSyncWithContext.
  • Apply feedback
  • Update SVM Discovery checks in response to jpbetz and stlaz
  • Close container runtime connections after use
  • [client-go] [cli-runtime] [133916]: handle properly config override logic when override provides ClientKey, ClientCertificate
  • fix typo for forceDetachTimeoutExpired
  • chore(kubelet): migrate metrics to contextual logging.
  • [client-go] [cli-runtime] [133916]: handle properly config override logic when override provides ClientKey, ClientCertificate: use values from overrides when one of the fields (file or data) is present in overrides
  • DRA kubelet: avoid deadlock when gRPC connection to driver goes idle
  • add paths section to scheduler statusz endpoint
  • Remove getLocalNode to fix GracefulNodeShutdown e2e.
  • chore(kubelet): migrate stats to contextual logging
  • chore: Clean up duplicate logs
  • scheduler_perf: reset and stop testing.B metrics
  • scheduler_perf: block after creating ResourceSlices
  • Populate memory requests from actuated resources at pod status generation time
  • Validate kubelet serving cert in local-up-cluster
  • Revert "Add retries to node's crictl test"
  • fix CI failure: update pod image using the same one
  • Fix ClusterIP load balancer disappearing when InternalTrafficPolicy: Local is set.
  • e2e_node kubelet configuration: merge feature gates and system-reserved items
  • Migrate kubelet/server to contextual logging
  • Add v1.34.0 API testdata
  • delete v1.32.0 testdata
  • scheduler_perf: measure DRA setup time
  • fix race condition in kubelet's PodCertsManager
  • Put the nfacct e2e test back under the "KubeProxy" label
  • Add comments to generated code
  • Minor validator name-string fix
  • Update Context comments and fix some usage
  • validation: Use JSON names in paths
  • run "hack/update-codegen.sh valid"
  • Unions: replace [2]string with a type
  • Add ListSelector in validation Context
  • Break processFieldMemberValidations into 2 funcs
  • Rename "fields" to "members"
  • Improve error reporting in item tag
  • Parse path early, clean up getDisplayFields()
  • Clarify that union has field- or item-members
  • Sort item criteria to match listmap key order
  • Make item validation just use a TagValidator
  • Refactor ItemTagValidator.GetValidations a bit
  • update prometheus' client_golang and common packages
  • switch our usage of expfmt.TextParser
  • Revert "protect against race between deletion and adding finalizers"
  • Emit ratchet check for fields with a type func
  • Don't ratchet-check inside type functions
  • Pass equiv func to subfield, like item and eachVal
  • Temporary: Re-enable listmap uniqueness checks
  • Re-disable listmap uniqueness (for now)
  • DRA: Fix PrioritizedList scheduler perf test
  • refactor(validation-gen): move list-related validators to list.go
  • feat(validation-gen): support unique tag on list
  • Add tests for unique tag combo & update-codegen
  • Fix negative pod startup duration
  • Fix fake runtime's image pull
  • CHANGELOG: Update directory for v1.32.9 release
  • CHANGELOG: Update directory for v1.33.5 release
  • CHANGELOG: Update directory for v1.34.1 release
  • CHANGELOG: Update directory for v1.31.13 release
  • DRA: Fix ConsumableCapacity scheduler perf test (simplified)
  • Cleanup enabling resource size estimate
  • migrate kubelet/certificate to contextual logging
  • Fix minor inconsistencies in scheduler
  • scheduler_perf: detect testcases with no pods scheduled
  • DRA scheduler_perf: clean up usage of steady-state pod scheduling
  • Use increaseRV in TestWatchStreamSeparation to imply external RV increase
  • Fix tests to only access keys from under resourcePrefix
  • fixed bug such that implicit extended resource name can always be used, whether or not the explicit extendedResourceName field in device class is set.
  • kcm/app: Add proper goroutine management
  • fix integration test
  • Add support for UUID format.
  • Fix flaky resource claim metrics test
  • Bump kube-openapi
  • Enable openapi model name accessor generator
  • Add model name generator tags
  • generate
  • Update violation exceptions
  • stop using util.ToRESTFriendlyName in favor of declared model names
  • Add tests
  • Update tests that depend on internal model names
  • Update sample-apiserver and examples
  • applyconfiguration-gen: preserve struct and field comments in generated code
  • applyconfiguration-gen: remove "Experimental!" comment as the code has been stable for several releases
  • ./hack/update-codegen.sh
  • Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/garbagecollector
  • kmsv2: run TestKMSv2ProviderKeyIDStaleness in parallel
  • add fake-registry-server command to agnhost
  • fix lint errors
  • Add ratcheting of selectableFields
  • deflake unit test: TestIsConnectionReset
  • kubeadm: fix the KUBEADM_UPGRADE_DRYRUN_DIR environment variable not working for the upgrade phase when it writes kubelet config files to disk
  • replace fmt.Printf with fmt.Fprintf
  • Fix version bump to follow semantic versioning
  • Fix tests not using proper resource paths
  • scheduler_perf: KUBE_CACHE_MUTATION_DETECTOR=false in docs
  • Disable too short scheduler_perf workloads
  • deflake e2e tests: set cpu requests to avoid out of cpu
  • standardize not found error message of kubectl scale
  • sort the device requests in the extended resource claim spec. removed the sortClaim in the unit test.
  • skip creating storages for unserved versions
  • scheduler_perf: run garbage collection before measurement
  • Drop PodIndexLabel after the feature GA-ed in 1.32
  • Replace NewIndexerInformerWatcher with NewIndexerInformerWatcherWithLogger
  • Add additional test for root level, ignore mutation lint error
  • [client-go] [cli-runtime] [133916]: handle properly config override logic when override provides ClientKey, ClientCertificate: also empty TokenFile if Token is set in ConfigFlags
  • Specify the deprecated version of apiserver_storage_objects metric
  • Replace deprecated sets.String with sets.Set for Index type
  • node_e2e: fix kubelet configuration setup
  • Fix flaking RunTestDelayedWatchDelivery
  • Fix cacher resource prefix not having a "/" at the end in tests
  • Explicitly set TerminationGracePeriodSeconds for mirror pod
  • build: automatically choose a suitable base image
  • Remove container name from container event messages
  • fix: use iifname for input interface name matches
  • DRA E2E node: fix test cleanup
  • Add support for k8s-label-value format.
  • Add support for k8s-label-key
  • Update KAL to latest and add shadow config for new options
  • Enable conditions linter for Kube API Linter
  • Add exceptions for existing issues for conditions linter
  • refactor(controller): Use context-aware WaitForNamedCacheSync in resourcequota and HPA tests
  • Rename CLE test directories
  • Update agnhost to version 2.57
  • Improve dry-run error messages for clarity
  • test/e2e/node: add [NodeConformance] label to ConfigMap update test
  • Ensure consistent key schema requirements between cacher and etcd3
  • Update cmd/kubeadm/app/cmd/upgrade/node.go
  • Wait for quota to report used before creating pvc
  • feat(validation): enhance slice validation with declarative options
  • Add fine grained metrics to narrow down DV mismatches and panics
  • test/e2e/node: promote ConfigMap update test to Conformance
  • fix gofmt
  • Add reviewers and approvers to api/testing
  • print the current kubectl command encapsulated by kuberc on V(1)
  • Unify directory protection for recursive requests in storage
  • Add helpers for declarative validation tests
  • fix incorrect warning whenever headless service is created/updated
  • Address tests grouping comment
  • feat(apiextensions-apiserver): Add WithContext variant to EstablishingController
  • Update documented metrics list
  • Update pkg/api/testing/OWNERS
  • Bump distroless-iptables to v0.7.8
  • feat(validation-gen): Add declarative validation support for ResourceClaim/(v1,v1beta1,v1beta2)
  • refactor: simplify declarative validation tests for ResourceClaim
  • fix(tests): update fake client initialization and add resource version handling in validation tests
  • chore(validation): add validation identifier for declarative validation in ResourceClaim
  • Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/
  • Bump to go1.25.1 based images
  • Update kitten base image from agnhost:2.33 to agnhost:2.57 (kubernetes/kubernetes#133818)
  • Replace HandleError with HandleErrorWithContext
  • refactor(controller): Use WithContext variants in cloud node controllers
  • Adding metrics for Maxunavailable feature in StatefulSet (kubernetes/kubernetes#130951)
  • kubeadm: graduate ControlPlaneKubeletLocalMode to GA
  • Test requests send to etcd for all LIST requests
  • Extract the prepareKey function
  • Ensure keys used in storage and cacher start with resourcePrefix
  • e2e_node: test DRA plugin gRPC connection reuse
  • feat(validation-gen) enable declarative validation for resource.k8s.io DeviceClass
  • Add support for k8s-long-name-caseless format.
  • Plumb effective version into admission initializer
  • Delete temporary ProbeHostPodSecurityStandards feature gate
  • Make pod-security-admission honor emulation version
  • add go.work to dep-approvers file list
  • add go.work.sum to dep-approvers file list
  • refactor: skip re-validating for unchanged resource claim specs
  • bump go language version to 1.25
  • drop automaxprocs hacks now that go 1.25 handles this built in
  • Drop utiliptables.NewDualStack()
  • bump system-validators to v1.11.1
  • fix: duplicated 'the' in comment
  • Remove KUBECTL_OPENAPIV3_PATCH feature gate as the feature is stable
  • Add e2e test for MaxUnavailable StatefulSet RollingUpdate (kubernetes/kubernetes#133717)
  • Bump golangci-lint to 2.4.0
  • kubeadm: cleanup after ControlPlaneKubeletLocalMode
  • refactor: Use WaitForNamedCacheSyncWithContext in core components
  • refactor(cloud-provider): Use WaitForNamedCacheSyncWithContext
  • update to latest sigs.k8s.io/json
  • Don't limit the number of goroutines dispatched by the API Dispatcher
  • Wait the readiness of pods for all the containers generate logs
  • Enable SSATags linter to enforce +listType on lists in APIs
  • expand coverage for resource quota and limit ranger tests
  • DRA E2E node: fix cleanup of tests using separate registrar
  • Remove RootlessControlPlane feature gate
  • remove unused file
  • Replace deprecated WaitForServiceEndpointsNum call with WaitForEndpointCount
  • Remove unused WaitForServiceEndpointsNum function
  • Track connection using IP+port in server to fix conntrack test flakes
  • Bump image version
  • Promote regression-issue-74839 to 1.4
  • Add hpa object count metric (kubernetes/kubernetes#134140)
  • Use context.Background() directly in kubeadm polling API calls
  • [126379] [go-client] chore: use WithContext functions
  • bump gengo
  • Remove some unused bits of verify-golangci-lint.sh
  • update kube-cross image
  • Fix SELinux e2e tests waiting for "container created" event
  • Drop unnecessary gogo dependencies
  • Remove non-generated use of gogo dependencies
  • Clean up gogo dependency tracking
  • Revert "Merge pull request kubernetes/kubernetes#133213 from sanposhiho/second-trial-conor"
  • Improve BenchmarkSerializeObject benchmark
  • Add utility function to errors to allow format composition
  • Add k8s-long-name-segments format
  • Add output tests
  • Add declarative validation of ResourceClaim status pool field
  • Add declarative validation tests for ResourceClaim status
  • generate
  • [126379] [go-client] chore: use WithContext functions: do not use SleepWithContext inside Sleep, use CalculateBackoff inside CalculateBackoffWithContext
  • added unit test for /statusz endpoints
  • refactor(apiextensions-apiserver): Make NamingConditionController fully context-aware
  • refactor(apiextensions-apiserver): Make NonStructuralSchema controller context-aware
  • Add RunWithContext method for debugsocket
  • Make APIServerLeaseGC controller context-aware
  • update publishing rules for 1.33/1.34 to set go1.24.7
  • kubeadm: ensure waiting for apiserver uses a local client
  • Add nil scheme check in GetReference
  • make containerd download more robust
  • feat(validation-gen): Add k8s:customUnique tag for disabling uniqueness validation
  • chore(validation-gen): Update output_tests for k8s:customUnique
  • feat(certificates): Add k8s:customUnique tag to CertificateSigningRequestStatus
  • fix: add +enum tag to resource DeviceAllocatoionMode
  • Fix error messages in volume path handler
  • kubeadm: use JoinHostPort in WaitControlPlaneClient
  • update autoscaling leads
  • Improve tests devex for DV tests.
  • Enable listmap uniqueness & run codegen
  • test(certificates): Add ratcheting test for CSR conditions
  • test(validation-gen): Enable uniqueness validation tests for listmap
  • emit comment for uniqueness is disabled by k8s:customUnique
  • Apply feedback
  • Introduce API to codify and validate feature gate dependencies
  • refactor(DRA validation): Add granular controls to ValidateCSIDriverName for declarative validation migration
  • Remove configmaps related rules from the kube-controller-manager and kube-scheduler leader election roles
  • Add configurable tolerance e2e test.
  • Refactor: Centralize declarative validation and migration logic
  • test: Add unit tests for metricIdentifier function
  • simplify scale subresource testing and document expectations
  • refactor: Remove Validate(Update)Declaratively and improve error handling
  • Remove unused WithTakeover and WithValidationIdentifier
  • chore: Move declarative validation featuregates to staging apiserver
  • Add desired_replicas histogram metric to HPA controller
  • Deprecate caseless driver name validation and enforce lowercase warnings
  • Update NPD to v1.34.0
  • nodelifecycle: fix ComputeZoneState method comment
  • disruption: remove unused pdb parameter from getExpectedScale method
  • gce: fix etcd manifest
  • test/e2e/apimachinery/watchlist: select only wellknown secrets for table test
  • Add maxItems limits to ResourceClaim
  • Add declarative validation tests, use tweak pattern, and additional test structure changes
  • kubeadm: rework the FetchInitConfigurationFromCluster node flags
  • add +k8s:maxItems tag logic and tests
  • generate
  • fix validation_resourceclaim_test.go with MarkCoveredByDeclarative
  • fix resource claims deallocation for extended resource when pod is completed
  • add getters for event User and ImpersonatedUser on AuditContext
  • feat: Add helper function for client-go to compare resource version
  • feat: Add conformance tests for all resources for comparable resource version
  • Make legacytokentracking controller context aware
  • DRA scheduler: clean up feature gate handling
  • DRA scheduler: fix selection of "incubating" allocator implementation
  • DRA scheduler: add unit test for allocator selection
  • scheduler_perf: apply feature gates in deterministic, alphabetical order
  • DRA scheduler_perf: run with specific allocator implementations
  • Add additional types for resource version comparison testing
  • fix(cordonhelper): Avoid mutating local node before API call
  • DRA ResourceSlice tracker: explain test a bit better, fix -run
  • feat: Add matcher and conformance tests ensuring that RV is uint128
  • Document 0 as a special case in RV comparison
  • Fix field path for embedded fields in root types
  • Simplify tests wrt ratcheting
  • Make ErrorMatcher more strict about multi-match
  • Remove ExpectRegexpsByPath()
  • Remove ExpectInvalid()
  • Add comments
  • improve httpstream handshake error logging
  • Move metrics calculations to getList
  • Fix passing runtime.Object to HaveValidResourceVersion check
  • DRA ResourceSlice: nicer log output
  • fix: Adjust validation for pool names to ensure proper coverage in device requests
  • Fix ReplicationControl double validation
  • Prefactor: Fix some bad tests
  • Revert "Omit type names of emitted slice elements to appease gofmt"
  • Add support for validation cohorts
  • Add cohort support to +k8s:subfield
  • Add cohort support to +k8s:item
  • Add ValidateObjectMetaWithOpts() to apimachinery
  • Validate ReplicationController.metadata.name
  • Eliminate public ValidateReplicationControllerName
  • Update CSR DV test to match RC style
  • run update-codegen for ReplicationController
  • fix: Update error origin in ValidateDNS1123Label to use k8s-short-name format
  • fix: Comment out ipSloppyValidator
  • fix typo in comment for namespace validation to appease verify-spelling
  • check if master image project is set

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR is related to:

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


aditigupta96 and others added 30 commits September 16, 2025 14:51
refactor(DRA validation): Add granular controls for declarative validation migration
fix: use iifname for input interface name matches
…aks-in-kubelet-podcertsmanager

fix race condition in kubelet's PodCertsManager
fix incorrect warning whenever headless service is created/updated
Add reviewers and approvers to api/testing
…edcachesync-with-context

Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/
…rnamedcachesync

refactor(controller): Use context-aware WaitForNamedCacheSync in resourcequota and HPA tests
Ensure consistent key schema requirements between cacher and etcd3
Fix the spelling error of grpc in the log
…s#133818)

* Update kitten base image from agnhost:2.33 to agnhost:2.56

- Update BASEIMAGE from agnhost:2.33 to agnhost:2.56 for all platforms
- Bump VERSION from 1.7 to 1.8
- Addresses issue kubernetes#131874 for updating outdated base images

agnhost:2.56 is the latest version providing updated functionality
and security fixes for the kitten test image.

* Update kitten base image from agnhost:2.56 to agnhost:2.57

- Update BASEIMAGE to use latest agnhost:2.57 across all platforms
This change refactors the cloud-specific versions of the node lifecycle
and node IPAM controllers to use a context.Context for cancellation and
contextual logging, replacing the legacy stopCh pattern.

This is a follow-up to PR kubernetes#133985, where these controllers were
separated out due to their use in the legacy Cloud Controller Manager
(CCM).

It is a known issue that the CCM's startup logic does not pass the
controller name via the context. This change proceeds with the
refactoring to unify the cancellation logic across controllers, while
acknowledging that contextual logs will be less detailed when these
controllers are run in the CCM.
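
As an illustration of the stopCh-to-context refactoring described in this commit message, here is a minimal sketch using only the Go standard library. It is not the code from the change: the `controller` type, the `withLogger`/`loggerFrom` helpers, and the "node-ipam" name are hypothetical, and `log/slog` stands in for klog's contextual logging (where `klog.FromContext` plays the role of `loggerFrom`). When no logger has been attached to the context, the fallback logger carries no controller name, which mirrors the less detailed logs expected under the CCM.

```go
package main

import (
	"context"
	"fmt"
	"log/slog"
	"sync/atomic"
	"time"
)

// loggerKey is a private context key for the hypothetical logger helpers below.
type loggerKey struct{}

// withLogger attaches a logger to ctx (hypothetical helper, not a klog API).
func withLogger(ctx context.Context, l *slog.Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, l)
}

// loggerFrom returns the logger stored in ctx, or the default logger when the
// caller did not provide one (so the logs lack the controller name).
func loggerFrom(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.Default()
}

// controller is a hypothetical stand-in for a periodically syncing controller.
type controller struct {
	period time.Duration
	syncs  atomic.Int64
}

// Run loops until ctx is cancelled. With the legacy pattern, the select would
// wait on a stopCh (<-chan struct{}) instead of ctx.Done().
func (c *controller) Run(ctx context.Context) {
	logger := loggerFrom(ctx)
	logger.Info("starting controller")
	defer logger.Info("stopping controller")

	ticker := time.NewTicker(c.period)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			c.syncs.Add(1)
		}
	}
}

// demo runs the controller briefly, cancels it, and returns the sync count.
func demo() int64 {
	c := &controller{period: time.Millisecond}
	ctx, cancel := context.WithCancel(context.Background())
	ctx = withLogger(ctx, slog.Default().With("controller", "node-ipam"))

	done := make(chan struct{})
	go func() {
		defer close(done)
		c.Run(ctx)
	}()
	time.Sleep(50 * time.Millisecond)
	cancel()
	<-done // Run has returned, so cancellation propagated.
	return c.syncs.Load()
}

func main() {
	fmt.Println("syncs before cancellation:", demo())
}
```

Cancellation and logging travel through the same `ctx` argument, so a caller that forgets to attach a named logger still gets correct shutdown behavior, only with less informative log lines.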

Signed-off-by: Aditi Gupta <[email protected]>
…130951)

* adding maxunavailable_violation metric

added metric to list of stable metrics

changed when metric gets incremented

addressed comments

fixed stable metrics list

* Update pkg/controller/statefulset/metrics/metrics.go

Co-authored-by: Filip Křepinský <[email protected]>

* Update the metric and log verbosity level

* Address false positives metric count

Signed-off-by: Heba Elayoty <[email protected]>

* Implement maxUnavailable and UnavailableReplicas metrics

Signed-off-by: Heba Elayoty <[email protected]>

* fix lint fmt

Signed-off-by: Heba Elayoty <[email protected]>

* update tests

Signed-off-by: Heba Elayoty <[email protected]>

* set metrics to 1 as a default

* log for true validation only and update func sig.

* Move maxUnavailable metric to the updateStatefulSetStatus

Signed-off-by: Heba Elayoty <[email protected]>

* change metrics stability level to Alpha

Signed-off-by: Heba Elayoty <[email protected]>

* fix unit test

Signed-off-by: Heba Elayoty <[email protected]>

* fix linting issue

Signed-off-by: Heba Elayoty <[email protected]>

* Address code review feedback

Signed-off-by: Heba Elayoty <[email protected]>

---------

Signed-off-by: Heba Elayoty <[email protected]>
Co-authored-by: Filip Křepinský <[email protected]>
Co-authored-by: Heba Elayoty <[email protected]>
Added e2e_node test to verify that the Kubelet establishes only
a single gRPC connection with the DRA plugin for all service calls
during the plugin lifecycle.
The test uses a custom listener to count accepted connections and
asserts that only one connection is used for NodePrepareResources,
NodeUnprepareResources, and NodeWatchResources calls.
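
The counting-listener idea can be sketched as follows. This is an assumption-laden illustration, not the e2e_node test itself: the `countingListener` type and `demo` function are hypothetical, and a plain TCP connection on loopback stands in for the gRPC connection between the kubelet and the DRA plugin. The three strings written by the client are only labels borrowed from the call names above.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync/atomic"
)

// countingListener wraps a net.Listener and counts successfully accepted
// connections (hypothetical type for illustration).
type countingListener struct {
	net.Listener
	accepted atomic.Int64
}

// Accept delegates to the wrapped listener and increments the counter on success.
func (l *countingListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err == nil {
		l.accepted.Add(1)
	}
	return conn, err
}

// demo sends three "calls" over a single client connection and reports how
// many connections the server side accepted.
func demo() (int64, error) {
	inner, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	l := &countingListener{Listener: inner}
	defer l.Close()

	served := make(chan struct{})
	go func() {
		defer close(served)
		conn, err := l.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		io.Copy(io.Discard, conn) // drain until the client closes
	}()

	conn, err := net.Dial("tcp", inner.Addr().String())
	if err != nil {
		return 0, err
	}
	for _, call := range []string{"NodePrepareResources", "NodeUnprepareResources", "NodeWatchResources"} {
		if _, err := fmt.Fprintln(conn, call); err != nil {
			conn.Close()
			return 0, err
		}
	}
	conn.Close()
	<-served
	return l.accepted.Load(), nil
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("accepted connections:", n)
}
```

Because the wrapper embeds `net.Listener`, it can be handed to any server that accepts a listener, and the assertion reduces to comparing the counter with 1 after all calls have completed.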
…prefix4

Ensure keys used in storage and cacher start with resourcePrefix
…test-connection-reuse

e2e_node: test DRA plugin gRPC connection reuse
…based-images

[go] Bump dependencies, images and versions used to Go 1.25.1 and distroless iptables
k8s-ci-robot and others added 24 commits October 1, 2025 10:12
Fix passing runtime.Object to HaveValidResourceVersion check
…to-get-list

Move metrics calculations to getList
This reverts commit 243f47f.

Not needed anymore, now that gengo calls 'gofmt -s'
On a given field or type, there may be multiple "cohorts" of validation
which need to be processed together.  For example, a short-circuit
validation and a non-short-circuit for the same subfield.

The unnamed cohort (aka the default cohort) is first, followed by named
cohorts in order they are created.  Named cohorts are emitted as inline
functions, so early-return can be used.

This allows tags like k8s:subfield or k8s:item to generate cohorts
(named after the field's JSON name or the selector string,
respectively).

Subsequent commits will add support to those tags.

Co-Authored-by: Yongrui Lin <[email protected]>
Given:

```
    // +k8s:subfield(name)=+k8s:optional
    // +k8s:subfield(name)=+k8s:format=dns-label
    // +k8s:subfield(generateName)=+k8s:optional
    // +k8s:subfield(generateName)=+k8s:format=dns-label
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
```

...we get:

```
  errs = append(errs,
    func(fldPath *field.Path, obj, oldObj *metav1.ObjectMeta) (errs field.ErrorList) {
      func() { // cohort name
        if e := validate.Subfield(ctx, op, fldPath, obj, oldObj, "name", func(o *metav1.ObjectMeta) *string { return &o.Name }, validate.OptionalValue); len(e) != 0 {
          return // do not proceed
        }
        errs = append(errs, validate.Subfield(ctx, op, fldPath, obj, oldObj, "name", func(o *metav1.ObjectMeta) *string { return &o.Name }, validate.DNSLabel)...)
      }()
      func() { // cohort generateName
        if e := validate.Subfield(ctx, op, fldPath, obj, oldObj, "generateName", func(o *metav1.ObjectMeta) *string { return &o.GenerateName }, validate.OptionalValue); len(e) != 0 {
          return // do not proceed
        }
        errs = append(errs, validate.Subfield(ctx, op, fldPath, obj, oldObj, "generateName", func(o *metav1.ObjectMeta) *string { return &o.GenerateName }, validate.DNSLabel)...)
      }()
      return
    }(fldPath.Child("metadata"), &obj.ObjectMeta, safe.Field(oldObj, func(oldObj *corev1.ReplicationController) *metav1.ObjectMeta { return &oldObj.ObjectMeta }))...)
```

It is worth noting that this only works one level deep:

```
    // +k8s:subfield(foo)=+k8s:subfield(bar)=+k8s:optional
    // +k8s:subfield(foo)=+k8s:subfield(bar)=+k8s:format=dns-label
    Field StructType
```

...will generate one cohort ("foo") with two distinct calls to Subfield,
so the optional will not short-circuit format.  If we need deeper
cohorting we need to either make subfield take multiple field names
(e.g. `subfield(foo.bar)` or `subfield(foo, bar)`) or we need to
accumulate "structure" in addition to validator calls.  This is similar
to how we handle multiple `eachVal` tags - each one iterates the list.
Given:

```
   // +k8s:item(stringKey: "target", intKey: 42, boolKey: true)=+k8s:validateFalse="item Items[stringKey=target,intKey=42,boolKey=true] 1"
   // +k8s:item(stringKey: "target", intKey: 42, boolKey: true)=+k8s:validateFalse="item Items[stringKey=target,intKey=42,boolKey=true] 2"
   Items []Item `json:"items"`
```

...we get:

```
   func() { // cohort {"stringKey": "target", "intKey": 42, "boolKey": true}
       errs = append(errs, validate.SliceItem(ctx, op, fldPath, obj, oldObj, func(item *Item) bool { return item.StringKey == "target" && item.IntKey == 42 && item.BoolKey == true }, validate.DirectEqual, func(ctx context.Context, op operation.Operation, fldPath *field.Path, obj, oldObj *Item) field.ErrorList {
           return validate.FixedResult(ctx, op, fldPath, obj, oldObj, false, "item Items[stringKey=target,intKey=42,boolKey=true] 1")
       })...)
       errs = append(errs, validate.SliceItem(ctx, op, fldPath, obj, oldObj, func(item *Item) bool { return item.StringKey == "target" && item.IntKey == 42 && item.BoolKey == true }, validate.DirectEqual, func(ctx context.Context, op operation.Operation, fldPath *field.Path, obj, oldObj *Item) field.ErrorList {
           return validate.FixedResult(ctx, op, fldPath, obj, oldObj, false, "item Items[stringKey=target,intKey=42,boolKey=true] 2")
       })...)
   }()
```
This new function does 1 main thing (changes how name validation works),
and the rest is future-proofing.  Specifically this changes the
signature of the name validation function.

Before: Name validation functions return `[]string` and apimachinery
logic wraps those in `field.Invalid()`.

Problem: This makes it hard to incrementally add declarative validation,
which needs access to potential name and generateName errors to set
origins properly.

After: New name validation functions return `field.ErrorList`, which can
be wrapped in trivial functions to do things like "mark as covered by
declarative" in just one place rather than for all types.

The "WithOpts" part of this gives us a new function name which isn't
horrible and is plausibly useful in the future.
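
To make the before/after contrast concrete, here is a small self-contained sketch. It does not use the real apimachinery types: `fieldError`, `errorList`, `nameFn`, and `markCovered` are simplified, hypothetical stand-ins for `field.Error`, `field.ErrorList`, the name validation function type, and a "mark as covered by declarative" wrapper, and the two name rules are placeholders rather than the actual checks.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldError is a simplified stand-in for an apimachinery field error.
type fieldError struct {
	Path    string
	Value   string
	Detail  string
	Covered bool // hypothetical marker: covered by declarative validation
}

type errorList []fieldError

// validateNameOld follows the old shape: it returns bare messages and knows
// nothing about field paths or error metadata. The rules are placeholders.
func validateNameOld(name string) []string {
	var msgs []string
	if len(name) > 63 {
		msgs = append(msgs, "must be no more than 63 characters")
	}
	if name != strings.ToLower(name) {
		msgs = append(msgs, "must be lower case")
	}
	return msgs
}

// validateMetaOld shows the old flow: the caller wraps each message into an
// error, so per-type code never gets a chance to annotate the errors.
func validateMetaOld(name, path string) errorList {
	var errs errorList
	for _, msg := range validateNameOld(name) {
		errs = append(errs, fieldError{Path: path, Value: name, Detail: msg})
	}
	return errs
}

// nameFn is the new shape: the function returns a full error list.
type nameFn func(path, name string) errorList

// validateNameNew adapts the same rules to the new shape.
func validateNameNew(path, name string) errorList {
	return validateMetaOld(name, path)
}

// markCovered wraps any nameFn and annotates every error it returns, in one
// place rather than once per type.
func markCovered(fn nameFn) nameFn {
	return func(path, name string) errorList {
		errs := fn(path, name)
		for i := range errs {
			errs[i].Covered = true
		}
		return errs
	}
}

func main() {
	for _, e := range markCovered(validateNameNew)("metadata.name", "Bad") {
		fmt.Printf("%s: %q %s (covered=%v)\n", e.Path, e.Value, e.Detail, e.Covered)
	}
}
```

The point of the new shape is visible in `markCovered`: because the validation function returns structured errors, decoration composes as an ordinary function wrapper.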

Another notable change is that this no longer directly validates the
`generateName` field (but only on the new path -- existing code still
behaves the same).  From the code:

```
+   // generateName is not directly validated here. Types can have
+   // different rules for name generation, and the nameFn is for validating
+   // the post-generation data, not the input. In the past we assumed that
+   // name generation was always "append 5 random characters", but that's not
+   // NECESSARILY true. Also, the nameFn should always be considering the max
+   // length of the name, and it doesn't know enough about the name generation
+   // to do that. Also, given a bad generateName, the user will get errors
+   // for both the generateName and name fields. We will focus validation on
+   // the name field, which should give a better UX overall.
```

This also beefs up tests a bit.
This relies on `+k8s:subfield` and validation cohorts.  The
`k8s:optional` ensures that we don't run the name validation if name is
empty, because core apimachinery will already flag it as Required().

This demonstrates some of the DV value - docs and clients are now (in
theory) able to see what RC's name format is.

Co-Authored-by: Yongrui Lin <[email protected]>

Everyone who referenced it now uses the underlying function.  This is
clearer and frees me up to change objectMeta validation without impacting
anyone else.

Once we go broad, people will copy these.  Let's make them easy to debug
:)
…ntextual-logs

Add RunWithContext method for debugsocket
It is currently not supported.
…_maxitems

Add declarative validation +k8s:maxItems tag to ResourceClaim
feat(validation-gen): Add "cohorts" & Tighten and simplify test framework
[InPlacePodVerticalScaling] Expand coverage for resourceQuota and limitRanger e2e tests
upodroid force-pushed the cos-patch branch 3 times, most recently from 9e82632 to 3b1f7b3 on October 2, 2025 13:48
upodroid pushed a commit that referenced this pull request Oct 11, 2025
Instead of creating a new test case, the permutation is passed down. This
enables adding the event numbers to the log output, which is useful to
understand better which output belongs to which input:

    === RUN   TestListPatchedResourceSlices/update-patch/2_3_0_1
    tracker.go:396: I0929 14:28:40.032318] event #1: ResourceSlice add slice="s1"
    tracker.go:581: I0929 14:28:40.032404] event #1: syncing ResourceSlice resourceslice="s1"
    tracker.go:659: I0929 14:28:40.032446] event #1: ResourceSlice synced resourceslice="s1" change="add"
    tracker.go:396: I0929 14:28:40.032502] event #2: ResourceSlice add slice="s2"
    tracker.go:581: I0929 14:28:40.032536] event #2: syncing ResourceSlice resourceslice="s2"
    tracker.go:659: I0929 14:28:40.032568] event #2: ResourceSlice synced resourceslice="s2" change="add"
    tracker.go:463: I0929 14:28:40.032609] event #0/#0: DeviceTaintRule add patch="rule"
    tracker.go:581: I0929 14:28:40.032639] event #0/#0: syncing ResourceSlice resourceslice="s1"
    tracker.go:703: I0929 14:28:40.032675] event #0/#0: processing DeviceTaintRule resourceslice="s1" deviceTaintRule="rule"
    tracker.go:807: I0929 14:28:40.032712] event #0/#0: applying matching DeviceTaintRule resourceslice="s1" deviceTaintRule="rule" device="driver1.example.com/pool-1/device-1"
    tracker.go:868: I0929 14:28:40.032780] event #0/#0: Assigned new taint ID, no matching taint resourceslice="s1" deviceTaintRule="rule" device="driver1.example.com/pool-1/device-1" taintID=0 taint="example.com/taint=tainted:NoExecute"
    tracker.go:654: I0929 14:28:40.033023] event #0/#0: ResourceSlice synced resourceslice="s1" change="update" diff=<
        	@@ -23,7 +23,32 @@
        	     "BindingConditions": null,
        	     "BindingFailureConditions": null,
        	     "AllowMultipleAllocations": null,
        	-    "Taints": null
        	+    "Taints": [
        	+     {
        	+      "Rule": {
        	+       "metadata": {
        	+        "name": "rule"
        	+       },
        	+       "spec": {
        	+        "deviceSelector": {
        	+         "pool": "pool-1"
        	+        },
        	+        "taint": {
        	+         "key": "example.com/taint",
        	+         "value": "tainted",
        	+         "effect": "NoExecute",
        	+         "timeAdded": "2006-01-02T15:04:05Z"
        	+        }
        	+       },
        	+       "status": {}
        	+      },
        	+      "ID": 1,
        	+      "key": "example.com/taint",
        	+      "value": "tainted",
        	+      "effect": "NoExecute",
        	+      "timeAdded": "2006-01-02T15:04:05Z"
        	+     }
        	+    ]
        	    }
        	   ],
        	   "Taints": null,
         >
    tracker.go:482: I0929 14:28:40.033224] event #0/#1: DeviceTaintRule update patch="rule" diff=<
        	@@ -4,7 +4,7 @@
        	  },
        	  "spec": {
        	   "deviceSelector": {
        	-   "pool": "pool-1"
        	+   "pool": "pool-2"
        	   },
        	   "taint": {
        	    "key": "example.com/taint",
         >
    tracker.go:581: I0929 14:28:40.033285] event #0/#1: syncing ResourceSlice resourceslice="s1"
    tracker.go:703: I0929 14:28:40.033319] event #0/#1: processing DeviceTaintRule resourceslice="s1" deviceTaintRule="rule"
    tracker.go:654: I0929 14:28:40.033478] event #0/#1: ResourceSlice synced resourceslice="s1" change="update" diff=<
        	@@ -23,32 +23,7 @@
        	     "BindingConditions": null,
        	     "BindingFailureConditions": null,
        	     "AllowMultipleAllocations": null,
        	-    "Taints": [
        	-     {
        	-      "Rule": {
        	-       "metadata": {
        	-        "name": "rule"
        	-       },
        	-       "spec": {
        	-        "deviceSelector": {
        	-         "pool": "pool-1"
        	-        },
        	-        "taint": {
        	-         "key": "example.com/taint",
        	-         "value": "tainted",
        	-         "effect": "NoExecute",
        	-         "timeAdded": "2006-01-02T15:04:05Z"
        	-        }
        	-       },
        	-       "status": {}
        	-      },
        	-      "ID": 1,
        	-      "key": "example.com/taint",
        	-      "value": "tainted",
        	-      "effect": "NoExecute",
        	-      "timeAdded": "2006-01-02T15:04:05Z"
        	-     }
        	-    ]
        	+    "Taints": null
        	    }
        	   ],
        	   "Taints": null,
         >
    tracker.go:581: I0929 14:28:40.033601] event #0/#1: syncing ResourceSlice resourceslice="s2"
    tracker.go:703: I0929 14:28:40.033633] event #0/#1: processing DeviceTaintRule resourceslice="s2" deviceTaintRule="rule"
    ...

Disabling event checking only worked when actually running all sub-tests. When
selectively running only one permutation with -run, the boolean variable was
wrong:

    $ go test -run='.*/^update-patch$' ./staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/
    ok  	k8s.io/dynamic-resource-allocation/resourceslice/tracker

    $ go test -run='.*/^update-patch$/3_2_0_1' ./staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/
    --- FAIL: TestListPatchedResourceSlices (0.01s)
        --- FAIL: TestListPatchedResourceSlices/update-patch (0.00s)
            --- FAIL: TestListPatchedResourceSlices/update-patch/3_2_0_1 (0.00s)

                tracker_test.go:762:
                     	Error Trace:	/nvme/gopath/src/k8s.io/kubernetes/staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/tracker_test.go:762
                     	            				/nvme/gopath/src/k8s.io/kubernetes/staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/tracker_test.go:856
                    	Error:      	Not equal:
                    	            	expected: []tracker.handlerEvent{tracker.handlerEvent{event:"add", oldObj:(*api.ResourceSlice)(nil), newObj:(*api.ResourceSlice)(0xc000301d40)}, tracker.handlerEvent{event:"add", oldObj:(*api.ResourceSlice)(nil), newObj:(*api.ResourceSlice)(0xc000346000)}}
                    	            	actual  : []tracker.handlerEvent{tracker.handlerEvent{event:"add", oldObj:(*api.ResourceSlice)(nil), newObj:(*api.ResourceSlice)(0xc0001f9ba0)}, tracker.handlerEvent{event:"add", oldObj:(*api.ResourceSlice)(nil), newObj:(*api.ResourceSlice)(0xc000301d40)}, tracker.handlerEvent{event:"update", oldObj:(*api.ResourceSlice)(0xc000301d40), newObj:(*api.ResourceSlice)(0xc0003dba00)}, tracker.handlerEvent{event:"update", oldObj:(*api.ResourceSlice)(0xc0003dba00), newObj:(*api.ResourceSlice)(0xc000301d40)}, tracker.handlerEvent{event:"update", oldObj:(*api.ResourceSlice)(0xc0001f9ba0), newObj:(*api.ResourceSlice)(0xc0003dbba0)}}

Now permutations are detected automatically based on the indices.
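
One way such a check could look, as a rough sketch only: the helper name
`isPermuted` is hypothetical, and it assumes that the original event order
is 0, 1, 2, ... and that the permutation is passed down as a slice of
indices, as in the sub-test names above (e.g. `3_2_0_1`).

```go
package main

import "fmt"

// isPermuted reports whether the events run in any order other than the
// original one (0, 1, 2, ...). Hypothetical helper, not the test's real code.
func isPermuted(order []int) bool {
	for pos, idx := range order {
		if pos != idx {
			return true
		}
	}
	return false
}

func main() {
	for _, order := range [][]int{{0, 1, 2, 3}, {3, 2, 0, 1}} {
		fmt.Printf("order=%v permuted=%v\n", order, isPermuted(order))
	}
}
```

Because the result depends only on the indices of the sub-test that is
running, it does not matter whether other sub-tests ran before it.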

While at it, documentation gets moved around a bit to make reading test cases
easier without going to the implementation.