@wking wking commented Sep 17, 2025

Instead of carrying similar code in both the payload and Cincinnati packages, centralize it in one place to make maintenance easier. With the logic centralized, I've also put some time into hardening the channel parsing, so it grumbles about some possible issues.

For payload loading, errors get a logged warning, but are not fatal. Having the CVO come up with logged warnings still allows cluster admins to update into a fix. But a crash-looping CVO would not notice ClusterVersion spec.desiredUpdate changes or be able to roll out a requested update into a fix.

For Cincinnati processing, errors are fatal. Cluster admins can take the ClusterVersion RetrievedUpdates=False message and complain to their Update Service maintainers, who can fix the metadata.
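
To make the two error policies concrete, here is a minimal Go sketch, not the code from this pull. The helper names (`parseChannels`, `loadFromPayload`, `loadFromUpdateService`) and the specific checks (empty names, surrounding whitespace, duplicates) are assumptions for illustration; only the metadata key is taken from the graph data shown later in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sort"
	"strings"
)

// channelsKey is the metadata key that appears in the graph data in this thread.
const channelsKey = "io.openshift.upgrades.graph.release.channels"

// parseChannels is a hypothetical shared parser. It splits a comma-delimited
// channel string and returns the usable channels (sorted) together with an
// error describing every suspicious entry it skipped. The checks here are
// assumptions, not the pull's actual validation rules.
func parseChannels(raw string) ([]string, error) {
	if raw == "" {
		return nil, nil
	}
	var errs []error
	seen := map[string]bool{}
	var channels []string
	for _, c := range strings.Split(raw, ",") {
		switch {
		case c == "":
			errs = append(errs, errors.New("empty channel name"))
		case strings.TrimSpace(c) != c:
			errs = append(errs, fmt.Errorf("channel %q has surrounding whitespace", c))
		case seen[c]:
			errs = append(errs, fmt.Errorf("duplicate channel %q", c))
		default:
			seen[c] = true
			channels = append(channels, c)
		}
	}
	sort.Strings(channels)
	// errors.Join returns nil when there is nothing to report.
	return channels, errors.Join(errs...)
}

// loadFromPayload mimics the payload policy: log a warning and keep going
// with whatever channels were usable.
func loadFromPayload(metadata map[string]string) []string {
	channels, err := parseChannels(metadata[channelsKey])
	if err != nil {
		log.Printf("warning: release metadata: %v", err)
	}
	return channels
}

// loadFromUpdateService mimics the Cincinnati policy: any parsing problem
// is returned to the caller as a failure.
func loadFromUpdateService(metadata map[string]string) ([]string, error) {
	channels, err := parseChannels(metadata[channelsKey])
	if err != nil {
		return nil, fmt.Errorf("invalid update service metadata: %w", err)
	}
	return channels, nil
}

func main() {
	bad := map[string]string{channelsKey: "channel-a,,channel-a"}
	fmt.Println(loadFromPayload(bad))
	_, err := loadFromUpdateService(bad)
	fmt.Println(err)
}
```

Sharing one parser between both callers means the only difference between the two paths is what each caller does with the returned error.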

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 17, 2025

openshift-ci-robot commented Sep 17, 2025

@wking: This pull request references OTA-1627 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.21.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


openshift-ci bot commented Sep 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 17, 2025
@wking wking force-pushed the log-release-metadata-parsing branch 3 times, most recently from 763a55e to b200ebc Compare September 17, 2025 02:02
@wking wking force-pushed the log-release-metadata-parsing branch from b200ebc to b3f754f Compare September 17, 2025 02:33

openshift-ci bot commented Sep 17, 2025

@wking: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                        Commit   Details  Required  Rerun command
ci/prow/okd-scos-e2e-aws-ovn     b3f754f  link     false     /test okd-scos-e2e-aws-ovn
ci/prow/e2e-aws-ovn-techpreview  b3f754f  link     true      /test e2e-aws-ovn-techpreview
ci/prow/e2e-hypershift           b3f754f  link     true      /test e2e-hypershift

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

wking added a commit to wking/cincinnati-graph-data that referenced this pull request Sep 19, 2025
wking added a commit to wking/cincinnati-graph-data that referenced this pull request Sep 19, 2025

wking commented Sep 19, 2025

Testing with a launch 4.21,openshift/cluster-version-operator#1231 aws Cluster Bot run (logs):

$ oc get -o jsonpath-as-json='{.status.desired}' clusterversion version
[
    {
        "image": "registry.build11.ci.openshift.org/ci-ln-mwhnlck/release@sha256:5c287af573279895ba93fe562345bb0b38caae30c79a79a23b7abf66e3337fc4",
        "version": "4.21.0-0-2025-09-19-155315-test-ci-ln-mwhnlck-latest"
    }
]
$ curl -s https://github.com/raw/wking/cincinnati-graph-data/refs/heads/demo/cincinnati-graph.json | jq '.nodes[0]'
{
  "payload": "registry.build11.ci.openshift.org/ci-ln-mwhnlck/release@sha256:5c287af573279895ba93fe562345bb0b38caae30c79a79a23b7abf66e3337fc4",
  "version": "4.21.0-0-2025-09-19-155315-test-ci-ln-mwhnlck-latest",
  "metadata": {
    "url": "https://example.com/current",
    "io.openshift.upgrades.graph.release.channels": "channel-a,channel-current"
  }
}
$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "whatever"}, {"op": "add", "path": "/spec/upstream", "value": "https://github.com/raw/wking/cincinnati-graph-data/refs/heads/demo/cincinnati-graph.json"}]'
$ oc adm upgrade recommend
Upstream update service: https://github.com/raw/wking/cincinnati-graph-data/refs/heads/demo/cincinnati-graph.json
Channel: whatever (available channels: channel-a, channel-current)

Updates to 4.22:
  VERSION                   ISSUES
  4.22.1-always-recommended no known issues relevant to this cluster
  4.22.0-always-recommended no known issues relevant to this cluster
$ oc get -o json clusterversion version | jq '.status.availableUpdates[]'
{
  "image": "quay.io/openshift-release-dev/ocp-release@sha256:1111111111111111111111111111111111111111111111111111111111111111",
  "version": "4.22.1-always-recommended"
}
{
  "channels": [
    "channel-0",
    "channel-a"
  ],
  "image": "quay.io/openshift-release-dev/ocp-release@sha256:0000000000000000000000000000000000000000000000000000000000000000",
  "url": "https://example.com/0",
  "version": "4.22.0-always-recommended"
}

Looks good to me. I haven't reproduced the original issue, so I'm not sure this pull fixes anything. I suspect earlier tests might have missed pullspec-digest matching in the mock update service data, since we use digest matching when merging release metadata. But even if this isn't a user-visible fix, I think it's still worth unifying the parsing for tech-debt reduction and easier future maintenance.
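
For readers unfamiliar with the digest matching mentioned above, here is a rough Go sketch of the idea, under stated assumptions. The `release` type, `digestOf`, and `mergeByDigest` are hypothetical names, and comparing only the part of the pullspec after `@` is my simplification; the operator's real merge logic may differ, for example in which fields take priority.

```go
package main

import (
	"fmt"
	"strings"
)

// release holds only the fields visible in the command output in this thread.
type release struct {
	Image    string
	Version  string
	URL      string
	Channels []string
}

// digestOf returns the part of a by-digest pullspec after "@", or "" when
// the pullspec carries no digest (for example, a by-tag reference).
func digestOf(pullspec string) string {
	_, digest, found := strings.Cut(pullspec, "@")
	if !found {
		return ""
	}
	return digest
}

// mergeByDigest fills in missing URL and channel metadata on current from
// node, but only when both pullspecs carry the same digest. A matching
// version string alone is not enough in this sketch.
func mergeByDigest(current, node release) (release, bool) {
	d := digestOf(current.Image)
	if d == "" || d != digestOf(node.Image) {
		return current, false
	}
	merged := current
	if merged.URL == "" {
		merged.URL = node.URL
	}
	if len(merged.Channels) == 0 {
		merged.Channels = append([]string(nil), node.Channels...)
	}
	return merged, true
}

func main() {
	current := release{
		Image:   "registry.build11.ci.openshift.org/ci-ln-mwhnlck/release@sha256:5c287af573279895ba93fe562345bb0b38caae30c79a79a23b7abf66e3337fc4",
		Version: "4.21.0-0-2025-09-19-155315-test-ci-ln-mwhnlck-latest",
	}
	node := release{
		Image:    current.Image,
		Version:  current.Version,
		URL:      "https://example.com/current",
		Channels: []string{"channel-a", "channel-current"},
	}
	merged, ok := mergeByDigest(current, node)
	fmt.Println(ok, merged.URL, merged.Channels)
}
```

Under this scheme, mock update service data whose node has the right version but a different digest would never be merged, which is the kind of gap I suspect in the earlier tests.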

/verified by @wking

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Sep 19, 2025
@openshift-ci-robot
Contributor

@wking: This PR has been marked as verified by @wking.


wking commented Sep 19, 2025

/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial


openshift-ci bot commented Sep 19, 2025

@wking: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/d94eb820-9581-11f0-91b5-e05e7629a7c5-0

@petr-muller
Member

/cc

@openshift-ci openshift-ci bot requested a review from petr-muller September 19, 2025 18:29
@JianLi-RH
Contributor

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Sep 22, 2025

openshift-ci-robot commented Sep 22, 2025

@wking: This pull request references OTA-1627 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.21.0" version, but no target version was set.
