@BlaineEXE BlaineEXE commented Apr 25, 2024

  • One-line PR description: begin v1alpha2 doc - initial work is fixing old errors without adding large features
  • Other comments:

Update the COSI KEP's design to v1alpha2.

This comprises many changes including:

  • removal of unused details from v1alpha1 discussions
  • controller/sidecar architecture changes modeled after the volume
    snapshot KEP
  • from user feedback, the BucketAccess secret now uses individual
    fields instead of the bucketInfo JSON blob
  • from user feedback, add Read/Write access mode to BucketAccess
  • from user feedback, allow BucketAccesses to reference multiple
    BucketClaims

More specific notes:

Notably, bucketInfo.json has been changed to individual secret fields
with COSI_<KEY>: <VALUE> format, as the JSON blob was flagged as a
problem by several v1alpha1 users.

Additionally, rework the existing APIs/spec to give driver sidecars fewer responsibilities and move more coordination responsibility into the main COSI controller. This mirrors the implementation the volume snapshotter uses and should make version mismatch issues between sidecar and controller less frequent. It also means that sidecars, and thus vendor drivers, require fewer RBAC permissions.
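
To illustrate the `COSI_<KEY>: <VALUE>` format described above, here is a minimal Go sketch of how an application might collect those fields from its environment. It assumes the Secret's keys are exposed to the container as environment variables (for example through `envFrom`), which this description does not spell out. The helper name `cosiEnv` is hypothetical and not part of COSI.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// cosiEnv returns every "KEY=VALUE" entry whose key starts with "COSI_",
// keyed by the full variable name. The environ argument has the same
// shape as the slice returned by os.Environ.
// Hypothetical helper for illustration; not part of any COSI library.
func cosiEnv(environ []string) map[string]string {
	out := make(map[string]string)
	for _, kv := range environ {
		key, value, ok := strings.Cut(kv, "=")
		if !ok || !strings.HasPrefix(key, "COSI_") {
			continue
		}
		out[key] = value
	}
	return out
}

func main() {
	vars := cosiEnv(os.Environ())
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		// Print only the names: the values may be credentials.
		fmt.Println(k)
	}
}
```

No JSON parsing is involved, which is the usability point raised by v1alpha1 users: each field is already a plain string the application can consume directly.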

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 25, 2024
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 25, 2024
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch 3 times, most recently from d22d1bd to 511d6cc Compare April 25, 2024 22:01
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 511d6cc to 5aa494b Compare April 25, 2024 23:26
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 25, 2024
@BlaineEXE BlaineEXE marked this pull request as draft April 25, 2024 23:27
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 25, 2024
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 5aa494b to 52275b7 Compare April 25, 2024 23:40
@pacoxu pacoxu mentioned this pull request Jul 25, 2024
8 tasks
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 23, 2024
@xing-yang
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 23, 2024
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from f55b3d3 to 11d78f4 Compare February 24, 2025 23:10
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from 9098c29 to 14b42fa Compare March 11, 2025 22:11

COSI is out-of-tree, so version skew strategy is N/A

## Alternatives Considered
Author

@BlaineEXE BlaineEXE Mar 11, 2025

This section contains subsections describing design details that we have discarded, including many changes between v1alpha1 and v1alpha2. I have tried to make it an effective overview and to discuss any complex points.

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch 2 times, most recently from dd4036c to 35934d7 Compare September 4, 2025 18:03
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch 2 times, most recently from 514bb8f to 6895463 Compare September 10, 2025 16:53
@BlaineEXE BlaineEXE requested a review from sp98 September 12, 2025 19:09
// BucketClaims is the list of BucketClaims this access should have permissions for.
// Multiple references to the same BucketClaim are not permitted.
// +required
BucketClaims []BucketClaimReference
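
As a rough illustration of the rule in the field comment above, the following Go sketch rejects duplicate references. The `BucketClaimReference` type here is a hypothetical stand-in with a single assumed field, since the real type's fields are not shown in this excerpt. Treating `+required` as "must be non-empty" is also an assumption.

```go
package main

import "fmt"

// BucketClaimReference is a hypothetical stand-in for the KEP type; the
// real type's fields are not shown in the excerpt, so BucketClaimName is
// an assumed field name.
type BucketClaimReference struct {
	BucketClaimName string
}

// validateBucketClaims checks two rules: the list must not be empty
// (an assumed reading of +required), and no BucketClaim may be
// referenced more than once.
func validateBucketClaims(refs []BucketClaimReference) error {
	if len(refs) == 0 {
		return fmt.Errorf("bucketClaims is required")
	}
	seen := make(map[string]struct{}, len(refs))
	for i, ref := range refs {
		if _, dup := seen[ref.BucketClaimName]; dup {
			return fmt.Errorf("bucketClaims[%d]: duplicate reference to %q", i, ref.BucketClaimName)
		}
		seen[ref.BucketClaimName] = struct{}{}
	}
	return nil
}

func main() {
	refs := []BucketClaimReference{
		{BucketClaimName: "photos"},
		{BucketClaimName: "logs"},
		{BucketClaimName: "photos"}, // duplicate, should be rejected
	}
	fmt.Println(validateBucketClaims(refs))
}
```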

How does a user get the appropriate information for a bucket if a BucketAccess refers to multiple BucketClaims? Since only one bucket access secret name can be specified, that secret would have to contain the envvars for all of these BucketClaims. However, the secret appears to be designed to store one bucket's information.

ref. original comment in k8s slack

Author

I started to prototype what a resolution might look like, and I think some deeper discussion will be needed. I created this issue to help have that discussion as an aside: kubernetes-sigs/container-object-storage-interface#143

Author

I've updated the KEP with proposed changes. Instead of referencing all Buckets/BucketClaims from one Secret, the current proposal is to use a different Secret for each referenced BucketClaim. The intent is to ensure that Secret fields are consistent (and thus portable) and as simple to use as possible for users. Putting all info in a single Secret would require very long names for the data keys, which seems to me like it is at odds with usability.

IMO, having individual Secret data keys is already somewhat at odds with usability, since the fields change quite a bit depending on the protocol and authentication type.

The typed BucketInfo output from v1alpha1 had the advantage of being structured, but several users reported that they could not translate the JSON blob into the env vars their applications needed, so I don't see a better way forward without another proposal.

Thank you for updating the KEP. Is my understanding below correct?

  • If multiple Buckets are referenced from one BucketAccess, the values of envvars for bucket info (e.g. COSI_S3_ENDPOINT) differ between the BucketAccess secrets.
  • On the other hand, the values of envvars for credential info (e.g. COSI_S3_ACCESS_KEY_ID) are the same for all BucketAccess secrets.

Author

Yes, your understanding is correct based on my latest proposal.

As a minor clarification, I would say that the values of envvars for bucket info may be different for each BucketAccess secret, but they aren't always different.

Because of how the S3 client API works and how buckets/regions typically work, I would anticipate that the COSI_S3_ENDPOINT vars would end up being the same for all Secrets for most S3 providers. COSI_S3_BUCKET_ID would be different for each Secret instead.
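
A small Go sketch of the layout discussed in this thread, under stated assumptions: one set of Secret data per referenced BucketClaim, with endpoint and credential keys repeated and only COSI_S3_BUCKET_ID differing. The function name, the claim names, and the example values are hypothetical. A shared endpoint is the expected case for most S3 providers, not a guarantee.

```go
package main

import "fmt"

// secretDataPerClaim builds one key/value map per referenced BucketClaim.
// Keys in shared (endpoint, credentials) are copied into every map; only
// COSI_S3_BUCKET_ID differs. bucketIDs maps a hypothetical BucketClaim
// name to its backend bucket ID.
func secretDataPerClaim(shared, bucketIDs map[string]string) map[string]map[string]string {
	out := make(map[string]map[string]string, len(bucketIDs))
	for claim, id := range bucketIDs {
		data := make(map[string]string, len(shared)+1)
		for k, v := range shared {
			data[k] = v
		}
		data["COSI_S3_BUCKET_ID"] = id
		out[claim] = data
	}
	return out
}

func main() {
	// Example values only; a real driver would supply these.
	shared := map[string]string{
		"COSI_S3_ENDPOINT":      "https://s3.example.com",
		"COSI_S3_ACCESS_KEY_ID": "EXAMPLEKEY",
	}
	secrets := secretDataPerClaim(shared, map[string]string{
		"photos": "bucket-1a2b",
		"logs":   "bucket-3c4d",
	})
	fmt.Println(secrets["photos"]["COSI_S3_BUCKET_ID"], secrets["logs"]["COSI_S3_BUCKET_ID"])
}
```

Because every Secret uses the same key names, a workload can switch buckets by mounting a different Secret without changing how it reads its configuration.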

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch 2 times, most recently from 6fc2575 to 7613fcd Compare October 3, 2025 19:36
@kannon92
Contributor

kannon92 commented Oct 3, 2025

I asked on #prod-readiness if this is in scope for PRR.

If so, I believe @deads2k would need to take a look.

This was approved once before but it seems to have changed quite a bit so it may be worth asking for PRR review again.

@BlaineEXE
Author

@kannon92 I'm not sure what PRR is in acronym. Please clarify.

If this is Pull Request Review, my understanding is that it has already been requested. @msau42 is the assigned reviewer from sig-storage. If there are other required reviewers, we haven't been made aware of that by our sig-storage liaison or in the sig-storage biweekly meetings, which we have been attending.

This is an open proposal, so all reviews, especially from the Kubernetes community interested in storage and object storage are welcome and appreciated. I am simply surprised to be seeing other reviewers added, and my perception is that new requirements are just now coming up after the COSI team has been following the process sig-storage has been guiding us through for years.

As a note, COSI is not an in-tree Kubernetes project. My understanding is that there should be no risk for impact on the Kube API server's operation itself because of this. I see @deads2k is active on API and control plane projects, and I suspect that the new request may be coming up due to some misunderstanding about this being a feature driven by the API server.

@kannon92
Contributor

kannon92 commented Oct 3, 2025

I'm not sure what PRR is in acronym. Please clarify.

PRR is Production Readiness Review. You filled out the questionnaire in this PR.

@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch 2 times, most recently from 3a6773b to 9fea6ae Compare October 3, 2025 22:47
@xing-yang
Contributor

/assign @msau42

Comment on lines 1179 to 1180
The COSI resource APIs have breaking changes from v1alpha1 to v1alpha2, and migrations between versions are not automatically supported
The static provisioning workflow can be used to migrate existing v1alpha1 Buckets to v1alpha2.
Contributor

Being out of tree, this is non-binding advice, but some consideration of how to migrate (or an explicit "cannot run side-by-side") may ease the change.

Author

We appreciate the recommendation. I've updated this section to be more specific and provide a more complete high-level idea of what the upgrade strategy will look like, as well as notes about what COSI should document for users.

Related COSI doc issue: kubernetes-sigs/container-object-storage-interface#156

@deads2k
Contributor

deads2k commented Oct 7, 2025

PRR isn't required for out of tree changes, but the questions are good for the sig to consider since they have direct impact on how people run and maintain the software in clusters. I'm going to mark this as "no need" in the project table

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BlaineEXE, moonlight16, shanduur, sp98
Once this PR has been reviewed and has the lgtm label, please ask for approval from msau42. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Update the COSI KEP's design to v1alpha2.

This comprises many changes including:
- removal of unused details from v1alpha1 discussions
- controller/sidecar architecture changes modeled after the volume
  snapshot KEP
- from user feedback, the BucketAccess secret now uses individual
  fields instead of the bucketInfo JSON blob
- from user feedback, add Read/Write access mode to BucketAccess
- from user feedback, allow BucketAccesses to reference multiple
  BucketClaims

Co-authored-by: Mateusz Urbanek <[email protected]>
Signed-off-by: Blaine Gardner <[email protected]>
@BlaineEXE BlaineEXE force-pushed the cosi-v1alpha2-changes branch from daf2cea to 8448df0 Compare October 10, 2025 21:56

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
