@avalluri avalluri commented May 21, 2019

With the current implementation, in the delayed binding case, the CSI driver is
offered the topologies of all nodes that match the topology keys of the
'selected node' in CreateVolumeRequest.AccessibilityRequirements. This allows
the driver to pick any node from the preferred/requisite list when creating the
volume, but it results in a scheduling failure when the volume is created on a
node other than the one Kubernetes selected.

To address this, this PR introduces a new flag, "--strict-topology". When it is
set and delayed binding is in use, the driver is offered only the selected
node's topology, so it has to create the volume on that node.

This new flag can be used by drivers that support strict topology for volumes with delayed binding.

What type of PR is this?
/kind bug
/kind design

What this PR does / why we need it:
With delayed binding, creating a volume in a topology that is not accessible from the node Kubernetes selected for Pod scheduling leads to unresolvable scheduling failures, so the driver should not be allowed to create such volumes. We can avoid this by passing the exact (strict) accessibility topologies instead of the 'aggregated topology' in the CreateVolume request.
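
To illustrate the difference described above, here is a minimal Go sketch. It is not the external-provisioner code: the `Segments` type, the `requisite` function, and the node names are hypothetical, and aggregation is simplified to "every distinct topology among the nodes that carry the same topology keys as the selected node".

```go
package main

import (
	"fmt"
	"sort"
)

// Segments is one topology as key/value pairs, e.g. {"zone": "z1"}.
// Hypothetical type used only in this sketch.
type Segments map[string]string

// sameKeys reports whether a and b use exactly the same topology keys.
func sameKeys(a, b Segments) bool {
	if len(a) != len(b) {
		return false
	}
	for k := range a {
		if _, ok := b[k]; !ok {
			return false
		}
	}
	return true
}

// requisite returns the topologies that would be offered to the driver
// for a volume whose Pod was scheduled to the node named selected.
func requisite(nodes map[string]Segments, selected string, strict bool) []Segments {
	sel, ok := nodes[selected]
	if !ok {
		return nil
	}
	if strict {
		// Strict mode: only the selected node's topology is offered.
		return []Segments{sel}
	}
	// Visit nodes in name order so the result is deterministic.
	names := make([]string, 0, len(nodes))
	for name := range nodes {
		names = append(names, name)
	}
	sort.Strings(names)

	// Aggregated mode: collect each distinct topology once.
	seen := map[string]bool{}
	var out []Segments
	for _, name := range names {
		segs := nodes[name]
		if !sameKeys(segs, sel) {
			continue
		}
		id := fmt.Sprint(segs) // fmt prints map keys in sorted order
		if !seen[id] {
			seen[id] = true
			out = append(out, segs)
		}
	}
	return out
}

func main() {
	nodes := map[string]Segments{
		"node-a": {"zone": "z1"},
		"node-b": {"zone": "z2"},
		"node-c": {"zone": "z1"},
	}
	fmt.Println(requisite(nodes, "node-b", false)) // both zones are offered
	fmt.Println(requisite(nodes, "node-b", true))  // only the selected node's zone
}
```

In the aggregated case the driver may legitimately pick z1 even though the Pod was scheduled to a node in z2, which is the failure this PR targets. In the strict case z2 is the only choice.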

Which issue(s) this PR fixes:
Fixes #221

Does this PR introduce a user-facing change?:
Yes

support strict topology for volumes with delayed binding

@k8s-ci-robot

Welcome @avalluri!

It looks like this is your first PR to kubernetes-csi/external-provisioner 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-csi/external-provisioner has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. kind/design Categorizes issue or PR as related to design. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 21, 2019
@k8s-ci-robot k8s-ci-robot requested review from davidz627 and lpabon May 21, 2019 12:57
@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 21, 2019
@k8s-ci-robot

Hi @avalluri. Thanks for your PR.

I'm waiting for a kubernetes-csi or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pohly pohly left a comment

Can you add a README.md section which explains the topology support and add something for the new mode there?

@avalluri avalluri force-pushed the fix-late-binding branch from 8dea010 to 050a6e5 Compare May 22, 2019 15:33
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 22, 2019
@avalluri avalluri force-pushed the fix-late-binding branch from 050a6e5 to 5b784fb Compare May 23, 2019 08:43
@avalluri

> Can you add a README.md section which explains the topology support and add something for the new mode there?

@pohly I tried adding a section to the README that explains how AccessibilityRequirements are prepared. Can you please check whether it is good enough?

@avalluri avalluri force-pushed the fix-late-binding branch 4 times, most recently from b6b30b0 to b970817 Compare May 24, 2019 08:40
@avalluri avalluri force-pushed the fix-late-binding branch from b970817 to bfda61a Compare May 25, 2019 20:54
@davidz627

/cc @msau42 @verult

@k8s-ci-robot k8s-ci-robot requested review from msau42 and verult May 28, 2019 21:04
@pohly

pohly commented May 29, 2019

/retest

@pohly pohly left a comment

The description looks good to me now. For the code I'll defer to someone who is more familiar with it.

One more thing. Can you add a release note such as "support strict topology for volumes with delayed binding" to the PR description?

@avalluri avalluri force-pushed the fix-late-binding branch from bfda61a to 508be1a Compare May 29, 2019 12:41
@davidz627

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 29, 2019
@msau42 msau42 left a comment

This generally lgtm! Thanks for working on this! Can you also add a release note to the initial comment describing the new option?

/approve


enableLeaderElection = flag.Bool("enable-leader-election", false, "Enables leader election. If leader election is enabled, additional RBAC rules are required. Please refer to the Kubernetes CSI documentation for instructions on setting up these RBAC rules.")
leaderElectionType = flag.String("leader-election-type", "endpoints", "the type of leader election, options are 'endpoints' (default) or 'leases' (strongly recommended). The 'endpoints' option is deprecated in favor of 'leases'.")
strictTopology = flag.Bool("strict-topology", false, "Passes only selected node topology to CreateVolume Request, unlike default behavior of passing all nodes that match with topology keys of the selected node.")
Collaborator

To match the wording in the README:

"passing all nodes" => "passing aggregated cluster topologies"

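For reference, a boolean flag like the one under review can be exercised in isolation with Go's standard `flag` package. This is a minimal sketch, not the provisioner's `main`: the `parseStrictTopology` helper and the flag set name are hypothetical, and the help text is shortened.

```go
package main

import (
	"flag"
	"fmt"
)

// parseStrictTopology parses args with a private FlagSet and returns the
// value of --strict-topology. Hypothetical helper for illustration only.
func parseStrictTopology(args []string) (bool, error) {
	fs := flag.NewFlagSet("provisioner-sketch", flag.ContinueOnError)
	strict := fs.Bool("strict-topology", false,
		"Passes only selected node topology to CreateVolume Request.")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *strict, nil
}

func main() {
	for _, args := range [][]string{
		{},
		{"--strict-topology"},
		{"--strict-topology=false"},
	} {
		strict, err := parseStrictTopology(args)
		fmt.Println(args, strict, err)
	}
}
```

Because the default is false, existing deployments keep the aggregated behavior unless the flag is passed explicitly.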
if err != nil {
return nil, err
if selectedCSINode != nil && strictTopology {
// Make sure that selected node topology is in allowed topologies list
Collaborator

We could probably be more efficient and just assume Kubernetes does the right thing.

Contributor Author

I agree, but there is a test "topology from selected node is not in allowedTopologies" for this, so I added this check to satisfy the test.
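
The kind of check being discussed can be sketched as follows. This is an assumption about its shape, not the code from the diff: `Segments`, `matches`, and `checkSelected` are hypothetical names, and allowedTopologies is simplified to a flat list of key/value terms.

```go
package main

import (
	"errors"
	"fmt"
)

// Segments is one topology as key/value pairs (hypothetical type).
type Segments map[string]string

// errNotAllowed reuses the wording of the test name quoted above.
var errNotAllowed = errors.New("topology from selected node is not in allowedTopologies")

// matches reports whether every key/value pair in term is present in node.
func matches(term, node Segments) bool {
	for k, v := range term {
		got, ok := node[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

// checkSelected returns an error when allowed is non-empty and none of its
// terms matches the selected node's topology.
func checkSelected(selected Segments, allowed []Segments) error {
	if len(allowed) == 0 {
		return nil // nothing to restrict against
	}
	for _, term := range allowed {
		if matches(term, selected) {
			return nil
		}
	}
	return errNotAllowed
}

func main() {
	allowed := []Segments{{"zone": "z1"}}
	fmt.Println(checkSelected(Segments{"zone": "z1"}, allowed))
	fmt.Println(checkSelected(Segments{"zone": "z2"}, allowed))
}
```

If Kubernetes is trusted to select only nodes inside allowedTopologies, as suggested above, the loop is redundant and exists only to produce a clear error.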

requisiteTerms, err = aggregateTopologies(kubeClient, driverName, selectedCSINode)
if err != nil {
return nil, err
if selectedCSINode != nil && strictTopology {
Collaborator

Can this be changed to a switch statement with the other 2 conditions, since all 3 are mutually exclusive from each other?

Contributor Author

Those are not mutually exclusive. It is possible for both allowedTopologies and selectedNode to be set, and the resulting topology then depends on the strictTopology value.

I could move this block inside the if selectedNode != nil {..} above.
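
To make the overlap concrete, here is a small Go sketch of one possible reading of the exchange above. The precedence shown (strict selected-node topology first, then allowedTopologies, then the aggregated topology) is an assumption for illustration and is not taken from the PR diff. `Segments` and `requisiteTerms` are hypothetical names.

```go
package main

import "fmt"

// Segments is one topology as key/value pairs (hypothetical type).
type Segments map[string]string

// requisiteTerms picks the requisite topologies. The conditions can hold at
// the same time (allowed and selected may both be set), so the order of the
// checks decides the outcome.
func requisiteTerms(allowed, aggregated []Segments, selected Segments, strict bool) []Segments {
	if selected != nil && strict {
		// Delayed binding with strict topology: only the selected node.
		return []Segments{selected}
	}
	if len(allowed) > 0 {
		// The storage class restricts topologies.
		return allowed
	}
	// Fall back to the aggregated cluster topology.
	return aggregated
}

func main() {
	allowed := []Segments{{"zone": "z1"}, {"zone": "z2"}}
	aggregated := []Segments{{"zone": "z1"}, {"zone": "z2"}, {"zone": "z3"}}
	selected := Segments{"zone": "z2"}

	// Both allowed and selected are set; only strict changes the result.
	fmt.Println(len(requisiteTerms(allowed, aggregated, selected, true)))
	fmt.Println(len(requisiteTerms(allowed, aggregated, selected, false)))
}
```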

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: avalluri, msau42

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2019
@avalluri avalluri changed the title from "RFC: Introduce new flag - strict-topology" to "Introduce new flag - strict-topology" Jun 4, 2019
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 4, 2019
With the current implementation, in the delayed binding case, the CSI driver is
offered the topologies of all nodes that match the topology keys of the
'selected node' in CreateVolumeRequest.AccessibilityRequirements. This allows
the driver to pick any node from the preferred list when creating the volume,
but it results in a scheduling failure when the volume is created on a node
other than the one Kubernetes selected.

To address this, a new flag, "--strict-topology", is introduced. When it is set
and delayed binding is in use, the driver is offered only the selected node's
topology, so it has to create the volume on that node.

The tests are modified so that every test now runs both with and without
'strict topology'.
@avalluri avalluri force-pushed the fix-late-binding branch from 508be1a to 5bd554b Compare June 4, 2019 09:50
@msau42

msau42 commented Jun 4, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2019
@k8s-ci-robot k8s-ci-robot merged commit 1730a1e into kubernetes-csi:master Jun 4, 2019
avalluri added a commit to avalluri/pmem-CSI that referenced this pull request Jun 7, 2019
Our recent change
(kubernetes-csi/external-provisioner#282), which fixes the late binding case,
got merged to master. Until it appears in the next release (v1.2), we use
canary builds that include this change.
pohly pushed a commit to pohly/external-provisioner that referenced this pull request Jun 20, 2019
…strict-topology

pohly pushed a commit to pohly/pmem-CSI that referenced this pull request Jun 26, 2019
kbsonlong pushed a commit to kbsonlong/external-provisioner that referenced this pull request Dec 29, 2023
dfajmon added a commit to dfajmon/csi-external-provisioner that referenced this pull request Sep 5, 2025
5f38a9075 Merge pull request kubernetes-csi#282 from rhrmo/update-go-1.24.6
579f62421 Update go to 1.24.6
74e066a82 Merge pull request kubernetes-csi#279 from Aishwarya-Hebbar/update-csi-prow-version
6f236be7d Update CSI prow driver version to v1.17.0
0ee55894b Merge pull request kubernetes-csi#280 from xing-yang/update_go_1.24.4
9af101534 update to go 1.24.4
f5fec3e36 Merge pull request kubernetes-csi#275 from chrishenzie/emeritus
c5d285db8 Remove chrishenzie from kubernetes-csi-reviewers

git-subtree-dir: release-tools
git-subtree-split: 5f38a907597230563f5e2213aea116acdd9d86bc
darshansreenivas added a commit to darshansreenivas/external-provisioner that referenced this pull request Oct 15, 2025
74502e544 Merge pull request kubernetes-csi#278 from liangyuanpeng/migrate_k8s_testimages
533443055 Merge pull request kubernetes-csi#281 from kubernetes-csi/dependabot/github_actions/actions/checkout-5
458ce146f Bump actions/checkout from 4 to 5
5f38a9075 Merge pull request kubernetes-csi#282 from rhrmo/update-go-1.24.6
579f62421 Update go to 1.24.6
5ec1a52b8 use gcr.io/k8s-staging-test-infra instead of gcr.io/k8s-testimages
74e066a82 Merge pull request kubernetes-csi#279 from Aishwarya-Hebbar/update-csi-prow-version
6f236be7d Update CSI prow driver version to v1.17.0
0ee55894b Merge pull request kubernetes-csi#280 from xing-yang/update_go_1.24.4
9af101534 update to go 1.24.4
f5fec3e36 Merge pull request kubernetes-csi#275 from chrishenzie/emeritus
c5d285db8 Remove chrishenzie from kubernetes-csi-reviewers

git-subtree-dir: release-tools
git-subtree-split: 74502e544bc6a17820892c0d490e8f0b59462998
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Wrong AccessibilityRequirement passed in CreateVolumeRequest

5 participants