Skip to content

Remove all SP Implementation Info #192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

akutz
Copy link
Contributor

@akutz akutz commented Feb 15, 2018

This patch removes all of the SP implementation information previously detailed in the specification. With respect to issue #171 and comments made by @saad-ali, @childsb, @jdef, and others, it has become apparent that the specification should not supply any suggestions or restrictions on matters related to an SP's packaging, deployment, or runtime configuration.

Removed information includes, but is not limited to the following:

Removed Description
CSI_ENDPOINT SPs are now responsible for defining the manner by which they grok the gRPC endpoint on which they serve the CSI services. CO admins are now responsible for deploying and configuring SPs using that SP's documentation.
Environment variable names The spec no longer claims ownership of the CSI_ prefix for environment variables.
Permissions The spec no longer asserts any recommendations with respect to container permissions that an SP might require.
Logging All logging details are the responsibility of the SP and may vary SP to SP.

As a reminder, the above list is not exhaustive or complete. Please see the patch for the full picture.

Guide to Developing CSI Storage Plug-ins

The removed information may very well be suitable to live in a separate document that provides guidelines for SP creation and configuration, such as common environment variable names, logging and packaging details, etc.

cc @codenrhoden

This patch removes all of the SP implementation information previously
detailed in the specification. With respect to issue container-storage-interface#171 and comments
made by @saad-ali, @childsb, @jdef, and others, it has become apparent
that the specification should not supply any suggestions or restrictions
on matters related to an SP's packaging, deployment, or runtime
configuration.

Removed information includes, but not limited to:

* `CSI_ENDPOINT` - SPs are now responsible for defining the
  manner by which they grok the gRPC endpoint on which they serve
  the CSI services. CO admins are now responsible for deploying
  and configuring SPs using that SP's documentation.

* Env var names - The spec no longer claims ownership of the `CSI_`
  prefix for environment variables.

* Permissions - The spec no longer asserts any recommendations with
  respect to container permissions that an SP might require.

* Logging - All logging details are the responsibility of the SP and
  may vary SP to SP.

The removed information may very well be suitable to live in a separate
document that provides guidelines for SP creation and configuration,
such as common environment variable names, logging and packaging details,
etc.

cc @codenrhoden
@saad-ali
Copy link
Member

I really wish we could make this change. Decoupling the spec completely from packaging and deployment would make it cleaner and more flexible. But as mentioned in #171 (comment), Mesos has a dependency on CSI_ENDPOINT, so we can't completely do away with all of it. Maybe we can whittle it down some.

@akutz
Copy link
Contributor Author

akutz commented Feb 16, 2018

Hi @saad-ali,

If the API specification includes SP implementation details for even one CO then it's clear that the API specification should define certain implementation details. For example, CSI_ENDPOINT is going to be the way all SPs and COs communicate the endpoint. It's not a de facto standard, it's de jure. If this is the case then the specification should absolutely define other configuration properties almost guaranteed to be common across all SPs, such as CSI_MODE.

When the specification defines a single SP environment variable the specification now owns a prefix that is for what? For nothing else? If the spec defines anything to do with implementation details then it should define the rest of the common ones too, such as CSI_MODE. Because just as Mesos deploys the SPs with CSI_ENDPOINT, some SPs will also need the equivalent of CSI_MODE, so why not make it, and possibly others, a standard?

@saad-ali
Copy link
Member

If we're going to bend the API specification to include implementation details for a single CO then it's clear that the API specification includes implementation details. For example, CSI_ENDPOINT is going to be the way all SPs and COs communicate the endpoint. It's not a de facto standard, it's de jure. If this is the case then the specification should absolutely define other configuration properties almost guaranteed to be common across all SPs, such as CSI_MODE.

Defining endpoint as an env var is an exception not the rule.

When the specification defines a single SP environment variable the specification now owns a prefix that is for what? For nothing else? If the spec defines anything to do with implementation details then it should define the rest of the common ones too, such as CSI_MODE. Because just as Mesos deploys the SPs with CSI_ENDPOINT, some SPs will also need the equivalent of CSI_MODE, so why not make it a standard?

AFAIK endpoint is critical for CSI to work with Mesos. But a single exception should not be justification to open the flood gates and abandon the principle altogether. Separating packaging and deployment from the interface is important and we should strive to do so as we continue to expand and revise the spec.

@akutz
Copy link
Contributor Author

akutz commented Feb 16, 2018

Hi @saad-ali,

If CSI_ENDPOINT is an exception then at least remove the language from the spec that declares CSI_ off-limits. Otherwise a very useful and obvious namespace is completely unused except for a single environment variable.

However, I would also argue that if a single CO already defines deployment/configuration details for an SP then it's not unlikely others will as well. My point is this -- how come Mesos or any CO requires CSI_ENDPOINT and not knowledge of the other properties required to configure an SP that CO is deploying? When one or more of those properties are identified as common, why not make it part of the spec?

@saad-ali
Copy link
Member

If CSI_ENDPOINT is an exception then at least remove the language from the spec that declares CSI_ off-limits. Otherwise a very useful and obvious namespace is completely unused except for a single environment variable.

That seems reasonable to me. @jieyu?

@jdef
Copy link
Member

jdef commented Feb 16, 2018 via email

@akutz
Copy link
Contributor Author

akutz commented Feb 16, 2018

This is having your cake and eating it too. If CSI_ENDPOINT is an exception then the spec should not declare the namespace reserved unless the intent is to provide additional de jure environment variable names. And if that's the case, then provide CSI_MODE for starters.

@jdef
Copy link
Member

jdef commented Feb 22, 2018

We'd like to keep the envvar surface area small. CSI_ENDPOINT is the only envvar for which the spec currently defines meaning. If we open up the CSI_.* envvar namespace in general, and then later need to add CSI_FOO at some point to address a critical use case, how can we reliably do that w/o possibly stomping on 3rd parties?

Just because I think the spec should continue to reserve the CSI_ envvar namespace doesn't imply that the spec needs to define any additional envvars other than what's already in the spec. The bar for introducing any additional environment requirements should be very high .. and CSI_MODE doesn't clear that bar for me.

* Only UNIX Domain Sockets may be used as endpoints.
This will likely change in a future version of this specification to support non-UNIX platforms.
* All supported RPC services MUST be available at the listen address of the Plugin.
* All Plugins MUST support UNIX Domain Sockets as a gRPC endpoint.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this statement.

##### Logging

* Plugins SHOULD generate log messages to ONLY standard output and/or standard error.
* In this case the Plugin Supervisor SHALL assume responsibility for all log lifecycle management.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of removing this block, we could change SHALL to SHOULD


##### Linux Capabilities

* Plugin Supervisor SHALL guarantee that plugins will have `CAP_SYS_ADMIN` capability on Linux when running on Nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need this so that plugins can execute mount operations. i don't think it's wise to remove from the spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FUSE based plugins don't require CAP_SYS_ADMIN. NFS plugins probably want CAP_NET_BIND_SERVICE. We have a plugin in CloudFoundry that needs CAP_SETUID. So to me, this seems like it is only true part of the time, and not really complete. I'd be ok with removing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree it's ok to remove, this is a packaging concern and not related to how the CO and the SP can interact.

@jdef jdef added the lifecycle/stale PR has been inactive for 90+ days. label Oct 4, 2018
@saad-ali
Copy link
Member

saad-ali commented Nov 7, 2018

@saad-ali saad-ali added lifecycle/rotten Stale PR has been inactive for 30+ days. and removed lifecycle/stale PR has been inactive for 90+ days. labels Nov 7, 2018
@saad-ali
Copy link
Member

saad-ali commented Mar 6, 2019

@saad-ali saad-ali closed this Mar 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/rotten Stale PR has been inactive for 30+ days.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants