Custom Plugin: Allow users to define condition startup behavior #645

notchairmk · 2022-02-13T19:35:47Z

Problem

There is currently no way for custompluginmonitor users to define startup behavior for ComponentCondition. Both custompluginmonitor and systemlogmonitor will emit a ComponentCondition with default Status: "False" during initialization for all monitored conditions.

Assuming that no problems exist at NPD startup doesn't always make sense, for example when the NPD container is restarting on a node. In this case, depending on restart frequency, the ConditionStatus on the node ends up flip-flopping between True and False/Unknown.

Options

Option A: Allow user to specify startup behavior on the plugin

Example:

"plugin": "custom",
"source": "custom-health",
"pluginConfig": {
    "skip_initial_status": true
},
"conditions": [{
    "type": "KubeletUnhealthy",
    "reason": "KubeletIsHealthy",
    "message": "kubelet on the node is functioning properly"
}]

Pros:
- Allowing users to skip the condition during initialization still falls within status API conventions
- Reduced complexity in how condition status/message change over time (compared to Option B)
- Stops status flip-flopping at startup
Cons:
- Does not allow for condition-specific behavior within one plugin use
- Skipping the initial status emission would deviate from the behavior in systemlogmonitor, although that could be updated as well if there is a perceived benefit
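
To make Option A concrete, here is a minimal Go sketch of how a plugin-level `skip_initial_status` flag could gate the conditions emitted at startup. The type and function names (`monitorConfig`, `initialConditions`, and so on) are hypothetical and simplified for illustration; they are not NPD's actual types, and the real change may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified config types; NPD's real types differ.
type conditionRule struct {
	Type    string `json:"type"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

type pluginConfig struct {
	// Absent or false keeps today's behavior: emit Status "False" at startup.
	SkipInitialStatus bool `json:"skip_initial_status"`
}

type monitorConfig struct {
	Plugin       string          `json:"plugin"`
	Source       string          `json:"source"`
	PluginConfig pluginConfig    `json:"pluginConfig"`
	Conditions   []conditionRule `json:"conditions"`
}

// condition is a stand-in for the condition reported on the node.
type condition struct {
	Type, Status, Reason, Message string
}

func parseConfig(raw []byte) (monitorConfig, error) {
	var cfg monitorConfig
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

// initialConditions returns what would be emitted during initialization.
// With skip_initial_status set, nothing is emitted for any condition,
// which is why this option cannot vary per condition.
func initialConditions(cfg monitorConfig) []condition {
	if cfg.PluginConfig.SkipInitialStatus {
		return nil
	}
	out := make([]condition, 0, len(cfg.Conditions))
	for _, c := range cfg.Conditions {
		out = append(out, condition{
			Type:    c.Type,
			Status:  "False",
			Reason:  c.Reason,
			Message: c.Message,
		})
	}
	return out
}

func main() {
	raw := []byte(`{
		"plugin": "custom",
		"source": "custom-health",
		"pluginConfig": {"skip_initial_status": true},
		"conditions": [{
			"type": "KubeletUnhealthy",
			"reason": "KubeletIsHealthy",
			"message": "kubelet on the node is functioning properly"
		}]
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("conditions emitted at startup:", len(initialConditions(cfg)))
}
```

Because the flag lives on the plugin, a restart leaves whatever status is already on the node untouched until the first real check result arrives.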

Option B: Allow user to specify ConditionStatus in the condition configuration

Example:

"plugin": "custom",
"source": "custom-health",
"conditions": [{
    "type": "KubeletUnhealthy",
    "reason": "KubeletIsHealthy",
    "message": "kubelet on the node is functioning properly",
    "status": "Unknown"
}]

Pros:
- Better behavior when statuses flip-flop during unhealthy node events (False <-> Unknown)
- More granular customization of condition-specific startup behavior
Cons:
- Complexity in incorporating status/message changes over time (compared to Option A)
  - Specifically, in the example above the message may not make sense with an Unknown status, and there are existing issues with multiple permanent rules for a condition
- Status would still be flapping
- This would likely also need to be updated on systemlogmonitor
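
As a rough illustration of Option B, the sketch below reads an optional per-condition `status` field and falls back to "False" when it is absent. The names `conditionRule` and `initialStatus` are hypothetical, and rejecting values other than True, False, and Unknown is an assumption of this sketch, not something the proposal specifies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified rule type; NPD's real types differ.
type conditionRule struct {
	Type    string `json:"type"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
	Status  string `json:"status"` // optional in this sketch
}

// initialStatus returns the status to emit at startup for one condition.
// An empty value keeps the current default of "False".
func initialStatus(r conditionRule) (string, error) {
	switch r.Status {
	case "":
		return "False", nil
	case "True", "False", "Unknown":
		return r.Status, nil
	default:
		return "", fmt.Errorf("condition %q: invalid status %q", r.Type, r.Status)
	}
}

func main() {
	raw := []byte(`[{
		"type": "KubeletUnhealthy",
		"reason": "KubeletIsHealthy",
		"message": "kubelet on the node is functioning properly",
		"status": "Unknown"
	}]`)
	var rules []conditionRule
	if err := json.Unmarshal(raw, &rules); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range rules {
		status, err := initialStatus(r)
		if err != nil {
			fmt.Println(err)
			continue
		}
		// The configured message is still attached, which is the mismatch
		// noted in the cons: "functioning properly" next to Unknown.
		fmt.Println(r.Type, status, r.Message)
	}
}
```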

Option C: Allow user to specify startup behavior in the condition

Example implemented here:

"plugin": "custom",
"source": "custom-health",
"conditions": [{
    "type": "KubeletUnhealthy",
    "reason": "KubeletIsHealthy",
    "message": "kubelet on the node is functioning properly",
    "initialized": false
}]

Pros:
- Stops status flip-flopping from startup
- More granular customization of condition-specific startup behavior
Cons:
- Deviates from existing ComponentCondition definition


notchairmk · 2022-03-28T21:47:01Z

/kind feature

k8s-triage-robot · 2022-06-26T22:33:22Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · 2022-07-26T22:51:48Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

notchairmk · 2022-08-02T17:24:58Z

Closing, this was addressed in #646

notchairmk mentioned this issue Feb 13, 2022

Allow skipping condition during customplugin initialization #646

Merged

k8s-ci-robot added the kind/feature label Mar 28, 2022

k8s-ci-robot added the lifecycle/stale label Jun 26, 2022

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jul 26, 2022

notchairmk closed this as completed Aug 2, 2022
