-
Notifications
You must be signed in to change notification settings - Fork 64
🐛 Set recommended leaderelection settings #1663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Set recommended leaderelection settings #1663
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Cross-ref catalogd PR: |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1663 +/- ##
==========================================
- Coverage 67.42% 67.37% -0.05%
==========================================
Files 55 55
Lines 4632 4644 +12
==========================================
+ Hits 3123 3129 +6
- Misses 1284 1290 +6
Partials 225 225
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
// https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/CONVENTIONS.md#handling-kube-apiserver-disruption | ||
LeaseDuration: ptr.To(137 * time.Second), | ||
RenewDeadline: ptr.To(107 * time.Second), | ||
RetryPeriod: ptr.To(26 * time.Second), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
I think we need to do the same for catalogd : https://github.com/operator-framework/operator-controller/blob/main/catalogd/cmd/catalogd/main.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
0618e88
to
63e2507
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am OK with those changes 👍
/lgtm
63e2507
to
3983002
Compare
@thetechnick Looks like more than 5000+ files changes for the PR, seems like a rebase gone wrong or something similar. |
Oh @thetechnick I think you just overview. |
@@ -42,6 +42,7 @@ import ( | |||
_ "k8s.io/client-go/plugin/pkg/client/auth" | |||
"k8s.io/klog/v2" | |||
"k8s.io/klog/v2/textlogger" | |||
"k8s.io/utils/ptr" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to try and avoid pulling in a new dependency just for a small helper that returns a pointer to value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a new dependency:
Line 38 in f055efc
k8s.io/utils v0.0.0-20241210054802-24370beab758 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that 5000k lines vendor tricked me :)
@thetechnick can you remove |
+1 on removing |
Extensive e2e tests revealed that operator-controller might run into leader election timeouts during cluster bootstrap, causing sporadic alerts being generated. This commit uses recommended settings for leaderelection LeaseDuration: 15s -> 137s RenewDeadline: 10s -> 107s RetryPeriod: 2s -> 26s Warning: This will increase potential down-time of catalogd to 163s in the worst case (up from 17s). (LeaseDuration + RetryPeriod)
3983002
to
92cece5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
037b9e2
Description
Extensive e2e tests revealed that operator-controller and catalogd might run into leader election timeouts during cluster bootstrap, causing sporadic alerts being generated.
This commit uses recommended settings for leaderelection:
Warning: This will increase potential down-time of our components to 163s in the worst case (up from 17s). (LeaseDuration + RetryPeriod)
Reviewer Checklist