LSPS5: Correct notification cooldown & reset logic #3975
Conversation
Codecov Report — coverage diff:

```
@@           Coverage Diff           @@
##             main    #3975   +/-  ##
=======================================
  Coverage   88.97%   88.98%
=======================================
  Files         174      174
  Lines      124161   124184    +23
=======================================
+ Hits       110470   110500    +30
+ Misses      11216    11209     -7
  Partials     2475     2475
```
```rust
fn default() -> Self {
	Self {
		max_webhooks_per_client: DEFAULT_MAX_WEBHOOKS_PER_CLIENT,
		notification_cooldown_hours: DEFAULT_NOTIFICATION_COOLDOWN_HOURS,
	}
}
```
Hmm, not sure if we should make this user-configurable. But if we do, let's check on startup that they set it to at least 1 hour to comply with the spec.
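The startup check suggested above could look like the following. This is a minimal, self-contained sketch: the config struct, the `validate` helper, and the error handling are all illustrative stand-ins, not LDK's actual API.

```rust
use std::time::Duration;

// Spec-mandated lower bound assumed from the review comment: at least 1 hour.
const MIN_NOTIFICATION_COOLDOWN: Duration = Duration::from_secs(60 * 60);

// Hypothetical simplified config type; the real LSPS5 service config differs.
struct Lsps5ServiceConfig {
    max_webhooks_per_client: u32,
    notification_cooldown_hours: Duration,
}

impl Lsps5ServiceConfig {
    /// Reject configs whose cooldown is below the minimum at startup.
    fn validate(&self) -> Result<(), String> {
        if self.notification_cooldown_hours < MIN_NOTIFICATION_COOLDOWN {
            return Err(format!(
                "notification_cooldown_hours must be at least {:?} to comply with LSPS5",
                MIN_NOTIFICATION_COOLDOWN
            ));
        }
        Ok(())
    }
}

fn main() {
    let ok = Lsps5ServiceConfig {
        max_webhooks_per_client: 10,
        notification_cooldown_hours: Duration::from_secs(2 * 60 * 60),
    };
    assert!(ok.validate().is_ok());

    let too_low = Lsps5ServiceConfig {
        max_webhooks_per_client: 10,
        notification_cooldown_hours: Duration::from_secs(60),
    };
    assert!(too_low.validate().is_err());
    println!("config validation checks passed");
}
```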
```diff
-.map_or(true, |duration| {
-	duration >= self.config.notification_cooldown_hours.as_secs()
+.map_or(true, |last_sent| {
+	last_sent >= self.config.notification_cooldown_hours.as_secs()
```
nit: I think `duration` was clearer here, as we don't compare the absolute time.
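The check in the hunk compares the time elapsed since the last notification against the cooldown, which is why `duration` reads better than `last_sent`. A self-contained sketch of the same `map_or(true, ...)` pattern, with illustrative names rather than LDK's actual signatures:

```rust
use std::time::Duration;

/// Returns true if a notification may be sent, given how long ago the last
/// one of the same method went out (`None` means none was ever sent).
fn cooldown_elapsed(since_last_sent: Option<Duration>, cooldown: Duration) -> bool {
    // No prior notification: always allowed. Otherwise require the elapsed
    // duration to meet or exceed the configured cooldown.
    since_last_sent.map_or(true, |duration| duration >= cooldown)
}

fn main() {
    let cooldown = Duration::from_secs(3600);
    assert!(cooldown_elapsed(None, cooldown)); // never sent: allowed
    assert!(cooldown_elapsed(Some(Duration::from_secs(7200)), cooldown));
    assert!(!cooldown_elapsed(Some(Duration::from_secs(60)), cooldown));
    println!("cooldown checks passed");
}
```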
```rust
let mut webhooks = self.webhooks.lock().unwrap();
if let Some(client_webhooks) = webhooks.get_mut(counterparty_node_id) {
	for webhook in client_webhooks.values_mut() {
		webhook.last_notification_sent.clear();
	}
}
```
A bit orthogonal, but do we even need a separate `last_used` field? Mind adding comments on the fields to explain how `last_used` and `last_notification_sent` differ?
I will add comments explaining, because it was even hard for me to remember the difference. They are different because:

- `last_used` is updated either if a notification is sent or if the webhook is updated (via `set_webhook`)
- `last_notification_sent` is a `Map<NotificationMethod, timestamp>` that is only updated when a notification is sent

`last_notification_sent` is used for the notification_cooldown logic, and `last_used` is used for the stale_webhooks logic.
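The distinction described above could be captured directly in doc comments on the fields. A simplified, self-contained sketch, using stand-in types (`SystemTime`, a two-variant enum) rather than LDK's actual `LSPSDateTime` and `WebhookNotificationMethod`:

```rust
use std::collections::HashMap;
use std::time::SystemTime;

// Illustrative subset of the notification methods.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum NotificationMethod {
    WebhookRegistered,
    PaymentIncoming,
}

struct StoredWebhook {
    /// Updated when a notification is sent OR when the webhook itself is
    /// updated via `set_webhook`; drives the stale-webhook pruning logic.
    last_used: SystemTime,
    /// Per-method timestamp of the last notification sent; only updated when
    /// a notification actually goes out, and drives the cooldown logic.
    last_notification_sent: HashMap<NotificationMethod, SystemTime>,
}

fn main() {
    let mut wh = StoredWebhook {
        last_used: SystemTime::now(),
        last_notification_sent: HashMap::new(),
    };
    // Sending a payment_incoming notification updates BOTH fields...
    wh.last_used = SystemTime::now();
    wh.last_notification_sent
        .insert(NotificationMethod::PaymentIncoming, SystemTime::now());
    // ...while a set_webhook call would update only last_used.
    assert_eq!(wh.last_notification_sent.len(), 1);
    assert!(!wh.last_notification_sent.contains_key(&NotificationMethod::WebhookRegistered));
}
```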
Force-pushed 9e59322 to 9e0ff6f, then 9e0ff6f to a4a560b.
```diff
@@ -118,6 +128,7 @@ impl LSPS5ProtocolError {
 			LSPS5ProtocolError::AppNameNotFound => LSPS5_APP_NAME_NOT_FOUND_ERROR_CODE,
 			LSPS5ProtocolError::UnknownError => LSPS5_UNKNOWN_ERROR_CODE,
 			LSPS5ProtocolError::SerializationError => LSPS5_SERIALIZATION_ERROR_CODE,
+			LSPS5ProtocolError::SlowDownError => LSPS5_SLOW_DOWN_ERROR_CODE,
```
Can't get there in GitHub, but it seems the above comment `/// private code range so we never collide with the spec's codes` is wrong, given we match spec'd and non-spec'd codes here and elsewhere?
```rust
	.map_or(false, |duration| duration < DEFAULT_NOTIFICATION_COOLDOWN_HOURS.as_secs())
});

if rate_limit_applies {
```
Hmm, I think the spec says that we should disregard the rate-limit for `WebhookRegistered`, right?

> An LSP implementation must avoid sending multiple LSPS5 notifications of the same method (other than lsps5.webhook_registered) close in time, as long as the client has not connected to it.
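The spec carve-out the reviewer quotes could be expressed as a short guard before the cooldown check. A minimal sketch with hypothetical names (`Method`, `rate_limit_applies`), not LDK's actual code:

```rust
#[derive(PartialEq)]
enum Method {
    WebhookRegistered,
    PaymentIncoming,
}

/// lsps5.webhook_registered is never rate-limited per the spec; every other
/// method is subject to the cooldown check (`within_cooldown` stands in for
/// the elapsed-time comparison).
fn rate_limit_applies(method: &Method, within_cooldown: bool) -> bool {
    if *method == Method::WebhookRegistered {
        return false; // spec: "(other than lsps5.webhook_registered)"
    }
    within_cooldown
}

fn main() {
    assert!(!rate_limit_applies(&Method::WebhookRegistered, true));
    assert!(rate_limit_applies(&Method::PaymentIncoming, true));
    assert!(!rate_limit_applies(&Method::PaymentIncoming, false));
    println!("rate-limit exemption checks passed");
}
```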
```rust
// If the send_notification failed because of a SLOW_DOWN_ERROR, it means we sent this
// notification recently, and the user has not seen it yet. It's safe to continue, but we
// still need to handle other error types.
if result.is_err() && !matches!(result, Err(LSPS5ProtocolError::SlowDownError)) {
```
I think a classic `match` would be preferable here. Also agree to continue, even though the counterparty shouldn't send the error for `webhook_registered` in the first place, no?
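The `match` the reviewer suggests might look like this. A self-contained sketch with a stand-in error enum, not LDK's actual `LSPS5ProtocolError`:

```rust
#[derive(Debug)]
enum Lsps5Error {
    SlowDown,
    Serialization,
    Unknown,
}

// SlowDown is tolerated (the notification simply went out recently and the
// client hasn't reconnected yet); any other error bubbles up to the caller.
fn handle_send_result(result: Result<(), Lsps5Error>) -> Result<(), Lsps5Error> {
    match result {
        Ok(()) | Err(Lsps5Error::SlowDown) => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    assert!(handle_send_result(Ok(())).is_ok());
    assert!(handle_send_result(Err(Lsps5Error::SlowDown)).is_ok());
    assert!(handle_send_result(Err(Lsps5Error::Unknown)).is_err());
    assert!(handle_send_result(Err(Lsps5Error::Serialization)).is_err());
    println!("error handling checks passed");
}
```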
Thanks for the review @tnull, I addressed your comments in this fixup commit: 121dc04
LGTM, please squash.
Force-pushed 3adc266 to 8cafaf6 (per-method cooldown enforced correctly, and reset after peer_connected event), then 8cafaf6 to bae3224.
squash done!
Simple enough, landing this!
```rust
/// Default maximum number of webhooks allowed per client.
pub const DEFAULT_MAX_WEBHOOKS_PER_CLIENT: u32 = 10;
/// Default notification cooldown time in hours.
pub const DEFAULT_NOTIFICATION_COOLDOWN_HOURS: Duration = Duration::from_secs(60 * 60); // 1 hour
```
This still seems wayyyy too high. If we get two payments an hour apart, just because the first wakeup was missed doesn't mean we don't want to send another. A phone may be disconnected from the network or low on battery when the first was sent and the phone decided not to do a wakeup, but it may well be in a different state an hour later. Yea, we don't want to send 1000 notifications in an hour, but we definitely should be comfortable sending five in an hour. We either need to crank this way down (one a minute?) or track and allow bursts.
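The "track and allow bursts" option mentioned above is commonly implemented as a token bucket. A minimal, self-contained sketch of the idea (all names and parameters are illustrative, not part of LDK): up to `capacity` notifications back-to-back, refilling one token per `refill_interval` for the sustained rate.

```rust
use std::time::Duration;

/// Minimal token bucket: bursts up to `capacity`, sustained rate of one
/// notification per `refill_interval`.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_interval: Duration,
}

impl TokenBucket {
    fn new(capacity: u32, refill_interval: Duration) -> Self {
        Self { capacity: capacity as f64, tokens: capacity as f64, refill_interval }
    }

    /// `elapsed` is the time since the previous call; returns whether a
    /// notification may be sent now.
    fn try_send(&mut self, elapsed: Duration) -> bool {
        // Refill proportionally to elapsed time, capped at capacity.
        let refill = elapsed.as_secs_f64() / self.refill_interval.as_secs_f64();
        self.tokens = (self.tokens + refill).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // Burst of 5 allowed, then refill one per 12 minutes (5/hour sustained).
    let mut bucket = TokenBucket::new(5, Duration::from_secs(12 * 60));
    for _ in 0..5 {
        assert!(bucket.try_send(Duration::ZERO));
    }
    assert!(!bucket.try_send(Duration::ZERO)); // burst exhausted
    assert!(bucket.try_send(Duration::from_secs(12 * 60))); // one token refilled
    println!("token bucket checks passed");
}
```

A reset-on-reconnect (as this PR adds for the cooldown) would simply restore `tokens` to `capacity`.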
```rust
last_used: LSPSDateTime,
// Map of last notification sent timestamps for each notification method.
// This is used to enforce notification cooldowns.
last_notification_sent: HashMap<WebhookNotificationMethod, LSPSDateTime>,
```
Eek, we end up with a `HashMap<_, HashMap<_, HashMap<_, _>>>` here, let's condense that please. First of all, we probably don't need to track the last-sent notification on a per-webhook basis; it seems like it can be per-client. Secondly, indexing this by `WebhookNotificationMethod` means we track per-`timeout` for `LSPS5ExpirySoon` notifications, which I don't think is what we want. Once we fix that, we have 5 notification types, which definitely don't need to be a `HashMap`, probably not even a `Vec`; a `[X; 5]` should do fine :).
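The `[X; 5]` shape the reviewer suggests could look like the following. This is a sketch under stated assumptions: a C-like enum collapses the five method kinds to a dense index, the type names are hypothetical, and `SystemTime` stands in for `LSPSDateTime`.

```rust
use std::time::SystemTime;

// Dense index over the five notification kinds (ignoring per-timeout data
// for the expiry_soon case, as the review suggests).
#[derive(Clone, Copy)]
enum MethodKind {
    WebhookRegistered = 0,
    PaymentIncoming = 1,
    ExpirySoon = 2,
    LiquidityManagementRequest = 3,
    OnionMessageIncoming = 4,
}

/// Per-client (not per-webhook) cooldown state: a fixed array instead of
/// nested HashMaps.
#[derive(Default)]
struct ClientNotificationState {
    last_sent: [Option<SystemTime>; 5],
}

impl ClientNotificationState {
    fn record_sent(&mut self, kind: MethodKind, at: SystemTime) {
        self.last_sent[kind as usize] = Some(at);
    }

    fn get_last_sent(&self, kind: MethodKind) -> Option<SystemTime> {
        self.last_sent[kind as usize]
    }

    /// Called on peer reconnect: every cooldown resets.
    fn reset(&mut self) {
        self.last_sent = [None; 5];
    }
}

fn main() {
    let mut state = ClientNotificationState::default();
    state.record_sent(MethodKind::PaymentIncoming, SystemTime::now());
    assert!(state.get_last_sent(MethodKind::PaymentIncoming).is_some());
    assert!(state.get_last_sent(MethodKind::ExpirySoon).is_none());
    state.reset();
    assert!(state.get_last_sent(MethodKind::PaymentIncoming).is_none());
    println!("per-client array state checks passed");
}
```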
```diff
@@ -318,10 +329,14 @@ where
 	/// This builds a [`WebhookNotificationMethod::LSPS5PaymentIncoming`] webhook notification, signs it with your
 	/// node key, and enqueues HTTP POSTs to all registered webhook URLs for that client.
 	///
+	/// This may fail if a similar notification was sent too recently,
+	/// violating the notification cooldown period defined in [`DEFAULT_NOTIFICATION_COOLDOWN_HOURS`].
```
All these links should be to the `Config`, not the default constant.
Addressing LSPS5 post-merge follow-ups, specifically comments #3944 (comment) and #3662 (comment).

From the LSPS5 spec:

> An LSP implementation must avoid sending multiple LSPS5 notifications of the same method (other than lsps5.webhook_registered) close in time, as long as the client has not connected to it.

This PR fixes a bug in the notification rate-limiting logic. Previously, the same notification method (e.g., lsps5.payment_incoming) could not be sent again for 24 hours, regardless of whether the client reconnected (yikes). Now, the default cooldown is set to 1 hour, and the cooldown is reset whenever the client connects again. This ensures that after a client comes online, the LSP can immediately send new notifications of the same method.

Also adds a default for the LSPS5 service configuration, which for some reason I didn't do before.