Process monitor update events in block_[dis]connected asynchronously #808
You can assert the lock consistency requirement:
assert!(self.total_consistency_lock.try_write().is_err());
Hmm, it only needs a read lock, not a write lock, and there's no way to assert that the current thread holds one - we could assert that no thread holds a write lock, but that's not quite sufficient.
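For reference, a minimal sketch of the "assert no thread holds a write lock" fallback mentioned above, assuming total_consistency_lock is a std::sync::RwLock (illustrative only, not the PR's code):

```rust
use std::sync::RwLock;

// try_read() only fails while a writer is active, so this catches a concurrent
// writer but cannot prove that *this* thread actually holds a read lock.
fn assert_no_writer(total_consistency_lock: &RwLock<()>) {
    debug_assert!(total_consistency_lock.try_read().is_ok());
}
```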
Hmm. Seems a bit weird to just let the update die silently. Should we log? Also, should we put it back in the event queue and give it 3 chances to succeed, or so? Seems like a loose end to trust the counterparty to get the commit tx confirmed.
Hmm, we don't really handle it anywhere else - in the case of a TemporaryFailure the API requires users to have stored it somewhere (as we never provide duplicate/old monitor updates). In the case of a permanent failure, indeed, we're a little hosed, but that isn't an issue specific to this case - in any permanent failure case if the final force-closure monitor update fails to be delivered the user will need to manually intervene and call the relevant method to get the latest commitment transaction.
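As a purely illustrative sketch of the "should we log?" point above, the failure could at least be matched and reported rather than dropped (the variant names follow ChannelMonitorUpdateErr; the reporting call and function are stand-ins, not the project's logging macros):

```rust
enum ChannelMonitorUpdateErr { TemporaryFailure, PermanentFailure }

// Stand-in handler: surface a failed force-close monitor update instead of
// silently discarding it.
fn report_force_close_update_result(res: Result<(), ChannelMonitorUpdateErr>) {
    match res {
        Ok(()) => {},
        Err(ChannelMonitorUpdateErr::TemporaryFailure) =>
            eprintln!("Force-close monitor update temporarily failed; it must be replayed"),
        Err(ChannelMonitorUpdateErr::PermanentFailure) =>
            eprintln!("Force-close monitor update permanently failed; manual intervention may be needed"),
    }
}
```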
Can't the update succeed but persistence fail? Would this be a problem to ignore?
What do you mean "the update succeed but persistence fail"? Monitor Update success includes persistence, but I'm not sure what exactly you mean.
Oh, what I meant was Watch::update_channel includes both updating the channel monitor (i.e., ChannelMonitor::update_monitor) and persisting it (i.e., Persist::update_persisted_channel). Though, I suppose the errors are already logged in the case of ChainMonitor.
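For context, a toy sketch of the flow being described, using stand-in types rather than rust-lightning's actual signatures: a Watch::update_channel-style call first applies the update in memory and only then persists it, so "update succeeded" already covers persistence.

```rust
struct ChannelMonitor;       // stand-in for the in-memory monitor
struct ChannelMonitorUpdate; // stand-in for a monitor update
enum ChannelMonitorUpdateErr { TemporaryFailure, PermanentFailure }

trait Persist {
    fn update_persisted_channel(&self, monitor: &ChannelMonitor) -> Result<(), ChannelMonitorUpdateErr>;
}

impl ChannelMonitor {
    fn update_monitor(&mut self, _update: &ChannelMonitorUpdate) -> Result<(), ChannelMonitorUpdateErr> {
        Ok(()) // apply the update in memory
    }
}

fn update_channel<P: Persist>(
    monitor: &mut ChannelMonitor,
    persister: &P,
    update: &ChannelMonitorUpdate,
) -> Result<(), ChannelMonitorUpdateErr> {
    // 1) Apply the update to the monitor itself (ChannelMonitor::update_monitor)...
    monitor.update_monitor(update)?;
    // 2) ...then hand the updated monitor to the persister (Persist::update_persisted_channel).
    persister.update_persisted_channel(monitor)
}
```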
Right, I don't think this is a unique case there - whatever applies elsewhere already applies here; if anything, this callsite is much less error-prone because the only thing it does is broadcast our latest transaction.
I think we still have a pathological case when funding_tx is disconnected and we try to force-close the channel with a holder commitment. It won't propagate or confirm. If the latest user balance is substantial, even manual intervention won't solve the issue.
Ideally, as soon as we see a counterparty funding transaction we should cache it. If the funding is reorged out later, we should attach the funding_tx to our holder commitment, and why not a high-feerate CPFP. This could be implemented by either ChannelManager or ChannelMonitor. Though a bit complex and beyond the scope of this PR...
Right, that's definitely a lot for this PR (also because it's not a regression - we have the same issue today). That said, I don't really think it's worth it - in theory the funding transaction that was just un-confirmed should be in the mempool, as nodes re-add disconnected transactions to their mempool. If we want to add a high-feerate CPFP on top of the commitment, that stands alone as a change to ChannelMonitor in handling the ChannelForceClosed event.
I feel the same. Users playing with substantial amounts should scale their funding transaction confirmation requirement (minimum_depth) high enough for this never to happen. For low-conf channels (1-2 blocks), I don't think that's a concern for now.
If you don't update the timer_chan_freshness_every_min name, at least update its documentation to mention the channel monitor update event flush.
Hm, is there a way to put some asserts on chain_monitor state, checking that it's the same at the beginning and end of these functions? Seems safer than just leaving comments.
Yea, I thought about that - we would have to wrap the entire chain_monitor in a wrapper struct that will check a boolean before calling the inner method. I figured it wasn't worth the complexity, but I'm open to other opinions.
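A toy sketch of that wrapper idea, with illustrative names (not the actual API): the wrapper tracks whether block processing is in flight and asserts the invariant before forwarding any call to the inner chain monitor.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct GuardedChainMonitor<W> {
    inner: W,
    in_block_processing: AtomicBool,
}

impl<W> GuardedChainMonitor<W> {
    fn enter_block_processing(&self) {
        self.in_block_processing.store(true, Ordering::Release);
    }
    fn exit_block_processing(&self) {
        self.in_block_processing.store(false, Ordering::Release);
    }
    // Every forwarding method would go through this check first.
    fn with_inner<R>(&self, f: impl FnOnce(&W) -> R) -> R {
        debug_assert!(!self.in_block_processing.load(Ordering::Acquire),
            "chain_monitor must not be called from within block_[dis]connected");
        f(&self.inner)
    }
}
```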
This does seem to indicate a need to refactor the code such that chain_monitor cannot be called in certain scenarios. I don't have a concrete suggestion, however.
If we wanted to go there, we could have deserialization build some intermediate object which you can only connect/disconnect blocks on; then you can tell it you're done and get the full ChannelManager. Not sure how that would interact with #800, though.
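A toy sketch of the intermediate-object idea, with illustrative names: after deserialization you only get a handle that can replay blocks, and normal channel operations only become reachable once you explicitly finish the replay.

```rust
struct ReplayOnlyManager {
    // deserialized ChannelManager state would live here
}

struct FullManager {
    // the fully usable manager
}

impl ReplayOnlyManager {
    fn block_connected(&mut self /*, block data */) {
        // replay chain data against the deserialized state
    }
    fn block_disconnected(&mut self /*, header */) {
        // roll the deserialized state back
    }
    // Consuming `self` means the replay-only handle can no longer be used once
    // the full manager has been produced.
    fn into_full_manager(self) -> FullManager {
        FullManager {}
    }
}
```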
No need to address now, but as part of #800 we may want to catalog all that ChannelManager is "managing" and identify suitable abstractions where possible. :)
Added a comment describing this on #800.
Hm, could you retrieve the ChainMonitor's list of ChannelMonitor outpoints and latest update IDs at the beginning of the function, then ensure those are the same at the end of the function?
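A toy sketch of that check (the types and setup are illustrative, not the actual test utilities): snapshot each monitor's funding outpoint and latest update_id before the call, then assert the same snapshot afterwards, which fails loudly if block processing produced a monitor update.

```rust
use std::collections::BTreeMap;

// Stand-in for bitcoin::OutPoint: (txid, output index).
type OutPoint = (String, u16);

// Map from funding outpoint to the monitor's latest update_id.
fn snapshot(monitors: &BTreeMap<OutPoint, u64>) -> Vec<(OutPoint, u64)> {
    monitors.iter().map(|(k, v)| (k.clone(), *v)).collect()
}

fn main() {
    let mut monitors: BTreeMap<OutPoint, u64> = BTreeMap::new();
    monitors.insert(("funding_txid".to_string(), 0), 7);

    let before = snapshot(&monitors);
    // ... call block_connected / block_disconnected here ...
    let after = snapshot(&monitors);
    // Fails loudly if block processing bumped any update_id or added a monitor.
    assert_eq!(before, after);
}
```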
Is the note in (2) still accurate?
Not sure what you mean, do you mean the part below? If so, yes, that behavior is unchanged - we call broadcast_latest_holder_commitment_txn directly on the passed monitors instead of relying on an update object.
Yeah, that part.
I wonder if this is really possible. Let's say the 1st deserialized ChannelManager force-closes the channel due to some off-chain violation from our counterparty (e.g. an HTLC under minimum_msat). The force-close is dutifully sent to ChannelMonitor and lockdown_from_offchain latches to true. Later, the 2nd deserialized ChannelManager should receive the same on-chain block sequence but effectively not the off-chain one, so it won't close the channel again. But any attempt to update channel state should be rejected by ChannelMonitor, assuming it's the same version between ChannelManager deserializations. Or do you envision a different scenario?
The force-close doesn't generate a ChannelMonitorUpdate event, which implies the user is not required to re-persist the ChannelMonitor. So they could deserialize again with the original. We could change that, but I don't think it's worth it.