Document consistency and availability assumption of watchtower infrastructure #604

Open
ariard opened this issue Apr 23, 2020 · 10 comments

@ariard

ariard commented Apr 23, 2020

Getting a fail-safe/highly robust watchtower model is harder than expected. As we encounter issues while moving forward, we should keep track of them and draw up a consistent model step by step.

See #597 (comment)

TODO: link #watchtower slack discussion on quorum-vs-consensus alternatives.

@ariard
Author

ariard commented Aug 21, 2020

One of your monitor instances may crash silently for a while, and you might not learn of it if you don't call update_monitor() due to a lack of channel updates. Your implementation of ManyChannelMonitor might preemptively force-close the channels at risk in reaction.

See also #667 (review)
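
One way to notice a silently crashed monitor is to track liveness independently of channel updates. The sketch below assumes a periodic ping or acknowledgement from each monitor instance; `MonitorLiveness`, its methods, and the timeout value are hypothetical names for illustration and are not part of the project's API.

```rust
use std::collections::HashMap;

/// Hypothetical liveness tracker (an assumption, not an existing API).
pub struct MonitorLiveness {
    /// Last time, in seconds, each monitor instance answered a ping or an update.
    last_seen: HashMap<String, u64>,
    /// Longest silence tolerated before a monitor is considered at risk.
    max_silence_secs: u64,
}

impl MonitorLiveness {
    pub fn new(max_silence_secs: u64) -> Self {
        Self { last_seen: HashMap::new(), max_silence_secs }
    }

    /// Record that `monitor_id` answered at time `now_secs`.
    pub fn record_ack(&mut self, monitor_id: &str, now_secs: u64) {
        self.last_seen.insert(monitor_id.to_string(), now_secs);
    }

    /// Return the ids of monitors silent for longer than the limit, sorted by id.
    pub fn stale_monitors(&self, now_secs: u64) -> Vec<String> {
        let mut stale: Vec<String> = self
            .last_seen
            .iter()
            .filter(|(_, seen)| now_secs.saturating_sub(**seen) > self.max_silence_secs)
            .map(|(id, _)| id.clone())
            .collect();
        stale.sort();
        stale
    }
}
```

A caller could poll `stale_monitors` on a timer and decide per channel whether a preemptive force-close is warranted; that policy is left to the implementation.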

@ariard
Author

ariard commented Aug 26, 2020

Watchtower Alice receives block 100, broadcasts state X, rejects state Y.
Watchtower Bob accepts state Y, receives block 100, broadcasts state Y.
State Y confirms onchain. Alice must be able to claim outputs.
State Y is rejected by watchtower coordinator Caroll, secret for state X isn't released.

See also #667 (comment)

@TheBlueMatt
Collaborator

Right, I think the only way to solve that pattern is to have some kind of consensus on when to broadcast a transaction. If you can't get a majority of watchtowers to agree to halt updates, then you shouldn't be able to broadcast a transaction, as otherwise a majority of watchtowers could revoke the now-broadcast transaction.
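
The rule above can be sketched as a gate in front of broadcasting. This assumes a hypothetical "halt updates" request to which each watchtower replies; `HaltReply` and `may_broadcast` are illustrative names only, not an existing interface.

```rust
/// Reply of one watchtower to a hypothetical "halt updates" request.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum HaltReply {
    /// The tower froze at this state index and will reject later states.
    Halted { state_index: u64 },
    /// The tower refused or did not answer in time.
    NoAnswer,
}

/// Broadcasting `state_index` is allowed only if a strict majority of all
/// towers halted at exactly that state, so no majority can still revoke it.
pub fn may_broadcast(state_index: u64, replies: &[HaltReply]) -> bool {
    let halted = replies
        .iter()
        .filter(|r| **r == HaltReply::Halted { state_index })
        .count();
    halted * 2 > replies.len()
}
```

With three towers, two matching `Halted` replies are enough; if the towers are split between two states, neither state may be broadcast.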

@devrandom
Member

I think there are two separate animals here.

  • a watchtower is an external untrusted service, which can only publish justice txs. The security property of a watchtower is that if you subscribe to n of them, you will be safe if at least 1 does the right thing.

  • a highly available distributed channel-monitor, which can also publish commitment txs. I would consider this as part of the node deployment, because you have to trust it with both privacy and security of the funds. All of the monitor instances must have a correctly working HSM, because they have sensitive keys that can cause loss of all funds in the channel.

I agree with @TheBlueMatt - the HSMs embedded in each channel-monitor must achieve consensus in order to move the state forward (e.g. revoke an old state). This can be achieved with a majority voting scheme.

@ariard
Author

ariard commented Aug 31, 2020

I like the distinction, but note that usually LN folks have used private/public watchtower for the trusted-vs-untrusted deployment, though it hasn't been done with that much rigor. You may delegate running one instance of your distributed channel-monitor to a third party, and you may have out-of-band remedies against them; that's up to you.

That said, I think that less-than-unanimity to move state forward (i.e. accept ChannelMonitorUpdateStep::LatestLocalCommitmentTxInfo) is unsafe, as otherwise a subset of your monitors may broadcast previous, now-revoked states. For accepting a remote commitment transaction update, a majority is enough as there is no toxicity involved.
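
The two thresholds can be written down as a small policy function. This is only a sketch of the rule as stated in this comment; `UpdateKind`, `required_acks`, and `update_accepted` are simplified, hypothetical stand-ins and do not mirror the real `ChannelMonitorUpdateStep` variants.

```rust
/// Hypothetical, simplified stand-in for the update steps discussed here.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum UpdateKind {
    /// A new local commitment; monitors that miss it could still broadcast
    /// the previous, now-revoked local state.
    LocalCommitment,
    /// A new remote (counterparty) commitment; missing it is not toxic.
    RemoteCommitment,
}

/// Number of monitor acknowledgements required out of `total` monitors.
pub fn required_acks(kind: UpdateKind, total: usize) -> usize {
    match kind {
        // Unanimity: every monitor must have accepted the update.
        UpdateKind::LocalCommitment => total,
        // Strict majority.
        UpdateKind::RemoteCommitment => total / 2 + 1,
    }
}

/// Whether `acks` acknowledgements are enough to accept the update.
pub fn update_accepted(kind: UpdateKind, acks: usize, total: usize) -> bool {
    total > 0 && acks >= required_acks(kind, total)
}
```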

@TheBlueMatt
Collaborator

I think after a bunch of back and forth @ariard and I are on a similar page - there are really two supported modes here (or should be) - either you get majority consensus of your monitors for each action (including broadcasting) or you get 100% consensus of your monitors for updates (but any one monitor can broadcast on its own). Still, #679 makes it easier to build the second since it avoids the need to do complicated pre-consensus, allowing you to simply apply new updates to monitors and wait until all monitors have verified they've applied the update before moving the channel forward.
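
A minimal sketch of the second mode: an update is sent to every monitor and the channel only moves forward once all of them have confirmed it. `PendingUpdate` and its methods are hypothetical names; how confirmations are transported and how #679 exposes this are not shown here.

```rust
use std::collections::BTreeSet;

/// Hypothetical tracker for one in-flight monitor update.
pub struct PendingUpdate {
    update_id: u64,
    /// Monitors that have not yet confirmed they applied the update.
    awaiting: BTreeSet<String>,
}

impl PendingUpdate {
    pub fn new(update_id: u64, monitors: &[&str]) -> Self {
        Self {
            update_id,
            awaiting: monitors.iter().map(|m| m.to_string()).collect(),
        }
    }

    /// Record a confirmation from `monitor_id`. Confirmations for another
    /// update id are ignored. Returns true once no monitor is outstanding.
    pub fn confirm(&mut self, monitor_id: &str, update_id: u64) -> bool {
        if update_id == self.update_id {
            self.awaiting.remove(monitor_id);
        }
        self.awaiting.is_empty()
    }

    /// The channel may move forward only when every monitor has confirmed.
    pub fn can_advance_channel(&self) -> bool {
        self.awaiting.is_empty()
    }
}
```

In this mode no pre-consensus round is needed: a monitor that never confirms simply stalls the channel rather than letting state diverge.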

@ariard
Author

ariard commented Sep 15, 2020

Describe responsibilities between a) off-chain manager b) monitor coordinator c) per-channel monitor.

@ariard
Author

ariard commented Sep 25, 2020

For remote watchers, we may have race conditions between learning of a revocation secret and of a counterparty commitment transaction, as the former can happen after the latter. Verify or do something.
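
One possible reading of this race, sketched under assumptions: the watcher sees a counterparty commitment before the matching revocation secret has reached it, so it has to remember the sighting and re-check when the secret arrives. `RevocationTracker`, keying by commitment number, and the 32-byte secret type are all illustrative choices, not an existing design.

```rust
use std::collections::HashMap;

/// Hypothetical remote-watcher state, keyed by commitment number.
#[derive(Default)]
pub struct RevocationTracker {
    /// Revocation secrets learned so far.
    secrets: HashMap<u64, [u8; 32]>,
    /// Counterparty commitments seen before their secret was known.
    unresolved: Vec<u64>,
}

impl RevocationTracker {
    /// A counterparty commitment was seen. Returns the secret if it is
    /// already known; otherwise remembers the sighting and returns None.
    pub fn commitment_seen(&mut self, number: u64) -> Option<[u8; 32]> {
        match self.secrets.get(&number) {
            Some(secret) => Some(*secret),
            None => {
                if !self.unresolved.contains(&number) {
                    self.unresolved.push(number);
                }
                None
            }
        }
    }

    /// A revocation secret arrived. Returns true if it resolves a commitment
    /// that was seen earlier, meaning the watcher should act on it now.
    pub fn secret_learned(&mut self, number: u64, secret: [u8; 32]) -> bool {
        self.secrets.insert(number, secret);
        let before = self.unresolved.len();
        self.unresolved.retain(|n| *n != number);
        self.unresolved.len() != before
    }
}
```

An unresolved sighting is not necessarily a cheat attempt, since it may simply be the latest valid state; the sketch only makes sure a late secret is not dropped.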

@ariard
Author

ariard commented Oct 3, 2020

Document different internal monitor backup strategies.

See #681 (comment)

@ariard
Author

ariard commented Oct 3, 2020

We may have hints that the coordinator is buggy or compromised based on our local state. If this happens, we may go onchain to avoid further risks.

See #681 (comment)
