Rewrite lightning-net-tokio using async/await and tokio 0.2 #472

Merged

Conversation

TheBlueMatt
Collaborator

This is a rather major rewrite, using async/await and tokio 0.2,
which cleans up the code a ton as well as adds significantly to
readability.

CC #423.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch 3 times, most recently from b34f482 to f0178a4 Compare February 1, 2020 22:46
@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from f0178a4 to 2dd2273 Compare February 3, 2020 23:48
@TheBlueMatt
Collaborator Author

Pushed with a few fixes, plus a workaround for tokio-rs/tokio#2213 (not sure whether to call it a bug, will wait for the tokio folks to comment).

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 2dd2273 to 6bc40a0 Compare February 14, 2020 06:55
@TheBlueMatt
Collaborator Author

Tokio upstream fixed 2213, so I've reverted the awkward fix for it, though this is now blocked until the next tokio release.

Contributor

@jkczyz jkczyz left a comment

FYI, I'm reviewing this as entirely new code since #339 was not reviewed by anyone.

}

/// Process incoming messages and feed outgoing messages on the provided socket generated by
/// accepting an incoming connection (by scheduling futures with tokio::spawn).
///
/// You should poll the Receive end of event_notify and call get_and_clear_pending_events() on
/// ChannelManager and ChannelMonitor objects.
pub fn setup_inbound<CMH: ChannelMessageHandler + 'static>(peer_manager: Arc<peer_handler::PeerManager<SocketDescriptor<CMH>, Arc<CMH>>>, event_notify: mpsc::Sender<()>, stream: TcpStream) {
let (reader, us) = Self::new(event_notify, stream);
pub async fn setup_inbound<CMH: ChannelMessageHandler + 'static>(peer_manager: Arc<peer_handler::PeerManager<SocketDescriptor, Arc<CMH>>>, event_notify: mpsc::Sender<()>, stream: TcpStream) {
Contributor

Trying to understand the design decisions: what's the reasoning behind not returning a Connection? It seems like it could be an ideal abstraction for users to interact with.

Collaborator Author

Just made Connection non-pub as it didn't make sense to expose it. Let me know if it's clearer now.

pub struct Connection {
writer: Option<mpsc::Sender<bytes::Bytes>>,
writer: Option<io::WriteHalf<TcpStream>>,
event_notify: mpsc::Sender<()>,
Contributor

Could you explain the purpose of event_notify to me? It isn't readily apparent how this is supposed to be used within this module and by the user.

Collaborator Author

Hopefully the docs on the connection functions are a bit clearer now.

Contributor

Added a comment on one of those functions with regards to clarification.

The module-level documentation should describe how events are used with the API, including a concise example, as it seems that is the primary interaction users have after creating connections. The explanation should not need to be repeated on all three connection functions.

Comment on lines 168 to 169
/// You should poll the Receive end of event_notify and call get_and_clear_pending_events() on
/// ChannelManager and ChannelMonitor objects.
Contributor

This comment seems out of place as it presupposes the reader knows how event_notify is used in the API, which AFAICT is never explained. The context around calling "get_and_clear_pending_events() on ChannelManager and ChannelMonitor objects" is also missing.

Comment on lines 165 to 166
/// Process incoming messages and feed outgoing messages on the provided socket generated by
/// accepting an incoming connection (by scheduling futures with tokio::spawn).
Contributor

This sentence is a bit verbose, which indicates to me that this may be doing too much. As per my comment on line 179, I'd like to explore whether we can provide an asynchronous API to the user based around a Connection abstraction. But I need to get a better understanding of the design before giving any concrete recommendations with regards to its interface.


(reader, us)
(reader, receiver,
Arc::new(Mutex::new(Self {
Contributor

I'm not quite sure I follow why an Arc and a Mutex are needed. Could you describe a scenario for me where a Connection may be accessed concurrently?

}
}
impl<CMH: ChannelMessageHandler> peer_handler::SocketDescriptor for SocketDescriptor<CMH> {
impl peer_handler::SocketDescriptor for SocketDescriptor {
Contributor

Could you provide a high-level description of this implementation?

Contributor

Just wanted to ping this comment. Trying to get a grasp of how send_data works, specifically wrt how Waker and RawWaker fit in. Feel free to point to a reference.

Collaborator Author

I totally missed this, gotta love the GitHub UI. The only way I know how to surface old comments again is to leave a response as a part of a review, then GH will show that response duplicatively but at the end of the discussion.

In general, Rust futures (which you can mentally model as green threads) work by calling a poll function, which either makes progress or it doesn't. To know when to poll, you provide a waker which receives a call when the runtime (in this case Tokio) thinks polling makes sense now (ie because we have more room in the socket write buffer, as returned by select() etc). Here, we use that to wake the read/process "thread" and then it calls write_buffer_space_avail() which will call send_data() and we'll end up back at poll().
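To make that cycle concrete, here is a minimal, hypothetical sketch (not code from this PR) of a hand-written future that parks itself until some other code (say, a send_data()-style path noticing free buffer space) wakes it:

use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Waker};

// Shared state between the future and whoever will eventually wake it.
struct Shared {
    ready: bool,
    waker: Option<Waker>,
}

struct WriteReady(Arc<Mutex<Shared>>);

impl Future for WriteReady {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut shared = self.0.lock().unwrap();
        if shared.ready {
            Poll::Ready(())
        } else {
            // Not ready: stash the waker so the other side can wake this
            // task, causing the runtime to call poll() again.
            shared.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

// Called from "the other side" (e.g. when the socket has buffer space again):
fn mark_ready(shared: &Arc<Mutex<Shared>>) {
    let mut s = shared.lock().unwrap();
    s.ready = true;
    if let Some(waker) = s.waker.take() {
        waker.wake();
    }
}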

Contributor

Thanks for explaining. I have a high-level understanding of how this works, and I don't have a problem following what send_data is doing.

The only open question I have is: why are RawWaker, RawWakerVTable, and the unsafe code needed? I couldn't find them in the async book, which instead demonstrates creating a Waker by implementing the ArcWake trait. But I don't know if that's applicable in this case.

No need to hold the review up on this, but curious for my own understanding of why one approach is chosen over another.

Collaborator Author

It looks like the ArcWake trait is just a utility to do essentially what we're doing, but provided by the futures crate, which we don't depend on here.

Collaborator Author

Sadly, the way Futures were stabilized upstream is pretty bare-bones. The only way to implement your own future is with all the unsafe code and raw function pointers you see here, or via the async keyword on a function.
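For the curious, this is roughly what that bare-bones stable surface looks like: a hypothetical, minimal no-op waker built from RawWaker/RawWakerVTable (illustrative only; the PR's real waker carries actual wake logic rather than no-ops):

use std::task::{RawWaker, RawWakerVTable, Waker};

// The data pointer would normally point at whatever needs notifying
// (e.g. an Arc); here it is unused, so all the functions are no-ops.
unsafe fn clone_raw(data: *const ()) -> RawWaker {
    RawWaker::new(data, &VTABLE)
}
unsafe fn wake_raw(_data: *const ()) {}
unsafe fn wake_by_ref_raw(_data: *const ()) {}
unsafe fn drop_raw(_data: *const ()) {}

static VTABLE: RawWakerVTable =
    RawWakerVTable::new(clone_raw, wake_raw, wake_by_ref_raw, drop_raw);

fn noop_waker() -> Waker {
    // Safety: none of the vtable functions dereference the data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}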

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 6bc40a0 to d115e42 Compare February 21, 2020 21:05
@TheBlueMatt
Collaborator Author

I added a bunch of docs that should clarify things a little bit, let me know if it's better now or where I should shore it up.

Comment on lines 190 to 193
/// You should poll the Receive end of event_notify and call get_and_clear_pending_events() on
/// ChannelManager and ChannelMonitor objects after each receive event. In general such an event
/// should occur after nearly every read event, as any read event may trigger an event which must
/// be handled.
Contributor

I'm a little confused by the uses of "event" in this documentation. Specifically, I'm having a hard time differentiating between "read event", "receive event", and the more generic "event" (e.g., "as any read event may trigger an event"). Is the term "event" being overloaded here to refer to two different things?


@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from d115e42 to 725bb3e Compare February 23, 2020 22:34
@TheBlueMatt
Collaborator Author

Added an example and some more links in the Events module doc, hopefully that's a bit clearer, especially since it's, indeed, somewhat overloaded.

Contributor

@jkczyz jkczyz left a comment

Added an example and some more links in the Events module doc, hopefully that's a bit clearer, especially since it's, indeed, somewhat overloaded.

Let me make sure I understand how the term had been overloaded. Lightning events are generated by the PeerManager via ChannelManager when messages are received and also by ChannelMonitor when blocks are connected. This is one use of "events".

Then in our tokio module, the unit type () is used as an event by passing it through an mpsc::channel in order to notify that the aforementioned Lightning events have been received. This "receive event" indicates that the ChannelManager and ChannelMonitor should be queried for those Lightning events.

Is that correct? If so, could these concepts be combined? That is, could we instead have the tokio module send the Lightning events over the mpsc::channel? This seems like exactly the use case that the mpsc::channel abstraction was designed for.

@TheBlueMatt
Collaborator Author

Let me make sure I understand how the term had been overloaded. Lightning events are generated by the PeerManager via ChannelManager when messages are received and also by ChannelMonitor when blocks are connected. This is one use of "events".

Yep!

Then in our tokio module, the unit type () is used as an event by passing it through an mpsc::channel in order to notify that the aforementioned Lightning events have been received

Err, not received, only that "something has happened, this something may or may not have caused some of the aforementioned events to have been created".

This "receive event" indicates that the ChannelManager and ChannelMonitor should be queried for those Lightning events.

Yep!

Is that correct? If so, could these concepts be combined? That is, could we instead have the tokio module send the Lightning events over the mpsc::channel? This seems like exactly the use case that the mpsc::channel abstraction was designed for.

Hmm, would be somewhat awkward, IMO. We handle the () here: https://github.com/TheBlueMatt/rust-lightning-bitcoinrpc/blob/master/src/main.rs#L123 and we write to it in a number of other places (not just the tokio net stuff) including eg when the user asks us to send a payment.

More importantly, we started in #474 (and I have a pending commit or two, not yet PR'd, to do the same for events) making Events harder to miss when crashing/shutting down before having handled all the in-user-buffer Events, by moving them to our own buffers and serializing them.
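To illustrate the notification pattern being discussed, here is a hedged sketch of what a user-side loop might look like. The EventsProvider trait below is a local stand-in for ChannelManager/ChannelMonitor's get_and_clear_pending_events(); it is not this crate's API, and the names are placeholders:

use std::sync::Arc;
use tokio::sync::mpsc;

// Stand-in for the get_and_clear_pending_events() calls on ChannelManager /
// ChannelMonitor; the concrete rust-lightning types are elided here.
trait EventsProvider {
    type Event;
    fn get_and_clear_pending_events(&self) -> Vec<Self::Event>;
}

async fn drive_events<P: EventsProvider>(provider: Arc<P>, mut event_notify: mpsc::Receiver<()>) {
    // Each () only means "something happened, go check for pending Events";
    // it says nothing about what happened, or whether any Event resulted.
    while let Some(()) = event_notify.recv().await {
        for event in provider.get_and_clear_pending_events() {
            // Hand the Event off to application logic here.
            let _ = event;
        }
    }
}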

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 725bb3e to 66e6066 Compare February 24, 2020 21:54
Contributor

@valentinewallace valentinewallace left a comment

Initial pass. The recently added comments have been helpful!

fn schedule_read<CMH: ChannelMessageHandler + 'static>(peer_manager: Arc<peer_handler::PeerManager<SocketDescriptor<CMH>, Arc<CMH>>>, us: Arc<Mutex<Self>>, reader: futures::stream::SplitStream<tokio_codec::Framed<TcpStream, tokio_codec::BytesCodec>>) {
let us_ref = us.clone();
let us_close_ref = us.clone();
async fn schedule_read<CMH: ChannelMessageHandler + 'static>(peer_manager: Arc<peer_handler::PeerManager<SocketDescriptor, Arc<CMH>>>, us: Arc<Mutex<Self>>, mut reader: io::ReadHalf<TcpStream>, mut write_event: mpsc::Receiver<()>) {
Contributor

IIUC, write_event receives write_events (I assume from the remote peer) --- maybe a better name would be write_event_receiver?

Collaborator Author

Done, and realized while I was at it that some comments still referred to PeerManager::write_event, which doesn't exist anymore (it's now PeerManager::write_buffer_space_avail).

tokio::select! {
v = write_event.recv() => select_write_ev!(v),
read = reader.read(&mut buf) => match read {
Ok(0) => {
Contributor

Hm, so reading 0 bytes means the connection is closed, because it'll just block if the connection is still open but there's nothing to read?

Collaborator Author

Yep! It's pretty standard for socket APIs to do that.
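For reference, the standard shape of such a read loop, assuming a tokio TcpStream split into read/write halves as in this PR (a sketch, not the PR's code):

use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;

async fn read_loop(mut reader: tokio::io::ReadHalf<TcpStream>) {
    let mut buf = [0u8; 4096];
    loop {
        match reader.read(&mut buf).await {
            Ok(0) => break,  // EOF: the peer closed the connection.
            Ok(len) => {
                // Hand &buf[..len] to PeerManager::read_event() here.
                let _ = &buf[..len];
            },
            Err(_) => break, // I/O error: also treat as a disconnect.
        }
    }
}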

pub fn setup_inbound<CMH: ChannelMessageHandler + 'static>(peer_manager: Arc<peer_handler::PeerManager<SocketDescriptor<CMH>, Arc<CMH>>>, event_notify: mpsc::Sender<()>, stream: TcpStream) {
let (reader, us) = Self::new(event_notify, stream);
/// Process incoming messages and feed outgoing messages on the provided socket generated by
/// accepting an incoming connection (by scheduling futures with tokio::spawn).
Contributor

Got confused by (by scheduling futures with tokio::spawn). Does this make sense as a replacement:

s/(by scheduling futures with tokio::spawn)/(by giving the PeerManager a socket to write peer data out on, scheduling data reads from the same socket, and transmitting said data to the PeerManager)/

Collaborator Author

Hmm, actually I intended to inform the user that, via tokio::spawn, things would run free. I clarified a bit.
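In other words, something like this hypothetical accept loop, where handle_connection stands in for this crate's setup_inbound: once handed to tokio::spawn, the per-connection work keeps running on the runtime independently of the caller.

use tokio::net::{TcpListener, TcpStream};

async fn accept_loop(mut listener: TcpListener) -> std::io::Result<()> {
    loop {
        let (stream, _addr) = listener.accept().await?;
        // The spawned task "runs free": it is scheduled by tokio and
        // outlives this loop iteration (and the caller, if need be).
        tokio::spawn(handle_connection(stream));
    }
}

async fn handle_connection(stream: TcpStream) {
    // Per-connection processing (reads, writes, peer handling) lives here.
    let _ = stream;
}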

// If we get disconnected via SocketDescriptor::disconnect_socket(), we don't call
// disconnect_event(), but if we get an Err return value out of PeerManager, in general, we do.
// We track here whether we'll need to call disconnect_event() after the socket closes.
need_disconnect_event: bool,
Contributor

Wish there was a way to get rid of this variable, feels a bit hacky.

I kept having to re-reference what it means, so maybe rename more explicitly to something like "call_disconnect_event_on_shutdown"?

Collaborator Author

Sure, renamed. Honestly after all the back and forth on it and bugs and whatnot I think I agree - we should just always require one, though the handling inside PeerHandler will likely get a bunch more complicated, so I don't know if it makes sense to run forward and do it ASAP.

Contributor

Doesn't seem to be renamed? Not a huge deal tho.

we should just always require one, though the handling inside PeerHandler will likely get a bunch more complicated, so I don't know if it makes sense to run forward and do it ASAP.

Cool, might file an issue for that then!

Contributor

@jkczyz jkczyz left a comment

Err, not received, only that "something has happened, this something may or may not have caused some of the aforementioned events to have been created".

For the case where the "something" is not one of the aforementioned events, what's the use of notifying? Could you give me an example to help understand?

Hmm, would be somewhat awkward, IMO. We handle the () here: https://github.com/TheBlueMatt/rust-lightning-bitcoinrpc/blob/master/src/main.rs#L123 and we write to it in a number of other places (not just the tokio net stuff) including eg when the user asks us to send a payment.

Could you add the write use case to the documentation? AFAICT, this usage is not mentioned anywhere. Specifically, it seems like the user is required to notify whenever they use the ChannelManager in some manner.

This is an excellent opportunity to add (small) tests demonstrating each behavior of the library.

Comment on lines 5 to 7
//! TcpStream and a reference to a PeerManager and the rest is handled", with the exception of the
//! act of reading bytes from the remote peer possibly triggering the need to handle some
//! [Events](../lightning/util/events/enum.Event.html).
Contributor

The exception here seems less of an exception and more of the primary way of interacting with the API. I would recommend rewording this to indicate as such.

Collaborator Author

Hmm, that was meant as "the only thing that isn't as trivial is Event handling". Hope it's clearer now.

@TheBlueMatt
Collaborator Author

For the case where the "something" is not one of the aforementioned events, what's the use of notifying? Could you give me an example to help understand?

Right, sorry, from lightning_net_tokio it's always either bytes-read-from-socket or socket-disconnected, though in theory it could be others (and is used generically across the rust-lightning-bitcoinrpc sample app).

Could you add the write use case to the documentation? AFAICT, this usage is not mentioned anywhere. Specifically, it seems like the user is required to notify whenever they use the ChannelManager in some manner.

Hmm, I think maybe I miscommunicated a bit here - within rust-lightning-bitcoinrpc we use the same Sender as we pass into lightning-net-tokio to handle things like "we've sent money, we should go poke the PeerHandler to see if it wants to send any data" (which of course ends up calling send_data within the SocketDescriptor), but that is somewhat unrelated to its use in lightning_net_tokio.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 66e6066 to 6f5731d Compare February 25, 2020 03:13
Contributor

@valentinewallace valentinewallace left a comment

I'd second Jeff's comment:

This is an excellent opportunity to add (small) tests demonstrating each behavior of the library.

But if that comment were addressed, then even though I wouldn't say I'm thrilled with the parsability (mostly w/r/t some of the weird bools like need_disconnect_event, disconnect_block, and disconnect), I'd be okay with revisiting and trying to simplify later.

// Ignore full errors as we just need them to poll after this point, so if the user
// hasn't received the last send yet, it doesn't matter.
assert!(e.is_full());
// Whenever we want to block on reading or waiting for reading to resume, we have to
Contributor

nit: s/or waiting/or are waiting

Collaborator Author

I don't think so - "we want to block on reading or block on waiting" is the de-sugaring.

@TheBlueMatt TheBlueMatt removed this from the 0.0.10 milestone Feb 26, 2020
@TheBlueMatt
Collaborator Author

I did find one more thing to test - running without a many-threaded environment has different behavior the first time you go to write bytes. This is a neat test in that it also exercises our write-blocking and RawWakerVTable logic.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 6edd4ca to 2e0ea18 Compare February 28, 2020 19:21
Comment on lines 164 to 192
tokio::select! {
v = write_avail_receiver.recv() => select_write_ev!(v),
read = reader.read(&mut buf) => match read {
Ok(0) => {
println!("Connection closed");
break;
},
Ok(len) => {
if let Some(blocker) = {
let mut lock = us.lock().unwrap();
if lock.disconnect_state == DisconnectionState::RLTriggeredDisconnect {
shutdown_socket!("disconnect_socket() call from RL");
}
if lock.read_paused {
let (sender, blocker) = oneshot::channel();
lock.read_blocker = Some(sender);
Some(blocker)
} else { None }
} {
tokio::select! {
res = blocker => {
res.unwrap(); // We should never drop the sender without sending () into it!
if us.lock().unwrap().disconnect_state == DisconnectionState::RLTriggeredDisconnect {
shutdown_socket!("disconnect_socket() call from RL");
}
},
v = write_avail_receiver.recv() => select_write_ev!(v),
}
}
Contributor

The logic around paused reads with the nested select! is a bit difficult to follow. I wonder if this could be simplified a bit using a precondition on the read branch.

let read_paused = us.lock().unwrap().read_paused;
tokio::select! {
    v = write_avail_receiver.recv() => {
        // ...
    },
    read = reader.read(&mut buf), if !read_paused => match read {
        // ...
    },
}

Then I think this would eliminate the need for the blocker and thus the nested select!. Instead, we would write until send_data indicates to resume reading, in which case the read branch would be reenabled.

Collaborator Author

Hmm, I'm not convinced that select!{} definitely won't call the read() future if !read_paused (instead of blocking on the future and never taking that branch). I'd buy that it likely won't, but there's sadly way too little documentation on it.

Collaborator Author

In reading the documentation, I realize my comment was definitely not right, as the docs explicitly call out this case and say that the future won't be polled. That said, I don't think it eliminates the need for the blocker - we still need a way to wake up the reader in case we got a write with resume_read set.

I initially tried to rewrite it with just one Sender which could wake the read task, but sadly we need to make sure that if we wake it due to a write_avail event, we always call write_buff_space_avail. This would mean either the channel has an unlimited depth and we scan it for any write_avail events, or we add yet another bool to indicate that write_avail needs to be called. I find both worse than keeping the current model of just having two Senders, but I removed the Option<> and simplified the logic a ton.
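For readers following along, the "blocker" mechanism under discussion has roughly this shape (a sketch only; field and function names are illustrative, not the PR's exact code):

use std::sync::{Arc, Mutex};
use tokio::sync::oneshot;

struct ConnState {
    read_paused: bool,
    read_blocker: Option<oneshot::Sender<()>>,
}

async fn maybe_wait_for_resume(us: &Arc<Mutex<ConnState>>) {
    let blocker = {
        let mut lock = us.lock().unwrap();
        if lock.read_paused {
            let (sender, receiver) = oneshot::channel();
            lock.read_blocker = Some(sender);
            Some(receiver)
        } else { None }
    };
    if let Some(receiver) = blocker {
        // Parked here until resume_read() below fires.
        let _ = receiver.await;
    }
}

// Called from the send_data() path when resume_read is set:
fn resume_read(us: &Arc<Mutex<ConnState>>) {
    let mut lock = us.lock().unwrap();
    lock.read_paused = false;
    if let Some(sender) = lock.read_blocker.take() {
        let _ = sender.send(());
    }
}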

Contributor

Didn't notice your reply before commenting, so deleted it. Looks much simpler now!

Contributor

Ah, with regard to the blocker, isn't writing the only way to resume reading? That is, we can't resume reading until send_data is called with resume_read set true. That was why I thought the blocker was not needed. I haven't fully reviewed the send_data implementation, so I could be missing something.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 2e0ea18 to 829e176 Compare March 1, 2020 04:21
@TheBlueMatt
Collaborator Author

I saw the issue in tokio 2213 again, but I think now I realize they can't possibly fix it as it would break other things (hence, NOTABUG, I guess) - if you have pending writes frozen and then you go to write again, it doesn't make sense for the new write to complete before the pending write(s) get woken up to write their data first. Thus, it becomes a race between the select() thread waking up the pending writes and the new writers making it into some checks (unless they cache the writeable state, which also kinda probably sucks). Anyway, it turns out it's in fact not only easy to work around, but the workaround makes a few things probably cheaper than they were to begin with, so no harm in keeping it anyway.

Comment on lines 136 to 159
macro_rules! prepare_read_write_call {
() => { {
let mut us_lock = us.lock().unwrap();
if us_lock.disconnect_state == DisconnectionState::RLTriggeredDisconnect {
shutdown_socket!("disconnect_socket() call from RL");
}
},
Err(e) => {
us_ref.lock().unwrap().need_disconnect = false;
return future::Either::B(future::result(Err(std::io::Error::new(std::io::ErrorKind::InvalidData, e))));
}
us_lock.block_disconnect_socket = true;
} }
}
Contributor

Rather than using a macro and needing to set block_disconnect_socket false in multiple places (because of breaks), could you make a simple RAII struct (DisconnectBlocker?) for managing this state?

Then prepare_read_write_call could be renamed to something less ambiguous like check_for_shutdown, if still needed (see comment on disconnect check).
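A hypothetical sketch of the RAII idea being suggested here (not adopted as-is in the PR; as the reply below notes, the disconnect-state check needs to happen under the same lock):

use std::sync::{Arc, Mutex};

struct ConnState {
    block_disconnect_socket: bool,
}

struct DisconnectBlocker {
    us: Arc<Mutex<ConnState>>,
}

impl DisconnectBlocker {
    fn new(us: Arc<Mutex<ConnState>>) -> Self {
        us.lock().unwrap().block_disconnect_socket = true;
        DisconnectBlocker { us }
    }
}

impl Drop for DisconnectBlocker {
    fn drop(&mut self) {
        // Cleared automatically on every exit path, including early breaks.
        self.us.lock().unwrap().block_disconnect_socket = false;
    }
}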

Collaborator Author

Using an RAII struct seems hard, see below, as we need to check the disconnect state in the same lock as setting block_disconnect.

Contributor

Could block_disconnect_socket just be a lock? It seems to be functioning as one. Then you would get the RAII behavior for free.

Collaborator Author

Hmm, this seemed like a great idea, but once I tried to start writing it I realized we'd need to be able to hold a reference to that lock without the outer lock (as we can't hold the outer lock while calling into write_buffer_space_avail), which would imply a second Arc which seems super wasteful. If we ever end up with CPU issues on the spinlock I'd be somewhat surprised, so for now I'd say leave it.

Contributor

Will defer to you on performance. From a code complexity perspective, if not using a mutex I'd then recommend:

  • moving us_lock.block_disconnect_socket = true; out of prepare_read_write_call to avoid obfuscating the connection with us_lock.block_disconnect_socket = false;
  • rename prepare_read_write_call to something more meaningful like shutdown_socket_if_requested

Comment on lines +439 to +451
// Happy-path return:
if !us.block_disconnect_socket { return; }
Contributor

Isn't this handled by the following loop?

Collaborator Author

Not without an extra lock().
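To spell out the happy path under discussion, here is a rough, hypothetical sketch of the disconnect_socket() shape (field names are placeholders, not the PR's exact code): the first lock both records the disconnect request and checks the flag, so the spin loop with its repeated lock() calls is only entered when a read/write call into Rust-Lightning is actually in flight.

use std::sync::{Arc, Mutex};

struct ConnState {
    rl_requested_disconnect: bool, // hypothetical name
    block_disconnect_socket: bool,
}

fn disconnect_socket(us: &Arc<Mutex<ConnState>>) {
    {
        let mut lock = us.lock().unwrap();
        lock.rl_requested_disconnect = true;
        // Happy path: no read_event()/write_buffer_space_avail() call is in
        // flight, so we can return without ever spinning.
        if !lock.block_disconnect_socket { return; }
    }
    // Busy-wait until the read task finishes its current call into
    // Rust-Lightning, so no such call can happen after we return.
    loop {
        let lock = us.lock().unwrap();
        if !lock.block_disconnect_socket { return; }
    }
}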

Comment on lines 139 to 156
if us_lock.disconnect_state == DisconnectionState::RLTriggeredDisconnect {
shutdown_socket!("disconnect_socket() call from RL");
}
Contributor

I'm not entirely in love with how disconnect_state is used. It seems to have two purposes:

  1. signaling that SocketDescriptor::disconnect_socket was called
  2. signaling a manner in which the select loop has exited

Could (1) be communicated via a channel? Then a branch could be added to the select! for handling disconnection. Or would this be problematic if another branch is chosen even if a disconnect has been signaled?

Maybe a channel would be too much, but just wanted to note my concern. See also my other comment on breaking from the loop.

Collaborator Author

The issue is that we have to check disconnect_state in the same lock where we set block_disconnect_socket to meet the requirement that "after disconnect_socket() returns, no future calls into read_event or write_buffer_space_avail are made". I'm not sure how to accomplish that with a channel. I could go back to having two bools - one to indicate disconnection should occur and one to indicate that we need to call socket_disconnected. You do, however, point out a simplification here in that we can keep the need_disconnect_event bool in the schedule_read stack.

Comment on lines 129 to 148
macro_rules! shutdown_socket {
($err: expr) => { {
println!("Disconnecting peer due to {}!", $err);
break;
} }
Contributor

@jkczyz jkczyz Mar 2, 2020

It's strange using a macro for this but not for the cases where the connection has been closed based on the result of reader.read(). I understand in the shutdown_socket case it is to reduce code duplication, but I don't like the inconsistency in how the loop break is obscured behind a macro in some cases but not others.

Perhaps a cleaner way to organize the code would be to encapsulate tokio::select! in a method that returned an enum indicating the possible outcomes. Something like Result<(), DisconnectReason>.

enum DisconnectReason {
    ConnectionClosed(String),
    PeerDisconnected(String),
}

Then you can have a straight-line match to handle the possible cases explicitly, which I think would be more readable. You may not even need disconnect_state or at least could reduce its purpose to signaling from SocketDescriptor::disconnect_socket.
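A hypothetical sketch of the straight-line handling being proposed, where next_io_event() is a stand-in for the select!-wrapping method:

enum DisconnectReason {
    ConnectionClosed(String),
    PeerDisconnected(String),
}

// Stand-in for the proposed method wrapping the tokio::select! body.
async fn next_io_event() -> Result<(), DisconnectReason> {
    Ok(())
}

async fn read_loop() {
    loop {
        match next_io_event().await {
            Ok(()) => {},
            Err(DisconnectReason::ConnectionClosed(err)) => {
                println!("Connection closed: {}", err);
                break;
            },
            Err(DisconnectReason::PeerDisconnected(err)) => {
                println!("Disconnecting peer due to {}!", err);
                break;
            },
        }
    }
}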

Collaborator Author

Let me know what you think now. I agree it should at least all use the macro, so I went ahead and did that. Further, if you try to break; the loop directly now you'll get an uninit'd variable (as the need_disconnect_event field is now on the stack). I can make it a separate function but I'm not a huge fan of an fn for the read loop and then cleanup going into a different one. A different thing we could do is give the loop a return value (yes, Rust is crazy, you can do that) so that you have to provide the need_disconnect_event field in the break; call.

Contributor

Definitely looking better modulo my comment on block_disconnect_socket.

I kinda prefer if the loop returned a variable, actually. Additionally, I think the uses of shutdown_socket would read better with an enum reason similar to the one given in my earlier comment, only without the string. Otherwise, it's not clear what the true and false literals represent without looking at the definition.

Collaborator Author

Ok, I made the loop return a local enum, though I fear we've gotten into diminishing returns at this point.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch 7 times, most recently from 96f70e9 to 02d4c8a Compare March 4, 2020 18:57
@codecov

codecov bot commented Mar 4, 2020

Codecov Report

Merging #472 into master will increase coverage by 0.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #472      +/-   ##
==========================================
+ Coverage   90.07%   90.13%   +0.05%     
==========================================
  Files          34       34              
  Lines       19053    19052       -1     
==========================================
+ Hits        17162    17172      +10     
+ Misses       1891     1880      -11
Impacted Files Coverage Δ
lightning/src/ln/peer_handler.rs 48.7% <100%> (ø) ⬆️
lightning/src/ln/chanmon_update_fail_tests.rs 97.32% <0%> (-0.01%) ⬇️
lightning/src/ln/functional_tests.rs 96.52% <0%> (+0.23%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83c9eb4...174ccf5. Read the comment docs.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 02d4c8a to 87f7ad5 Compare March 5, 2020 03:04
Comment on lines 130 to 131
// If Rust-Lightning tells us to disconnect, we don't, then, call socket_disconnect() after
// closing the socket, but if the socket closes on its own (ie our counterparty does it), we do.
Contributor

I'm extremely confused by this documentation. I'm trying to correspond it to the enum variants and how they are used in the code but am failing.

If Rust-Lightning tells us to disconnect, we don't, then, call socket_disconnect() after closing the socket

Is this saying we don't disconnect? But then says we close the socket. Isn't closing the socket disconnecting? What is meant by "disconnect" here?

Does this correspond to the PeerDisconnected variant? Seems odd to have this sentence first when that variant is second. I'd recommend just putting the appropriate documentation on each variant. Then have the enum documentation itself say what the enum is used for not how it is used.

but if the socket closes on its own (ie our counterparty does it), we do.

Does this correspond to the CloseConnection variant? Isn't that when read returns 0 or an error? But in those cases, PeerDisconnected is used in the code.

I also read it as the imperative "close connection" which would make sense for "we do" if I understood what that is referring to. :P

Collaborator Author

Comment better now? I reworded it to include the enum in the text so that hopefully it's clearer. I think we're way past the point where this should be merged and we can do more cleanups in follow-ons if there's still confusion.

Contributor

Comment better now? I reworded it to include the enum in the text so that hopefully it's clearer.

Ok, after a lot of staring and conferring with Val, the confusing part is ", then,". When reading this:

If Rust-Lightning tells us to disconnect, we don't, then, call socket_disconnect() after closing the socket

I took this as, when Rust-Lighting tells us to disconnect:

  • We don't disconnect
  • Then we call socket_disconnected()
  • But only after closing the socket

Which was confusing because it sounded like you were saying we don't disconnect but we close the socket.

However, I can now see the ", then," is a manner of speaking (in particular, your manner of speaking) which IMHO does not come across well when reading it. Given that I now recognize this (both from experience over chat and in person) but still couldn't parse the comment correctly, it indicates others may have the same problem.

This is why I tend to request very succinct writing, cutting out unnecessary verbiage; it avoids needless confusion.

In this case, regardless of rewording, I think that instead of repeating how the code works the documentation should say what the enum is used for. FWIW, I'd recommend something like this:

// The manner in which the socket was disconnected.
enum Disconnect {
    // Indicates Rust Lightning disconnected the socket as a result of a PeerHandleError.
    CloseConnection,
    // Indicates the peer disconnected the socket and thus Rust Lightning should be notified via socket_disconnected().
    PeerDisconnected,
}

Irrespective of whether the above wording is accurate (I'm not saying it is), writing a comment like this may clarify the language as there may be ambiguity between "close", "disconnect", "socket", and "connection". Which could also result in more accurate enum variant naming.

I'd also like to use this opportunity to make a request regarding code reviews. When I'm generally confused by something and ask a question for clarification, I'm not necessarily requesting that a change be made. I'm really just looking for an answer to the question. And from that I may suggest a specific change that I think would help remove confusion.

This is particularly the case with regards to comments. That is much more efficient than rewriting something and asking, "How about now?"

I think we're way past the point where this should be merged and we can do more cleanups in follow-ons if there's still confusion.

First off, thank you for reaching out to me on Slack saying to not take this the wrong way. I genuinely appreciate you doing that.

I just want to note that reviewers have an important role to play in determining whether a change is merged. And any confusion should err on the side of the reviewers given they are looking at the code largely from the perspective of a reader rather than the writer.

I do agree that schedule_read looks to be in pretty good shape. That said, it is only part of the code under review. I did leave a comment about send_data asking for guidance in reviewing it. I would like to review that before merging so would appreciate a follow up there.

Collaborator Author

When I'm generally confused by something and ask a question for clarification, I'm not necessarily requesting that a change be made

Hmm, fair, ok. I generally err on the side of "answer by updating the comment" as it usually implies less likelihood of people being confused down the road, since someone else is likely to have the same question.

As for the review process in general, I agree, only noting that we don't have the resources to block things for as long as we are. That's probably an indication of a problem different from reviewers taking too long (documentation not up to par, for example), of course. In this specific case, probably the documentation on PeerHandler itself needs to be the focus of the improvements, though we can do that in a separate PR.

@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 87f7ad5 to ca3c8d9 Compare March 5, 2020 23:28
Comment on lines 130 to 133
// If Rust-Lightning tells us to disconnect (ie Disconnect::CloseConnection), we don't, then,
// call socket_disconnected() (informing Rust-Lightning that we're disconnecting) after
// closing the socket, but if the socket closes on its own (ie our counterparty closes the
// connection and we set disconnect_type to PeerDisconnected), we do.
Contributor

@valentinewallace valentinewallace Mar 6, 2020

This confused me at first too; I suggest this phrasing --

If Rust-Lightning tells us to disconnect (i.e. through Disconnect::CloseConnection), then we close the socket but do not call PeerManager's socket_disconnected() (which informs Rust-Lightning that we're disconnecting). But, if the socket closes on its own, (i.e. if our counterparty closes the connection, causing us to set disconnect_type to PeerDisconnected), we do call PeerManager's socket_disconnected().

Collaborator Author

I took Jeff's suggested wording by mixing it into the enum. Is that clearer now?

Contributor

Great wording now!

It looks like we don't currently use the Vec as a Vec, and can
happily take a slice, which makes things easier on the calling
side.
@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from ca3c8d9 to 3f5821b Compare March 10, 2020 16:34
Contributor

@jkczyz jkczyz left a comment

Left a few comments around clarity in the implementation comments. Otherwise, LGTM.


Comment on lines +599 to +601
tokio::time::timeout(Duration::from_secs(10), a_connected.recv()).await.unwrap();
tokio::time::timeout(Duration::from_secs(1), b_connected.recv()).await.unwrap();
Contributor

Any reason the timeouts are different for these? Same below.

Collaborator Author

Because by the time the connection has done the handshake and disconnected from one end, the other end should be essentially already disconnected.

This is a rather major rewrite, using async/await and tokio 0.2,
which cleans up the code a ton as well as adds significantly to
readability.
If rust-lightning tells us to disconnect a socket after we read
some bytes from the socket, but before we actually give those bytes
to rust-lightning, we may end up calling rust-lightning with a
Descriptor that isn't registered anymore.

Sadly, there really isn't a good way to solve this, and it should
be a pretty quick event, so we just busy-wait.
@TheBlueMatt TheBlueMatt force-pushed the 2020-01-net-async-await branch from 3f5821b to 174ccf5 Compare March 11, 2020 16:20
@TheBlueMatt
Collaborator Author

Happy to take any followup comments in more PRs, so feel free to continue review if anyone isn't happy with where this is, but Jeff's ACK is good enough for me.

@TheBlueMatt TheBlueMatt merged commit d27e9e1 into lightningdevkit:master Mar 11, 2020