-
Notifications
You must be signed in to change notification settings - Fork 302
feat(iroh): add initial scaffolding for WebRTC support #3440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(iroh): add initial scaffolding for WebRTC support #3440
Conversation
We're discussing this in the iroh discord: https://discord.com/channels/1161119546170687619/1406897550140772393 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a quick scroll through and noticed a bunch of println
s and commented out code still. Could you add a note on what the stated of completion of the PR is, what the first round of review should pay attention to?
This PR is still in progress - I’ll clean up the println! statements and remove the commented-out code before finalizing. For the first round of review, I’d like you to mainly focus on whether I am correctly sending the WebRTC offer/ICE candidate from one node to another. Current flow Node A is spawned. Node B connects. Node B uses discovery to send a ping . I refactored the message-sending logic so that send_ping now also carries WebRTC ICE candidate information. Points where I need guidance
For now you can ignore webrtc actor (peer connection state ) as it needs to be changed |
pub direct_addresses: BTreeSet<SocketAddr>, | ||
/// Static Webrtc connection information for the node | ||
pub webrtc_info: Option<WebRtcInfo>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding:
NodeAddr is used by the discovery system to locate nodes in the network. For discovery purposes, we only need the node_id to find a node. Once found, WebRTC connections are established through the standard offer/answer exchange process.
My concern:
Since WebRTC connections require dynamic offer/answer negotiation anyway, is it necessary to store static WebRTC information like channel_id and webrtc_info (certificate fingerprints) here?
/// convert `&str` and `String` into `UserData` easily. | ||
#[derive(Debug, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)] | ||
pub struct UserData(String); | ||
pub struct UserData(pub String); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to revert it back. It was a quick fix for compilation erros. #3250
/// | ||
/// The fallback/bootstrap path, if non-zero (non-zero for well-behaved clients). | ||
relay_url: Option<(RelayUrl, PathState)>, | ||
webrtc_channel: Option<(ChannelId, PathState)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is incorrect. I need more intuition on what node state , what to put it here
pub(super) struct NodeMapInner { | ||
by_node_key: HashMap<NodeId, usize>, | ||
by_ip_port: HashMap<IpPort, usize>, | ||
by_webrtc_port: HashMap<WebRtcPort, usize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also need guidance on this part. If this is correct or needs changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can ignore this file for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a comprehensive review at all, I'm just trying to orient myself a bit and start build up a mental picture of how WebRTC things work and how you integrated it.
/// the specific channel/connection to that node). | ||
/// | ||
/// This is particularly useful when: | ||
/// - A node needs to maintain multiple WebRTC connections to the same peer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would we need to maintain multiple WebRTC connections between the same two peers? My mental model was that a single WebRTC connection would transport QUIC datagrams, which can be for any number of QUIC connections. So I wasn't expecting to ever need more than one WebRTC connection between two nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. I will fix this
/// to the same peer node. It's a 16-bit unsigned integer, allowing for up to 65,536 unique | ||
/// channels per node pair. | ||
/// | ||
/// The channel ID space is managed by the WebRTC implementation and should be: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If managed by the WebRTC implementation why does the example show creating this from a static number? That's confusing to me. Or is it trying to show something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments need to be corrected. The main idea is that a port should be uniquely identified by both channel_id and node_id
relay_url: None, | ||
channel_id: None, | ||
direct_addresses: Default::default(), | ||
webrtc_info: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these two separate fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be fixed.
I think we do not need webrtc info and channel id here
n0-watcher = {path = "../../n0-watcher"} | ||
nested_enum_utils = "0.2.1" | ||
netwatch = { version = "0.8" } | ||
netwatch = { path = "../../net-tools/netwatch" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a corresponding net-tools PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, to run locally there were some dependency issue
let Some(msg) = msg else { | ||
trace!("tick: magicsock receiver closed"); | ||
self.msock.metrics.magicsock.actor_tick_other.inc(); | ||
_ = shutdown_token.cancelled() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please leave the indentation in place. Formatting in macro's is a bit pants, but let's keep it as close as we can to normal formatting. And also try and keep this diff as readable as possible, so we can see just the new branches of the select.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
||
let data = WebRtcData { | ||
channel_id: channel_id.clone(), | ||
delivery_mode: WebRtcDeliveryMode::Reliable, // TODO: Make configurable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Datagrams are unreliable. Should we not be sending these all as Unreliable? By sending QUIC datagrams inside a reliable channel we have both QUIC and WebRTC doing buffering and re-ordering. This is generally bad for congestion controllers who will start to misbehave. Yes, the relay channel has this problem too - but here we have the option of doing something better it seems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, in WebRTC with a live P2P connection, poll_recv just returns whatever data arrives on the channel. In the actor, that channel forwards the data straight to the msock layer, and then it shows up in process_datagram.
I think there is no quic ordering happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m also a bit confused about how the WebRTC handle should be exposed to the user ,once I have a clearer picture of that, I think I’ll better understand where this QUIC reordering happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no quic ordering happening
The general model of iroh (for the purpose of this discussion) is that it sends QUIC datagrams over several unreliable transports. Primarily this is UDP, the transport QUIC was originally designed for. So we want to keep our transports as close to the behaviour of UDP as possible.
The user gets a QUIC API to create streams and send user data. The QUIC stack manages all things network when it builds QUIC packets and puts them in datagrams: it handles the rate at which to send them, re-sending if needed, buffering if out of order etc. So all we want from WebRTC is to give us an unreliable datagram transport to put QUIC datagrams on.
So the user never gets to see or interact with WebRTC. It's a transport iroh uses under the hood. A user should be entirely unaware of WebRTC (ideally, they will probably be aware that on native we have to bind a socket for it as I understand). The only thing it enables is that iroh can establish direct QUIC connections when at least one of the nodes is in a web browser.
pub(crate) enum WebRtcActorMessage { | ||
CreateOffer { | ||
local_node: PublicKey, | ||
peer_node: PublicKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the NodeId
type alias when appropriate everywhere.
The docs of NodeId
should give you guidance: https://docs.rs/iroh/latest/iroh/type.NodeId.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
Hi team,
I’ve added the basic struct for WebRTC support to get things started.
Kindly guide me if this is a good direction or if there’s anything I should take care of early on.
Also, what signaling server setup would you recommend for Iroh’s WebRTC integration?
Thanks!