Gateway fails to suggest peer connections due to race condition in join operation #1863

@sanity

Problem Description

The gateway fails to forward connect petitions to joining peers because it checks num_connections() == 0 before the joiner's connection has been added to the ring. This prevents peer-to-peer connections from forming in fresh networks, leaving all peers connected only to the gateway in a star topology.

Steps to Reproduce

  1. Start a fresh gateway on v0.1.28
  2. Start 10 peers connecting to the gateway
  3. Wait for all peers to connect
  4. Query network topology

Expected Behavior

Peers should receive suggestions from the gateway to connect to other peers, forming a mesh topology.

Actual Behavior

All peers remain connected only to the gateway. Gateway logs show:

WARN freenet::operations::connect: Couldn't forward connect petition, not enough connections, tx: 01K64C77APXWZKWQ520588WQR0, joiner: v6MWKgqHaDEbNQSL

Root Cause Analysis

In crates/core/src/operations/connect.rs:1028-1034, the gateway checks if it has any connections before forwarding a connect petition:

if connection_manager.num_connections() == 0 {
    tracing::warn!(
        tx = %id,
        joiner = %joiner.peer,
        "Couldn't forward connect petition, not enough connections",
    );
    return Ok(None);
}

However, this check happens during the join operation before the joiner's connection is added to the gateway's ring (which happens in crates/core/src/ring/mod.rs:220). The sequence is:

  1. Peer initiates connection to gateway
  2. Gateway receives join request
  3. Gateway attempts to forward connect petition
  4. Gateway checks num_connections() == 0 (fails here: the ring is still empty, so the petition is not forwarded)
  5. (Later) Connection gets added to the ring (logged as "Adding connection to peer")
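
The ordering above can be illustrated with a minimal sketch. This is a toy model of a single join, not the freenet-core code: ConnectionManager, forward_connect_petition, and handle_join_buggy_order are hypothetical names, and only the num_connections() == 0 check is taken from the snippet quoted earlier.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the gateway's connection manager.
/// The real type in freenet-core has a different shape.
#[derive(Default)]
struct ConnectionManager {
    peers: HashSet<String>,
}

impl ConnectionManager {
    fn num_connections(&self) -> usize {
        self.peers.len()
    }

    fn add_connection(&mut self, peer: &str) {
        self.peers.insert(peer.to_string());
    }
}

/// Mirrors the early return from the issue: with zero ring connections,
/// nothing is forwarded and the caller gets None.
fn forward_connect_petition(cm: &ConnectionManager, joiner: &str) -> Option<Vec<String>> {
    if cm.num_connections() == 0 {
        return None;
    }
    // Candidate peers are every known connection except the joiner itself.
    let mut candidates: Vec<String> = cm
        .peers
        .iter()
        .filter(|p| p.as_str() != joiner)
        .cloned()
        .collect();
    candidates.sort();
    Some(candidates)
}

/// Handles one join in the order described above:
/// forward first (steps 3-4), register the connection afterwards (step 5).
fn handle_join_buggy_order(cm: &mut ConnectionManager, joiner: &str) -> Option<Vec<String>> {
    let forwarded = forward_connect_petition(cm, joiner);
    cm.add_connection(joiner);
    forwarded
}
```

Calling handle_join_buggy_order on a fresh ConnectionManager returns None, even though the joiner is registered by the time the call returns. The check simply ran too early.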

Evidence

Gateway logs show peer connections being created but connection suggestions failing:

INFO freenet::transport::peer_connection: PeerConnection created with persistent keep-alive task, remote: 5.9.111.215:40005
INFO freenet_core::transport::keepalive_lifecycle: Keep-alive task STARTED for connection, remote: 5.9.111.215:40005
WARN freenet::operations::connect: Couldn't forward connect petition, not enough connections, tx: 01K64C77APXWZKWQ520588WQR0, joiner: v6MWKgqHaDEbNQSL

Notably absent: "Adding connection to peer" messages in the gateway logs, which appear in peer logs when they successfully join.

Suggested Fixes

Several approaches could resolve this:

  1. Change the threshold: Check num_connections() <= 1 to exclude only the joiner itself
  2. Reorder operations: Add the connection to the ring before attempting to forward connect petitions
  3. Async coordination: Wait for the connection to be added before forwarding, or use a future/promise pattern
  4. Fallback logic: If no existing connections, return the gateway itself as a connection suggestion
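
As a rough sketch of option 2 (reorder operations), assuming a simplified in-memory ring: Ring, candidates_for, and handle_join_reordered are hypothetical names chosen for illustration and do not correspond to the actual freenet-core API. The joiner is registered first, and suggestions are then computed from every other known peer.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified ring: just an ordered set of peer identifiers.
#[derive(Default)]
struct Ring {
    peers: BTreeSet<String>,
}

impl Ring {
    fn add_connection(&mut self, peer: &str) {
        self.peers.insert(peer.to_owned());
    }

    /// Every known peer except the joiner, in sorted order.
    fn candidates_for(&self, joiner: &str) -> Vec<String> {
        self.peers
            .iter()
            .filter(|p| p.as_str() != joiner)
            .cloned()
            .collect()
    }
}

/// Registers the joiner before computing suggestions, so the emptiness
/// check can no longer race with the ring update.
fn handle_join_reordered(ring: &mut Ring, joiner: &str) -> Vec<String> {
    ring.add_connection(joiner);
    ring.candidates_for(joiner)
}
```

In this model the first joiner still receives an empty suggestion list, since no other peer exists yet, but the second joiner is offered the first. Whether reordering is safe in the real join operation depends on details of the handshake that this sketch does not capture.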

Environment

  • Freenet version: v0.1.28
  • Rust version: rustc 1.82.0 (f6e511eec 2024-10-15)
  • OS: Linux
  • Configuration: Fresh network with 1 gateway + 10 peers
  • All peers connecting to gateway at 5.9.111.215:31337

Impact

This bug effectively prevents peer mesh formation in fresh Freenet networks, forcing a star topology where all traffic must flow through the gateway. This significantly impacts:

  • Network resilience (single point of failure)
  • Gateway load (all traffic concentrated)
  • Peer discovery and connectivity
  • Network scalability

Additional Context

This was discovered during v0.1.28 release validation using network visualization tools showing all peer connections terminating at the gateway with zero peer-to-peer connections.

[AI-assisted debugging and comment]
