Skip to content

Fix incorrect TCP connections. #12259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bosilca
Copy link
Member

@bosilca bosilca commented Jan 21, 2024

If nodes have the same IP addresses (for containers or other purposes) and these addresses get published as part of the modex, a remote peer might try to use one of the addresses to connect. As both nodes have the same IP, there are several cases:

  • the "remote" port is not used by an OMPI process locally, the connection is refused or it timeouts. This is the "nicest" outcome, as a new IP will be used resulting in a successful connection and the continuation of the application.
  • the "remote" port is used by another OMPI process on the local node. A connection will be established but the incorrect guid will be exchanged leading to complaints, connection dropped and/or deadlocks.
  • the "remote" port is used by this process, basically resulting in a connection-to-self. Bad things happen, as we don't support TCP connections to self. Some output messages are generated, but the outcome is most likely a deadlock.

Up to now, users were expected to exclude such interfaces from the accepted interfaces, but this patch removes this need. If we discover a local IP as part of the IP list of a remote peer, we drop it and never try to use it. This does not apply to local processes, so we can still use these interfaces for node level communications (which will work as we will connect to the correct port according to the destination process).

@bosilca
Copy link
Member Author

bosilca commented Jan 21, 2024

This patch improves #12232 but does not solve it.

@bosilca
Copy link
Member Author

bosilca commented Mar 5, 2024

Can I get a review on this please.

Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, had a pending review that I never submitted...

Comment on lines +1507 to +1509
opal_show_help("help-mpi-btl-tcp.txt", "server cannot accept connection from self", true,
peer, OPAL_NAME_PRINT(guid),
opal_process_info.nodename, getpid());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a recoverable case so do we always want to print this message?

We print a message for this when we handle the connection: https://github.com/open-mpi/ompi/pull/12259/files#diff-cc4089aecdcf0cb98dd15472b23f1a3f2af00357a180f986cdba0c89738472dcR462

Copy link
Member Author

@bosilca bosilca Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can handle with this patch, but it is a misconfiguration and the user should fix in future runs. The other side is a verbose output, it will only show when the BTL verbosity is set to 20.

Comment on lines 794 to 801
char tmp[2][16];
inet_ntop(AF_INET, &((struct sockaddr_in *)&btl_endpoint->endpoint_btl->tcp_ifaddr)->sin_addr, tmp[0], 16);
inet_ntop(AF_INET, &((struct sockaddr_in *)&endpoint_addr)->sin_addr, tmp[1], 16);
opal_output(0, "proc %s bind socket to %s:%d before connecting to peer %s at %s:%d\n",
OPAL_NAME_PRINT(OPAL_PROC_MY_NAME),
tmp[0], htons(((struct sockaddr_in *) &btl_endpoint->endpoint_btl->tcp_ifaddr)->sin_port),
OPAL_NAME_PRINT(btl_endpoint->endpoint_proc->proc_opal->proc_name),
tmp[1], ntohs(((struct sockaddr_in *) &endpoint_addr)->sin_port));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably lower the verbosity? (assuming 0 prints always?)

#if OPAL_ENABLE_IPV6
else if (MCA_BTL_TCP_AF_INET6 == remote_addrs[i].addr_family &&
AF_INET6 == local_iter->af_family) {
if (!memcmp(&((struct sockaddr_in6 *) &local_iter->if_addr)->sin6_addr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

If nodes have the same IP addresses (for containers or other purposes) and
these addresses get published as part of the modex, a remote peer might try to
use one of the addresses to connect. As both nodes have the same IP, there are
several cases:
- the "remote" port is not used by an OMPI process locally, the connection is
  refused or it timeouts. This is the "nicest" outcome, as a new IP will be
  used resulting in a successful connection and the continuation of the
  application.
- the "remote" port is used by another OMPI process on the local node. A
  connection will be established but the incorrect guid will be exchanged
  leading to complaints, connection dropped and/or deadlocks.
- the "remote" port is used by this process, basically resulting in a
  connection-to-self. Bad things happen, as we don't support TCP connections to
  self. Some output messages are generated, but the outcome is most likely a
  deadlock.

Up to now, users were expected to exclude such interfaces from the accepted
interfaces, but this patch removes this need. If we discover a local IP as part
of the IP list of a remote peer, we drop it and never try to use it. This does
not apply to local processes, so we can still use these interfaces for node
level communications (which will work as we will connect to the correct port
according to the destination process).

Signed-off-by: George Bosilca <[email protected]>
@bosilca bosilca force-pushed the fix/tcp_connection_s branch from 934289b to c50329f Compare March 6, 2024 04:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants