Skip to content

crypto/tls: SMTP connection timeout in Go Version 1.23 #71257

Duplicate of#70232
@littlerest1

Description

@littlerest1

Go version

go version go1.23.2 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/yanz1/Library/Caches/go-build'
GOENV='/Users/yanz1/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/yanz1/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/yanz1/go'
GOPRIVATE=''
GOPROXY=''
GOROOT='/opt/homebrew/Cellar/go@1.22/1.23.2/libexec'
GOSUMDB='off'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/opt/homebrew/Cellar/go@1.23/1.23.2/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.23.2'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/s4/j38_716j1q1dmb564z0ynfq00000gq/T/go-build782530738=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Was using library crypto/tls and net for smtp dialing and communication
the code is working fine in MacOs arm64 system. However the binary file built by command env GOOS=linux GOARCH=amd64 go build -o and run in AWS Linux EC2 instance will receive read tcp: timeout issue. This issue is resolved by downgrading the Go version to 1.22.


package main

import (
	"crypto/tls"
	"io"
	"fmt"
	"log"
	"net"
	"strings"
	"time"
)

func main() {
	smtpHost := "PRIVATE HOST"
	smtpPort := "587"

	tlsConfig := &tls.Config{
		ServerName: smtpHost,
	}

	dialer := &net.Dialer{
		Timeout: 10 * time.Second,
	}
	conn, err := dialer.Dial("tcp", smtpHost+":"+smtpPort)

	if err != nil {
		log.Fatal("failed to create tlsDialer:", "error", err)
	}

	fmt.Fprintf(conn, "STARTTLS\r\n")
	buf := make([]byte, 0, 4096) // big buffer
	tmp := make([]byte, 256)     // using small tmo buffer for demonstrating
	for {
		n, err := conn.Read(tmp)
		log.Println(string(tmp))
		if err != nil {
			if err != io.EOF {
				log.Fatal("read error:", err)
			}
			break
		}
		if strings.Contains(string(tmp), "Go ahead with TLS") {
			client := tls.Client(conn, tlsConfig)
			fmt.Fprintf(client, "AUTH LOGIN\r\n") // Time out happens here. Tried to send`HLEO` also received timeout issue
			log.Println("Start TLS connection and request AUTH")
		}

		///log.Println("got", string(tmp))
		buf = append(buf, tmp[:n]...)
	}
	log.Println("total size:", len(buf))

}

What did you see happen?

{
    "time": "2025-01-07T09:16:34.914889396Z",
    "level": "WARN",
    "msg": "error while sending email",
    "error": "read tcp 10.33.124.75:46246->10.248.46.8:587: read: connection timed out"
}

What did you expect to see?

AUTH LOGIN command successfully send after STARTTLS command and receive the request for username and password for further authentication

Activity

added
BugReportIssues describing a possible bug in the Go implementation.
on Jan 14, 2025
ianlancetaylor

ianlancetaylor commented on Jan 14, 2025

@ianlancetaylor
Contributor

CC @golang/security

seankhliao

seankhliao commented on Jan 14, 2025

@seankhliao
Member

maybe the same as #70232

added
WaitingForInfoIssue is not actionable because of missing required information, which needs to be provided.
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Jan 14, 2025
littlerest1

littlerest1 commented on Jan 15, 2025

@littlerest1
Author

maybe the same as #70232

seems like so

mjl-

mjl- commented on Jan 22, 2025

@mjl-

@littlerest1 I am not sure this is the same situation. The other issue is for a specific non-Go TLS client connecting to a Go TLS server. And the symptom is that that other client is dropping the connection, there is no timeout involved. You are using a Go TLS client. Which software is the TLS server?

Looking at your code, I have a few suggestions for the code you're using:

  • You're looking for "Go ahead with TLS", but the conn.Read just before may not have consumed the full the line. If that's the case, the tls client is initialized on data that comes before the start of the TLS packets. That may cause bytes to be interpreted as something else, e.g. a large packet size, larger than what remote is sending, resulting in a read timeout eventually. You should read to the end of the line. Log what you've read with "%q", it will show if you've consumed the newline.
  • Handle errors from fmt.Printf on the connection/client. May reveal useful information.
  • After making the tls client, explicitly do the handshake with client.Handshake() and handle any errors. May help a little understanding which i/o exactly is timing out.
  • Also use tmp[:n] instead of tmp when logging.
littlerest1

littlerest1 commented on Jan 23, 2025

@littlerest1
Author

@littlerest1 I am not sure this is the same situation. The other issue is for a specific non-Go TLS client connecting to a Go TLS server. And the symptom is that that other client is dropping the connection, there is no timeout involved. You are using a Go TLS client. Which software is the TLS server?

Looking at your code, I have a few suggestions for the code you're using:

  • You're looking for "Go ahead with TLS", but the conn.Read just before may not have consumed the full the line. If that's the case, the tls client is initialized on data that comes before the start of the TLS packets. That may cause bytes to be interpreted as something else, e.g. a large packet size, larger than what remote is sending, resulting in a read timeout eventually. You should read to the end of the line. Log what you've read with "%q", it will show if you've consumed the newline.
  • Handle errors from fmt.Printf on the connection/client. May reveal useful information.
  • After making the tls client, explicitly do the handshake with client.Handshake() and handle any errors. May help a little understanding which i/o exactly is timing out.
  • Also use tmp[:n] instead of tmp when logging.

The reason I was using the tls dial, its because the is no reasonable error log when I was using net/smtp library. So that I demonstrate the email sending process one by one.

Will try better logging for more info about why it get io timeout

mjl-

mjl- commented on Jan 23, 2025

@mjl-

ah yeah, that makes sense. when debugging, looking at the traffic with tcpdump may also be helpful (and easy to do).

thomvaill

thomvaill commented on Feb 18, 2025

@thomvaill
Contributor

Hi,
We have exactly the same issue. Our net/smtp client was working fine on go 1.22.
After upgrading to 1.23, we get these timeouts.
After debugging like you did thanks to your snippet, it also times out at the TLS handshake step.
I agree with @mjl- : this is a different issue than #70232

@littlerest1 have you managed to find a workaround?
On our side we tried to mess with the allowed ciphers but without success

Could this issue be reopened?

thomvaill

thomvaill commented on Feb 19, 2025

@thomvaill
Contributor

I managed to find 2 workarounds!

  1. Re-implement net/smtp SendEmail() function to be able to pass this custom tls.Config option:
config := &tls.Config{
    MaxVersion: tls.VersionTLS12,
}

Because in our case our SMTP server only accepts TLS v1.2, and with this fix it works

  1. Or just set GODEBUG=tlskyber=0

So the regression looks like to come from this commit: d0edd9a
Which is documented here: https://tip.golang.org/doc/go1.23

The experimental post-quantum key exchange mechanism X25519Kyber768Draft00 is now enabled by default when Config.CurvePreferences is nil. The default can be reverted by adding tlskyber=0 to the GODEBUG environment variable.

We definitely should re-open this issue or open a new one.
I would love to open a new one but unfortunately I was only able to reproduce the issue with our company's internal SMTP server (I don't know the software they use), not a public one. I also tested to create a custom one with go, but still not able to reproduce.
So the bug must be related to some implementations of TLS v1.2 on the server side :(

seankhliao

seankhliao commented on Feb 19, 2025

@seankhliao
Member

if it's the remote server breaking on kyber, it's not really considered a bug in go, see https://tldr.fail/

thomvaill

thomvaill commented on Feb 20, 2025

@thomvaill
Contributor

Thank you @seankhliao you are completely right!
I totally understand the progressive rollout of post-quantum algorithms, but I think a better communication with some disclaimers should have been issued.
This bug is very difficult to debug when you don't have the knowledge provided by https://tldr.fail/ (very helpful by the way!)
I saw in the changelog that Kyber was removed in go 1.24 and replaced by another one, the problem must be the same?
Maybe the Go team should communicate more about this

rolandshoemaker

rolandshoemaker commented on Feb 20, 2025

@rolandshoemaker
Member

We could perhaps add a note to the 1.24 release notes (and perhaps the crypto/tls package) that usage of the X25519MLKEM768 kex (the name of the standardized Kyber kex) may cause failures with TLS stacks that don't handle large records correctly.

gopherbot

gopherbot commented on Feb 20, 2025

@gopherbot
Contributor

Change https://go.dev/cl/651036 mentions this issue: doc: add a disclamer for the TLS handshake timeouts TL;DR bug in godebug.md

thomvaill

thomvaill commented on Feb 20, 2025

@thomvaill
Contributor

We could perhaps add a note to the 1.24 release notes (and perhaps the crypto/tls package) that usage of the X25519MLKEM768 kex (the name of the standardized Kyber kex) may cause failures with TLS stacks that don't handle large records correctly.

You are right! Usually when you have regressions like this after an upgrade, you instinctively check the release notes, so it might help.
I created 2 PRs:

Regarding the crypto/tls package I was not sure where to document this, so I didn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugReportIssues describing a possible bug in the Go implementation.NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.WaitingForInfoIssue is not actionable because of missing required information, which needs to be provided.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @mjl-@ianlancetaylor@rolandshoemaker@gopherbot@thomvaill

        Issue actions

          crypto/tls: SMTP connection timeout in Go Version 1.23 · Issue #71257 · golang/go