Skip to content

x/net/http2: downloading files from kernel.org results in PROTOCOL_ERROR stream error #19035

Closed
@stapelberg

Description

@stapelberg

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.8rc3 linux/amd64 (also reproducible using go version go1.7.1 linux/amd64)

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/michael/go"
GORACE=""
GOROOT="/home/michael/go1.8rc3"
GOTOOLDIR="/home/michael/go1.8rc3/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build924866453=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

https://play.golang.org/p/KUK5nfKxgu

What did you expect to see?

The program should terminate without an error.

What did you see instead?

(map reformatted for readability)

2017/02/11 11:31:59 resp = &{200 OK 200 HTTP/2.0 2 0 map[
Date:[Sat, 11 Feb 2017 10:31:59 GMT]
Content-Type:[application/x-xz]
Last-Modified:[Thu, 09 Feb 2017 07:19:24 GMT]
Accept-Ranges:[bytes]
Strict-Transport-Security:[max-age=15768000]
Server:[nginx]
Content-Length:[93208832]
Etag:["58e4100-54813ccf8f700"]
X-Frame-Options:[DENY]] {0xc4203e0500} 93208832 [] false false map[] 0xc42000a800 0xc4201aa370}
2017/02/11 11:32:59 stream error: stream ID 1; PROTOCOL_ERROR
panic: stream error: stream ID 1; PROTOCOL_ERROR

goroutine 1 [running]:
log.Panic(0xc420051f08, 0x1, 0x1)
	/home/michael/go1.8rc3/src/log/log.go:322 +0xc0
main.main()
	/tmp/poc.go:27 +0x283
exit status 2

I think this is an x/net/http2 related error, because running the program with the environment variable GODEBUG=http2client=0 works as expected.

Other HTTP2 clients, such as curl --http2 -O https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.9.9.tar.xz, work as expected.

Activity

stapelberg

stapelberg commented on Feb 11, 2017

@stapelberg
ContributorAuthor

Here’s the result of running with GODEBUG=http2debug=2: debug.gz

added this to the Go1.9 milestone on Feb 11, 2017
bradfitz

bradfitz commented on Feb 11, 2017

@bradfitz
Contributor

Thanks for the report.

So, the trace says that kernel.org is sending us the PROTOCOL_ERROR:

...
2017/02/11 11:38:34 http2: Framer 0xc4200680d0: wrote WINDOW_UPDATE stream=1 len=4 incr=12288
2017/02/11 11:39:34 http2: Framer 0xc4200680d0: read RST_STREAM stream=1 len=4 ErrCode=PROTOCOL_ERROR

But I see nothing obviously wrong. I imagine its accounting of flow control differs from ours and it's complaining about that.

The response says they're using nginx.

Unfortunately, I'm unable to reproduce using your program. Oh, except the machine I'm running it on doesn't have IPv6, if that's relevant. (Your example connected via IPv6)

/cc @tombergan

stapelberg

stapelberg commented on Feb 11, 2017

@stapelberg
ContributorAuthor

Thanks for the quick reply.

IPv6 does not seem to be relevant: debug.gz is another trace of the same problem after deleting my IPv6 address and default route.

I can reproduce the issue on my Google workstation. I’ve granted you login permission to it, see the email I sent you on your corp account. Hope that helps.

Note that all computers from which I can reproduce this are located in Zürich, Switzerland, whereas all kernel.org mirrors (as per traceroute) are located in the US, i.e. ≥ 100ms of latency away.

tombergan

tombergan commented on Feb 13, 2017

@tombergan
Contributor

I can repro the problem. @stapelberg's example downloaded 155400 bytes before failing and my repro downloaded 85768 bytes before failing, so the failure appears to be nondeterministic. Go's WINDOW_UPDATEs look correct to me, in both traces. There is consistently a 1 minute gap between Go sending the last WINDOW_UPDATE and receiving the RST_STREAM(PROTOCOL_ERROR). My ping time to the server is only 5ms. This suggests we're hitting a timeout, possibly due to flow control not being properly updated as @bradfitz suggested. We might be hitting this code in nginx.

I do not see any WINDOW_UPDATEs on the stream, only on the connection. But Go sends a connection-level WINDOW_UPDATE with incr=1GB at startup, so nginx should not be blocked on connection-level flow control tokens.

I suspect a bug in nginx, since everything looks ok on Go's end, but I cannot be sure. @VBart, any ideas?

/cc @VBart

bradfitz

bradfitz commented on Apr 24, 2017

@bradfitz
Contributor

Ping @VBart.

VBart

VBart commented on Apr 24, 2017

@VBart

The exact reason of RST_STREAM should be in error log. But yes, this looks like a request timeout (see the send_timeout directive), which is 1 minute by default.

While from Go logs you see that WINDOW_UPDATE was sent to nginx, I suspect nginx didn't receive it and closed stream after a minute of waiting. That can be a network problem, e.g. some intermediate buffering. Enabling debug log in nginx will show the truth.

added
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Apr 24, 2017
bradfitz

bradfitz commented on Jun 14, 2017

@bradfitz
Contributor

It would be helpful if somebody could put together a standalone repro for this, ideally in a Docker container with both nginx and Go, along with any necessary network delaying (e.g. http://bencane.com/2012/07/16/tc-adding-simulated-network-latency-to-your-linux-server/), which probably doesn't run in a container.

But it also seems like an nginx bug to me.

@VBart says:

That can be a network problem, e.g. some intermediate buffering.

But Go only speaks HTTP/2 over TLS, so almost certainly not any intermediate buffering.

In any case, a repro that doesn't involve kernel.org so we can enable nginx debug logs and modify nginx source code at will would be super helpful.

modified the milestones: Go1.10, Go1.9 on Jun 14, 2017
bradfitz

bradfitz commented on Nov 29, 2017

@bradfitz
Contributor

I'm back from leave. Can somebody try and reproduce this and see if it's still happening, ideally with Go tip?

Maybe something in Go or something in nginx changed.

stapelberg

stapelberg commented on Nov 29, 2017

@stapelberg
ContributorAuthor

Cannot reproduce this anymore with go +21672b36eb:

% go run /tmp/http2.go
2017/11/29 08:31:46 resp = &{200 OK 200 HTTP/1.1 1 1 map[Date:[Wed, 29 Nov 2017 07:31:46 GMT] Content-Type:[application/x-xz] Strict-Transport-Security:[max-age=15768000] X-Frame-Options:[DENY] Server:[nginx] Content-Length:[93208832] Last-Modified:[Thu, 09 Feb 2017 07:19:24 GMT] Accept-Ranges:[bytes] X-Content-Type-Options:[nosniff] X-Frontend:[packetnet-amsterdam]] 0xc420072100 93208832 [] false false map[] 0xc420126000 0xc420530d10}
% 
locked and limited conversation to collaborators on Nov 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    FrozenDueToAgeNeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.help wanted

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @bradfitz@stapelberg@VBart@gopherbot@tombergan

        Issue actions

          x/net/http2: downloading files from kernel.org results in PROTOCOL_ERROR stream error · Issue #19035 · golang/go