Closed
Various "sweep increased allocation count" errors have started cropping up in or near go/build invocations in various go commands.
2020-03-15T08:13:55-32dbccd/linux-386-clang
2020-03-14T07:03:15-d774d97/linux-386-sid
2020-03-16T20:59:27-ff1eb42/linux-386-387
2020-03-13T20:43:12-e2a9ea0/openbsd-386-62
Since go/build is involved, this may be related to the go/types crash cluster (#37602, #37507, #37690).
CC @griesemer @matloob @aclements @danscales @mknyszek @cherrymui
bcmills commented on Mar 17, 2020
These "found bad pointer in Go heap" errors seem to fit the same root cause:
2020-03-16T22:31:39-2fbca94/linux-386-sid
2020-03-13T19:43:47-cbcb031/openbsd-386-62
[-]runtime: "sweep increased allocation count" during cmd/go package loading on 32-bit builders[/-][+]runtime: memory corruption in cmd/go on 32-bit builders[/+]
bcmills commented on Mar 17, 2020
This appears to be a dramatic uptick in memory corruption across the board on the 32-bit builders.
Given that it is only appearing on the 32-bit builders, and that we have not (to my knowledge) made any particularly racy or dangerous changes in cmd/go this cycle, marking as release-blocker for Go 1.15.
bcmills commented on Mar 17, 2020
I also can't rule out a compiler bug as the cause, given the number of changes to the rewrite rules so far. (CC @josharian @randall77)
bcmills commented on Mar 17, 2020
The first failure in this cluster was the same day that CL 222782 (#36468) was merged, so I suspect that may be related.
[-]runtime: memory corruption in cmd/go on 32-bit builders[/-][+]cmd/compile: sporadic memory corruption on 32-bit builders[/+]
bcmills commented on Mar 17, 2020
Note that some of the affected builders are also TryBots:
https://go-review.googlesource.com/c/go/+/223745/5#message-7c0b4f426d49ae03a1ec88007cf714c5134d4800
ianlancetaylor commented on Mar 19, 2020
https://storage.googleapis.com/go-build-log/a19069d2/linux-386_272d029f.log
jayconrod commented on Mar 19, 2020
https://storage.googleapis.com/go-build-log/29e1c579/linux-386_8cf5204b.log
randall77 commented on Mar 20, 2020
I'm kinda stumped on this one. I've tried to reproduce to no avail, and poring over CL 222782 doesn't reveal anything that might cause intermittent issues like this.
On a possibly related note, I just mailed a change to add more addressing mode modifications for amd64. When that goes in, whether or not we start seeing this on amd64 will be illuminating.
[-]cmd/compile: sporadic memory corruption on 32-bit builders[/-][+]cmd/compile: sporadic memory corruption on 386 (32-bit) builders[/+]
Revert "cmd/compile: disable addressingmodes pass for 386"
Revert "cmd/compile: disable mem+op operations on 386"
Revert "cmd/compile: convert 386 port to use addressing modes pass"
gopherbot commented on Mar 27, 2020
Change https://golang.org/cl/225798 mentions this issue:
cmd/compile: convert 386 port to use addressing modes pass (take 2)
randall77 commented on Mar 27, 2020
I might have figured out what is wrong. Read the description of the CL above for all the gory details.
I'm not 100% sure, but enough so to try and submit this again and see what happens (I still can't reproduce outside trybots).
cmd/compile: convert 386 port to use addressing modes pass (take 2)
bcmills commented on Mar 27, 2020
Unfortunately, https://build.golang.org/log/b634dbda2b877a9859e65cd5eba87577e9e33dee looks like it may be another instance of this failure after the most recent attempt.
cherrymui commented on Mar 27, 2020
This looks somewhat different. The original failure looks like the GC found (temporary) bad pointers. The new one is a segfault.
gopherbot commented on Mar 30, 2020
Change https://golang.org/cl/226437 mentions this issue:
cmd/compile: fix ephemeral pointer problem on amd64
cmd/compile: fix ephemeral pointer problem on amd64
randall77 commented on Apr 2, 2020
I'm going to declare this fixed. The original CL with fixed rules is in, both for 386 and amd64.
I'm not seeing any similar failures on the dashboard since the fixes went in.