runtime: SIGSEGV in runtime.(*fixalloc).alloc #47302
Change https://golang.org/cl/336449 mentions this issue:
We believe we have figured out the issue. mallocgc uses the mcache after releasem. A preemption between releasem and the use could free the mcache to the fixalloc free list, and then the adjustment to …

This code is not new in 1.17, however golang.org/cl/270943 added explicit preemption points between the releasem and the mcache, which makes this possible (while preemption was theoretically possible before, there were no morestack calls in the critical section).

Thanks to @dr2chase and @mknyszek for finding the root cause, and @aclements and @ianlancetaylor and others for all the help!
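To make the hazard concrete, here is a toy model of the general failure mode, not the runtime code. It assumes a fixed-size allocator whose free slots reuse their first word as the free-list link; `toyFixalloc`, `slot`, `word0`, and `staleWriteDemo` are hypothetical names invented for this sketch. A holder that keeps writing through a reference after the slot has been freed overwrites that link, and a later allocation trips over the bad pointer.

```go
package main

import (
	"errors"
	"fmt"
)

const nilSlot = -1

// slot is one fixed-size object. While a slot is free, its first word
// (word0) is reused as the index of the next free slot.
type slot struct {
	word0 int
}

// toyFixalloc is a hypothetical stand-in for a fixed-size allocator with an
// intrusive free list. It is NOT the runtime's fixalloc.
type toyFixalloc struct {
	slots []slot
	list  int // index of the first free slot, or nilSlot
}

// newToyFixalloc links all n slots into the free list: 0 -> 1 -> ... -> n-1.
func newToyFixalloc(n int) *toyFixalloc {
	f := &toyFixalloc{slots: make([]slot, n), list: nilSlot}
	for i := n - 1; i >= 0; i-- {
		f.slots[i].word0 = f.list
		f.list = i
	}
	return f
}

// alloc pops the head of the free list. A bad head index stands in for the
// fault that a real allocator would take when it follows a corrupt pointer.
func (f *toyFixalloc) alloc() (int, error) {
	if f.list == nilSlot {
		return nilSlot, errors.New("out of slots")
	}
	if f.list < 0 || f.list >= len(f.slots) {
		return nilSlot, fmt.Errorf("corrupt free list: bad slot index %d", f.list)
	}
	i := f.list
	f.list = f.slots[i].word0
	f.slots[i].word0 = 0
	return i, nil
}

// free pushes slot i onto the free list, storing the link in its first word.
func (f *toyFixalloc) free(i int) {
	f.slots[i].word0 = f.list
	f.list = i
}

// staleWriteDemo returns the error produced by allocating after a stale write.
func staleWriteDemo() error {
	f := newToyFixalloc(4)
	c, err := f.alloc()
	if err != nil {
		return err
	}
	f.free(c)                // stands in for the preemption that frees the object
	f.slots[c].word0 = 12345 // stale holder still writes through its old reference
	if _, err := f.alloc(); err != nil {
		return err // not expected: this alloc succeeds but loads the bad link
	}
	_, err = f.alloc() // follows the clobbered link
	return err
}

func main() {
	fmt.Println(staleWriteDemo())
}
```

Note that the failure shows up one allocation later than the stale write, which is part of why this kind of bug is hard to pin on its cause.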
@dr2chase also came up with a small reproducer: https://play.golang.org/p/zioaIirWPSp Running under stress it had about a 1% failure rate.
This reproducer, under stress, fails over 50%: https://play.golang.org/p/LjMLSsNtrLk
By the way, the technique for generating this was to run it under stress with a bunch of different seeds, record a list of failures, then modify it to randomly choose from an array of just those known-failing seeds, and then pick the seed that failed the most (there were a few that failed at much higher rates). Then I repeated the pattern 10 times to see if it upped the failure rate (it did), then cut the sleep time to see if it still failed often (it did), then increased the repetition count some more to further increase the failure rate within a target no-fail runtime of about 5 seconds.
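The "choose the seed that failed the most" step can be sketched roughly as below. This is only an illustration of the tallying, assuming the failing seeds from the stress runs have already been collected into a slice; `worstSeed` and the sample seed values are made up for this sketch and are not taken from the reproducer.

```go
package main

import "fmt"

// worstSeed returns the seed that appears most often in failures, together
// with its failure count. On a tie, the seed that reached that count first
// wins. For an empty input it returns (0, 0).
func worstSeed(failures []int64) (seed int64, count int) {
	counts := make(map[int64]int)
	for _, s := range failures {
		counts[s]++
		if counts[s] > count {
			seed, count = s, counts[s]
		}
	}
	return seed, count
}

func main() {
	// Hypothetical log of seeds whose runs failed under stress.
	failures := []int64{7, 3, 7, 11, 7, 3}
	seed, count := worstSeed(failures)
	fmt.Printf("seed %d failed %d times\n", seed, count)
}
```

The reproducer can then be pinned to that one seed before tuning the repetition count and sleep time.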
Change https://golang.org/cl/359796 mentions this issue:
This adds a maymorestack hook that forces a preemption at every possible cooperative preemption point.

This would have helped us catch several recent preemption-related bugs earlier, including #47302, #47304, and #47441.

For #48297.

Change-Id: Ib82c973589c8a7223900e1842913b8591938fb9f
Reviewed-on: https://go-review.googlesource.com/c/go/+/359796
Trust: Austin Clements <[email protected]>
Run-TryBot: Austin Clements <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Reviewed-by: Michael Pratt <[email protected]>
Reviewed-by: David Chase <[email protected]>
On Go 1.17rc1 internally, we have seen numerous crashes on amd64/linux in runtime.(*fixalloc).alloc. The crashes do not occur on 1.16. The crashes all look something like this:

These crashes have previously been observed on FreeBSD in #46103 and #46272, but were believed to be limited to FreeBSD. The latter bug also contains other types of crashes that we have not observed on Linux.
What we know so far:
- … fixalloc.list here.
- … fixalloc.list indeed contains the fault address.
- … mheap_.cachealloc.alloc.
- … procresize.
- … mheap_.cachealloc.list to be set if GOMAXPROCS is decreased. In two cores checked so far, runtime.allp has len < cap, indicating there was indeed a decrease.
- … mprotect'd, but rather the pointer was always bad.
- … 0x3c8.
- … mheap_.cachealloc.list, and in the curg.sigcode1, curg.m.gsignal stack, and curg.m.g0 stack. All of the latter references are from after the crash while we prepare to panic.

cc @mknyszek @aclements