Skip to content

runtime: unexpected signal during runtime execution #41099

Closed
@dnephin

Description

@dnephin

What version of Go are you using (go version)?

go1.14.7, also observed the same problem with earlier versions: go1.13.7

Does this issue reproduce with the latest release?

Unknown

What operating system and processor architecture are you using (go env)?

Linux 4.14.35-1818.3.3.el7uek.x86_64 #2 SMP Mon Sep 24 14:45:01 PDT 2018 x86_64 x86_64 x86_64 GNU/Linux
(Oracle Linux Server release 7.6)

What did you do?

Original issue reported here: hashicorp/consul#8558

The Consul binary appears to be crashing with errors from the runtime, ex:

  • fatal error: stopg: invalid status
  • fatal error: unexpected signal during runtime execution
  • [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]
  • runtime: unexpected return pc for runtime.sigpanic called from 0x0

The stack trace from the application shows the line as a select blocking on a channel from signal.Notify (source). I'm not sure if that is related to the underlying issue, or has to do with the signal from the runtime.

Full trace
runtime: gp: gp=0xc000450180, goid=0, gp->atomicstatus=0
runtime:  g:  g=0xc0005b5680, goid=0,  g->atomicstatus=0
fatal error: stopg: invalid status
runtime: gp: gp=0xc000450300, goid=0, gp->atomicstatus=0
runtime:  g:  g=0xc000604180, goid=0,  g->atomicstatus=0
fatal error: unexpected signal during runtime execution
fatal error: stopg: invalid status
runtime: gp: gp=0xc000450000, goid=0, gp->atomicstatus=0
runtime:  g:  g=0xc000566180, goid=0,  g->atomicstatus=0
fatal error: stopg: invalid status
runtime stack:
runtime.throw(0x2fac299, 0x15)
/usr/local/go/src/runtime/panic.go:774 +0x72
runtime.scang(0xc000450180, 0xc000060170)
/usr/local/go/src/runtime/proc.go:869 +0x379
runtime.markroot.func1()
/usr/local/go/src/runtime/mgcmark.go:221 +0x6d
runtime.markroot(0xc000060170, 0xc00000001b)
/usr/local/go/src/runtime/mgcmark.go:202 +0x2f3
runtime.gcDrain(0xc000060170, 0x7)
/usr/local/go/src/runtime/mgcmark.go:915 +0x110
runtime.gcBgMarkWorker.func2()
/usr/local/go/src/runtime/mgc.go:1936 +0x16b
runtime.systemstack(0x0)
/usr/local/go/src/runtime/asm_amd64.s:370 +0x66
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1146
goroutine 34 [GC worker (idle)]:
runtime.systemstack_switch()
/usr/local/go/src/runtime/asm_amd64.s:330 fp=0xc00028ef60 sp=0xc00028ef58 pc=0x45d6d0
runtime.gcBgMarkWorker(0xc00005ef00)
/usr/local/go/src/runtime/mgc.go:1900 +0x1bd fp=0xc00028efd8 sp=0xc00028ef60 pc=0x41c38d
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00028efe0 sp=0xc00028efd8 pc=0x45f621
created by runtime.gcBgMarkStartWorkers
/usr/local/go/src/runtime/mgc.go:1794 +0x77
goroutine 1 [select, 149 minutes]:
github.com/hashicorp/consul/command/agent.(*cmd).run(0xc000651500, 0xc000170020, 0x8, 0x8, 0x0)
/home/circleci/project/consul/command/agent/agent.go:331 +0x13eb
github.com/hashicorp/consul/command/agent.(*cmd).Run(0xc000651500, 0xc000170020, 0x8, 0x8, 0xc0001452a0)
/home/circleci/project/consul/command/agent/agent.go:78 +0x4d
github.com/mitchellh/cli.(*CLI).Run(0xc00031c000, 0xc00031c000, 0x80, 0xc000145ac0)
/go/pkg/mod/github.com/mitchellh/cli@v1.1.0/cli.go:260 +0x1da
main.realMain(0xc00014a058)
/home/circleci/project/consul/main.go:53 +0x393
main.main()
/home/circleci/project/consul/main.go:20 +0x22
goroutine 20 [syscall, 149 minutes]:
os/signal.signal_recv(0x0)
/usr/local/go/src/runtime/sigqueue.go:147 +0x9c
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
/usr/local/go/src/os/signal/signal_unix.go:29 +0x41
goroutine 23 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0x50186a0)
/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:882 +0x8b
created by github.com/golang/glog.init.0
/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:410 +0x26f
goroutine 0 [idle]:
fatal error: unexpected signal during runtime execution
panic during panic
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x452486]
runtime stack:
runtime.throw(0x300b2ed, 0x2a)
/usr/local/go/src/runtime/panic.go:774 +0x72
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:378 +0x47c
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc000450000, 0x0, 0x0, 0x64, 0x0, 0x0, 0x0, ...)
/usr/local/go/src/runtime/traceback.go:159 +0x156
runtime.traceback1(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc000450000, 0x0)
/usr/local/go/src/runtime/traceback.go:722 +0xf0
runtime.traceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc000450000)
/usr/local/go/src/runtime/traceback.go:676 +0x52
runtime.tracebackothers(0xc0005b5680)
/usr/local/go/src/runtime/traceback.go:929 +0x1a6
runtime.dopanic_m(0xc0005b5680, 0x42f602, 0xc000aabdf8, 0x1)
/usr/local/go/src/runtime/panic.go:974 +0x29e
runtime.fatalthrow.func1()
/usr/local/go/src/runtime/panic.go:829 +0x5f
runtime.fatalthrow()
/usr/local/go/src/runtime/panic.go:826 +0x57
runtime.throw(0x2fac299, 0x15)
/usr/local/go/src/runtime/panic.go:774 +0x72
runtime.scang(0xc000450180, 0xc000060170)
/usr/local/go/src/runtime/proc.go:869 +0x379
runtime.markroot.func1()
/usr/local/go/src/runtime/mgcmark.go:221 +0x6d
runtime.markroot(0xc000060170, 0xc00000001b)
/usr/local/go/src/runtime/mgcmark.go:202 +0x2f3
runtime.gcDrain(0xc000060170, 0x7)
/usr/local/go/src/runtime/mgcmark.go:915 +0x110
runtime.gcBgMarkWorker.func2()
/usr/local/go/src/runtime/mgc.go:1936 +0x16b
runtime.systemstack(0x0)
/usr/local/go/src/runtime/asm_amd64.s:370 +0x66
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1146
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]
runtime stack:
runtime: unexpected return pc for runtime.sigpanic called from 0x0
stack: frame={sp:0xc000c61dd8, fp:0xc000c61e08} stack=[0xc000c60000,0xc000c62000)
000000c000c61cd8:  000000c0007bc180  01000000000003e8
000000c000c61ce8:  0000000000000004  000000000000001f
000000c000c61cf8:  0000000000000000  0000000000000000
000000c000c61d08:  0000000000000001  0000000002fdb0bd
000000c000c61d18:  000000c000c61d60  000000000045bfbf <runtime.fatalthrow.func1+95>

I looked around this issue tracker for an existing issue. I saw some related things, but nothing that sounded like exactly the same problem.

One issue mentioned C signal handlers. There should not be any such handlers in this binary.

Activity

davecheney

davecheney commented on Aug 28, 2020

@davecheney
Contributor

Sadly, with the release of go 1.15, go 1.13 is no longer receiving active support.

Can you please try with the latest release of Go can confirm if the problem occurs.

dnephin

dnephin commented on Aug 28, 2020

@dnephin
ContributorAuthor

Ah, that's right, thank you. I've asked they try with a more recent release, which uses go1.14.x.

changed the title [-][go1.13.15] fatal error: unexpected signal during runtime execution[/-] [+]runtime: unexpected signal during runtime execution[/+] on Aug 30, 2020
added
WaitingForInfoIssue is not actionable because of missing required information, which needs to be provided.
on Aug 30, 2020
dnephin

dnephin commented on Sep 8, 2020

@dnephin
ContributorAuthor

More details from hashicorp/consul#8558 (comment). They were able to reproduce with go1.14.7. The full log is attached. They also said it occurs only on hosts with Intel Optane connected in RAM mode.

The stack is a bit different, but the errors look similar

fatal error: unexpected signal during runtime execution
fatal error: runtimer: bad p
runtime: unexpected return pc for runtime.sigpanic called from 0x0
fatal error: cleantimers: bad p

consul-agent-kn-0030_2020.09.02_trace.log

networkimprov

networkimprov commented on Sep 8, 2020

@networkimprov

cc @aclements @randall77 @prattmic

@gopherbot remove WaitingForInfo

removed
WaitingForInfoIssue is not actionable because of missing required information, which needs to be provided.
on Sep 8, 2020
added
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Sep 8, 2020
added this to the Unplanned milestone on Sep 8, 2020
ianlancetaylor

ianlancetaylor commented on Sep 8, 2020

@ianlancetaylor
Contributor

This looks a lot like memory corruption.

fanatl

fanatl commented on Sep 9, 2020

@fanatl

Hello to all.
These fatal errors have appeared since version 1.12 and older.
And is repeated only on hosts with Intel optane as RAM.

davecheney

davecheney commented on Sep 9, 2020

@davecheney
Contributor

All the more reason to determine if the binary contains data races or cgo related memory corruption.

networkimprov

networkimprov commented on Sep 9, 2020

@networkimprov

@dnephin have you built the app with the race detector?

dnephin

dnephin commented on Sep 9, 2020

@dnephin
ContributorAuthor

I've been working on running the tests with the race detector enabled. We run some packages with -race, but not all. We don't build binaries with it enabled. Would that help identify the problem?

networkimprov

networkimprov commented on Sep 9, 2020

@networkimprov

A data race could cause memory corruption, so building the app with the race detector may help.

woblerr

woblerr commented on Feb 20, 2021

@woblerr

@dnephin any new about this problem?

blanchonvincent

blanchonvincent commented on May 16, 2021

@blanchonvincent

@dnephin @woblerr I got the same issues in my code, and it broke since Go 1.14 with the introduction of the goroutine async preemption (I do not have control over the C code I am using, but it looks like it mess up with the signals), so I just disable the async preemption when running my tests/building the binary. Your logs suggests it could be the same issue (crash in the garbage collector that preempt the goroutines). Can you try running your tests with GODEBUG=asyncpreemptoff=1 like GODEBUG=asyncpreemptoff=1 go test ./... to see if it is the cause?

added
WaitingForInfoIssue is not actionable because of missing required information, which needs to be provided.
and removed
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on May 17, 2021
gopherbot

gopherbot commented on Jun 17, 2021

@gopherbot
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

networkimprov

networkimprov commented on Jun 22, 2021

@networkimprov

I previously proposed a way to disable async preemption in all builds of an app; see #40870

It was closed without action, but you might want to comment there to request reconsideration of the idea.

locked and limited conversation to collaborators on Jun 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    FrozenDueToAgeWaitingForInfoIssue is not actionable because of missing required information, which needs to be provided.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @davecheney@cagedmantis@andybons@dnephin@networkimprov

        Issue actions

          runtime: unexpected signal during runtime execution · Issue #41099 · golang/go