Skip to content

runtime: frame pointer check fails on arm64 for C-to-Go calls #59401

Open
@nsrip-dd

Description

@nsrip-dd

What version of Go are you using (go version)?

$ go version
go version devel go1.21-8edcdddb23 Fri Mar 31 23:25:07 2023 +0000 darwin/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/nick.ripley/Library/Caches/go-build"
GOENV="/Users/nick.ripley/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/nick.ripley/go/pkg/mod"
GONOPROXY="--obfuscated--"
GONOSUMDB="--obfuscated--"
GOOS="darwin"
GOPATH="/Users/nick.ripley/go"
GOPRIVATE="--obfuscated--"
GOPROXY="--obfuscated--,direct"
GOROOT="/Users/nick.ripley/repos/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/nick.ripley/repos/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="devel go1.21-8edcdddb23 Fri Mar 31 23:25:07 2023 +0000"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/nick.ripley/repos/go/src/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/f3/g91d13pd6kd3vdxts_gsgd1r0000gn/T/go-build4033147164=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Ran the TestCgoTracebackGoroutineProfile test with debugCheckBP enabled on an arm64 machine.

What did you expect to see?

The test to pass.

What did you see instead?

The test fails:

runtime: found invalid frame pointer
bp=0x170ee6da8 min=0x140001c4000 max=0x140001c4800
fatal error: bad frame pointer

runtime stack:
runtime.throw({0x10104f25d?, 0x10104c977?})
	/Users/nick.ripley/repos/go/src/runtime/panic.go:1076 +0x40 fp=0x170ee67b0 sp=0x170ee6780 pc=0x100f71520
runtime.adjustframe(0x170ee6a80, 0x170ee6960)
	/Users/nick.ripley/repos/go/src/runtime/stack.go:714 +0x678 fp=0x170ee68d0 sp=0x170ee67b0 pc=0x100f8b7f8
runtime.copystack(0x14000184820, 0x1000)
	/Users/nick.ripley/repos/go/src/runtime/stack.go:971 +0x470 fp=0x170ee6c00 sp=0x170ee68d0 pc=0x100f8c680
runtime.newstack()
	/Users/nick.ripley/repos/go/src/runtime/stack.go:1152 +0x35c fp=0x170ee6da0 sp=0x170ee6c00 pc=0x100f8cabc
runtime.morestack()
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:316 +0x70 fp=0x170ee6da0 sp=0x170ee6da0 pc=0x100fa4a00

goroutine 35 [copystack, locked to thread]:
runtime.deductSweepCredit(0x2000, 0x0)
	/Users/nick.ripley/repos/go/src/runtime/mgcsweep.go:866 +0x14c fp=0x1400011cb90 sp=0x1400011cb90 pc=0x100f613cc
runtime.(*mcentral).cacheSpan(0x1011bbd18)
	/Users/nick.ripley/repos/go/src/runtime/mcentral.go:84 +0x48 fp=0x1400011cc00 sp=0x1400011cb90 pc=0x100f51548
runtime.(*mcache).refill(0x101484a68, 0x0?)
	/Users/nick.ripley/repos/go/src/runtime/mcache.go:182 +0x1bc fp=0x1400011cc50 sp=0x1400011cc00 pc=0x100f50e9c
runtime.(*mcache).nextFree(0x101484a68, 0x5)
	/Users/nick.ripley/repos/go/src/runtime/malloc.go:855 +0x84 fp=0x1400011cca0 sp=0x1400011cc50 pc=0x100f48014
runtime.mallocgc(0x8, 0x0, 0x0)
	/Users/nick.ripley/repos/go/src/runtime/malloc.go:1017 +0x2ac fp=0x1400011cd10 sp=0x1400011cca0 pc=0x100f4841c
runtime.growslice(0x0, 0x100f7c238?, 0x140001c45b8?, 0x100f7c23c?, 0x140001c45e8?)
	/Users/nick.ripley/repos/go/src/runtime/slice.go:266 +0x3bc fp=0x1400011cd70 sp=0x1400011cd10 pc=0x100f89f1c
runtime.cgocallbackg1(0x101049b10, 0x0?, 0x1)
	/Users/nick.ripley/repos/go/src/runtime/cgocall.go:278 +0xd0 fp=0x1400011ce50 sp=0x1400011cd70 pc=0x100f3e8d0
runtime.cgocallbackg(0x140001c4728?, 0x170ee6ea7, 0x14000184820?)
	/Users/nick.ripley/repos/go/src/runtime/cgocall.go:242 +0x124 fp=0x1400011cee0 sp=0x1400011ce50 pc=0x100f3e774
runtime.cgocallbackg(0x101049b10, 0x170ee6ea7, 0x1)
	<autogenerated>:1 +0x1c fp=0x1400011cf10 sp=0x1400011cee0 pc=0x100fa8b9c
runtime.cgocallback(0x140001c4788, 0x101041120, 0x10104aa28)
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:1104 +0xb0 fp=0x1400011cf40 sp=0x1400011cf10 pc=0x100fa6e00
runtime.systemstack_switch()
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x1400011cf50 sp=0x1400011cf40 pc=0x100fa48c8
runtime.cgocall(0x10104aa28, 0x140001c47c8)
	/Users/nick.ripley/repos/go/src/runtime/cgocall.go:168 +0x70 fp=0x1400011cf90 sp=0x1400011cf50 pc=0x100f3e5f0
main._Cfunc_CallGoSleep()
	_cgo_gotypes.go:177 +0x30 fp=0x1400011cfc0 sp=0x1400011cf90 pc=0x101041120
main.GoroutineProfile.func1()
	/Users/nick.ripley/repos/go/src/runtime/testdata/testprogcgo/gprof.go:33 +0x24 fp=0x1400011cfd0 sp=0x1400011cfc0 pc=0x101049944
runtime.goexit()
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:1196 +0x4 fp=0x1400011cfd0 sp=0x1400011cfd0 pc=0x100fa6ef4
created by main.GoroutineProfile in goroutine 1
	/Users/nick.ripley/repos/go/src/runtime/testdata/testprogcgo/gprof.go:33 +0x40

With additional stack debug logging turned on, I see that the check fails when visiting the runtime.cgocallbackg frame:

Debug logging output
copystack gp=0x14000184820 [0x140001c4000 0x140001c4390 0x140001c4800] -> [0x1400011c000 0x1400011cb90 0x1400011d000]/4096
                adjust ptr 0x14000184888:0x140001c4388 -> 0x1400011cb88
                adjust ptr 0x1400011cb88:0x140001c43f8 -> 0x1400011cbf8
            adjusting runtime.deductSweepCredit frame=[0x1400011cb90,0x1400011cb90] pc=0x100f613cc continpc=0x100f613cc
            adjusting runtime.(*mcentral).cacheSpan frame=[0x1400011cb90,0x1400011cc00] pc=0x100f51548 continpc=0x100f51548
              locals 0/3 2 words 0x1010599e4
              saved bp
                adjust ptr 0x1400011cbf8:0x140001c4448 -> 0x1400011cc48
            adjusting runtime.(*mcache).refill frame=[0x1400011cc00,0x1400011cc50] pc=0x100f50e9c continpc=0x100f50e9c
              locals 1/2 1 words 0x10105932d
              saved bp
                adjust ptr 0x1400011cc48:0x140001c4498 -> 0x1400011cc98
            adjusting runtime.(*mcache).nextFree frame=[0x1400011cc50,0x1400011cca0] pc=0x100f48014 continpc=0x100f48014
              locals 1/2 1 words 0x10105932d
              saved bp
                adjust ptr 0x1400011cc98:0x140001c4508 -> 0x1400011cd08
            adjusting runtime.mallocgc frame=[0x1400011cca0,0x1400011cd10] pc=0x100f4841c continpc=0x100f4841c
              locals 2/7 3 words 0x10105b6be
              saved bp
                adjust ptr 0x1400011cd08:0x140001c4568 -> 0x1400011cd68
              args
            adjusting runtime.growslice frame=[0x1400011cd10,0x1400011cd70] pc=0x100f89f1c continpc=0x100f89f1c
              locals 1/5 1 words 0x10105a715
              saved bp
                adjust ptr 0x1400011cd68:0x140001c4648 -> 0x1400011ce48
              args
            adjusting runtime.cgocallbackg1 frame=[0x1400011cd70,0x1400011ce50] pc=0x100f3e8d0 continpc=0x100f3e8d0
              locals 1/4 13 words 0x10105df4e
              saved bp
                adjust ptr 0x1400011ce48:0x140001c46d8 -> 0x1400011ced8
              args
            adjusting runtime.cgocallbackg frame=[0x1400011ce50,0x1400011cee0] pc=0x100f3e774 continpc=0x100f3e774
              locals 3/6 4 words 0x10105b2f7
        adjust ptr 0x140001c4750 runtime.cgocallbackg
              saved bp

It appears that the runtime.cgocall assembly routine attempts to put a frame pointer on the Go stack when switching to it. But from this test failure, it seems like we're getting the wrong frame pointer? Possibly an address on the C stack (when the thread is created by C rather than by the runtime)?

Activity

mknyszek

mknyszek commented on Apr 3, 2023

@mknyszek
Contributor

I believe there are known issues with frame pointers on arm64, and there are some issues open about it and progress being made. I'll dig those up in a bit.

EDIT: Whoops, I just saw who posted this. 😅

added
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Apr 3, 2023
added this to the Backlog milestone on Apr 3, 2023
nsrip-dd

nsrip-dd commented on Apr 3, 2023

@nsrip-dd
ContributorAuthor

Ah, I forgot to mention other relevant issues, thanks! There's at least #39524, and #58432 is also related. I opened this issue to document a specific case that's not working right now :)

Also cc @felixge @aarzilli

moved this to In Progress in Go Compiler / Runtimeon Apr 5, 2023
nsrip-dd

nsrip-dd commented on Aug 15, 2023

@nsrip-dd
ContributorAuthor

runtime.cgocallbackg is indeed picking up whatever was in the frame pointer register from the C code. So the frame pointer check enabled with debugCheckBP fails when it checks runtime.cgocallbackg's frame, seeing that the frame pointer is pointing outside of the goroutine's stack.

I think there are two ways we could address this:

  • Have the frame pointer check ignore the runtime.cgocallbackg frame, assuming that it might inherit the frame pointer from the C code calling into Go.
  • Have runtime.cgocallback set the frame pointer explicitly before calling into runtime.cgocallbackg. Say, to 0, or to point to the previous frame on the Go stack if there is one?

I think it would be good to start with the first option. Right now we don't use frame pointer unwinding for execution tracing if there are C calls on the stack, specifically because of this kind of problem. We could follow that pattern for future uses of frame pointer unwinding internally (e.g. #16638). And in that case, we know that runtime.cgocallbackg's frame won't be visited by frame pointer unwinding, so there would be no value in checking it.

We could consider the second option in the future. It would probably be worth a little more discussion, though. For example, if we wanted to be able to use frame pointer unwinding even with C frames on the stack, we'd still want a way to take advantage of runtime.SetCgoTraceback handlers if they're present.

aarzilli

aarzilli commented on Aug 15, 2023

@aarzilli
Contributor

Have you considered changing asmcgocall/cgocallback so they save on the stack, instead of BP, the value of BP - curg.stack.hi: a stacktracer can recognize that the value is special (because it's negative/it has the msb set) and adjust it by reading the stack bounds from the current goroutine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.compiler/runtimeIssues related to the Go compiler and/or runtime.

Type

No type

Projects

Status

In Progress

Relationships

None yet

    Development

    No branches or pull requests

      Participants

      @aarzilli@mknyszek@gopherbot@nsrip-dd

      Issue actions

        runtime: frame pointer check fails on arm64 for C-to-Go calls · Issue #59401 · golang/go