runtime: morestack on gsignal signal: trace/breakpoint trap due to g0 stack misattribution #43853

Closed
@prattmic

In rare cases on linux/amd64 race builds we've seen crashes that look like:

fatal: morestack on gsignal

signal: trace/breakpoint trap (core dumped)

The root cause is signal delivery on a sigaltstack allocated very close to the g0 stack. When cgo is enabled, mstart estimates the g0 stack bounds (cgo side), but the estimate is rough, and the recorded g0 stack.lo may actually lie beyond the end of the real g0 stack.

On signal delivery, adjustSignalStack may then incorrectly conclude that the signal was delivered on the g0 stack. Because the overlap is likely to be very close to g0 stack.lo, functions in the signal handling path have a high probability of "running out of stack space" and calling morestack. Boom.

Here's one example of overlap I captured:

Our SP on sigtrampgo entry: 0x7f99841fe328
sigaltstack from sigcontext: [0x7f99841ef000, 0x7f99841ff000)
g0 stack from gp.m.g0.stack: [0x7f99841fded8, 0x7f99849fdad8)
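
To make the overlap in these numbers easier to see, here is a minimal sketch that treats both ranges as half-open intervals and checks where the SP falls. The addrRange type and its helpers are hypothetical names for illustration only; this is not the runtime's adjustSignalStack code, just the arithmetic on the three values above.

```go
package main

import "fmt"

// addrRange is a half-open address interval [lo, hi), matching the
// notation used for the ranges quoted in the issue. Hypothetical type.
type addrRange struct {
	lo, hi uint64
}

// contains reports whether addr lies inside [lo, hi).
func (r addrRange) contains(addr uint64) bool {
	return addr >= r.lo && addr < r.hi
}

// overlap returns the number of bytes shared by a and b (0 if disjoint).
func overlap(a, b addrRange) uint64 {
	lo, hi := a.lo, a.hi
	if b.lo > lo {
		lo = b.lo
	}
	if b.hi < hi {
		hi = b.hi
	}
	if hi <= lo {
		return 0
	}
	return hi - lo
}

// Values copied from the captured example.
var (
	sp          = uint64(0x7f99841fe328)
	sigaltstack = addrRange{0x7f99841ef000, 0x7f99841ff000}
	g0Stack     = addrRange{0x7f99841fded8, 0x7f99849fdad8}
)

func main() {
	fmt.Println("SP on sigaltstack:         ", sigaltstack.contains(sp)) // true
	fmt.Println("SP inside recorded g0 range:", g0Stack.contains(sp))    // true
	fmt.Println("overlapping bytes:          ", overlap(sigaltstack, g0Stack)) // 4392
	fmt.Println("SP above g0 stack.lo by:    ", sp-g0Stack.lo)                 // 1104
}
```

The SP is inside both ranges, so a bounds check against the recorded g0 stack cannot tell the two apart, and the SP sits only 1104 bytes above g0 stack.lo.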

mstart contains a fudge factor of 1024 to try to address this inaccuracy, but checking against pthread_attr_getstack indicates that the mstart SP is actually 9616 bytes below the top of the stack. That may be off by one page (4096 bytes); I need to double-check. Either way, it is more than 1024 bytes.
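
As a rough illustration of why 1024 is not enough, the sketch below models the estimate as hi = SP seen in mstart and lo = hi - size + 1024. That model is an assumption on my part, not a copy of the runtime code, though it is consistent with the captured bounds above, which span exactly 8 MiB minus 1024 bytes. The top-of-stack address is made up, the 8 MiB size is assumed, and guard pages are ignored.

```go
package main

import "fmt"

const (
	fudge      = 1024    // fudge factor mentioned in the issue
	stackSize  = 8 << 20 // assumed 8 MiB thread stack
	spBelowTop = 9616    // measured distance from the issue; may be off by one page

	// hypotheticalTop is a made-up top-of-stack address for illustration.
	hypotheticalTop = uint64(0x7f0000800000)
)

// estimateBounds models the rough estimate (an assumption, see lead-in):
// hi is the SP observed at thread start, lo is hi - size + fudge.
func estimateBounds(sp, size, fudgeBytes uint64) (lo, hi uint64) {
	hi = sp
	lo = hi - size + fudgeBytes
	return lo, hi
}

func main() {
	trueLo := hypotheticalTop - stackSize
	lo, hi := estimateBounds(hypotheticalTop-spBelowTop, stackSize, fudge)

	fmt.Println("estimated span:", hi-lo)               // 8387584 (8 MiB - 1024)
	fmt.Println("lo falls below true lo by:", trueLo-lo) // 8592
}
```

Under these assumptions the estimated stack.lo lands 9616 - 1024 = 8592 bytes below the real low end of the stack. Even if the 9616 figure is off by a page in either direction, the result stays positive, so the recorded range still extends past the real stack.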

cc @cherrymui @aclements

Metadata

Assignees

No one assigned

    Labels

    FrozenDueToAge, NeedsFix (The path to resolution is known, but the work has not been done.)
