runtime: possible memory corruption caused by CL 304470 "cmd/compile, runtime: add metadata for argument printing in traceback" #49075

Closed
@katiehockman

Description

OSS-Fuzz reported an issue a few weeks ago that we suspect is memory corruption caused by the runtime. This started on August 16th, so is likely a Go 1.17 issue.

A "slice bounds out of range" panic is being reported from calls to regexp.MustCompile(`,\s*`).Split.
However, this is not reproducible with the inputs provided by OSS-Fuzz, so we expect something else is going on.

Below are some of the panic logs:

panic: runtime error: slice bounds out of range [:18416820578376] with length 59413

goroutine 17 [running, locked to thread]:
regexp.(*Regexp).Split(0x10c0000b2640, {0x10c0001c801f, 0x76dbc0}, 0xffffffffffffffff)
	regexp/regexp.go:1266 +0x61c
github.com/google/gonids.(*Rule).option(0x10c000068000, {0x100c000096970, {0x10c0001c8016, 0x8}}, 0x10c00029a040)
	github.com/google/gonids/parser.go:675 +0x36cf
github.com/google/gonids.parseRuleAux({0x10c0001c8000, 0x630000350400}, 0x0)
	github.com/google/gonids/parser.go:943 +0x6b3
github.com/google/gonids.ParseRule(...)
	github.com/google/gonids/parser.go:972
github.com/google/gonids.FuzzParseRule({0x630000350400, 0x0, 0x10c000000601})
	github.com/google/gonids/fuzz.go:20 +0x54
main.LLVMFuzzerTestOneInput(...)
	./main.1689543426.go:21

panic: runtime error: slice bounds out of range [628255583:13888]

goroutine 17 [running, locked to thread]:
regexp.(*Regexp).Split(0x10c0000b2640, {0x10c00033601f, 0x76dbc0}, 0xffffffffffffffff)
	regexp/regexp.go:1266 +0x617
github.com/google/gonids.(*Rule).option(0x10c00026cc00, {0x100c00026e190, {0x10c000336016, 0x7}}, 0x10c0001a4300)
	github.com/google/gonids/parser.go:675 +0x36cf
github.com/google/gonids.parseRuleAux({0x10c000336000, 0x62f00064a400}, 0x0)
	github.com/google/gonids/parser.go:943 +0x6b3
github.com/google/gonids.ParseRule(...)
	github.com/google/gonids/parser.go:972
github.com/google/gonids.FuzzParseRule({0x62f00064a400, 0x0, 0x10c000000601})
	github.com/google/gonids/fuzz.go:20 +0x54
main.LLVMFuzzerTestOneInput(...)
	./main.1689543426.go:21
AddressSanitizer:DEADLYSIGNAL

panic: runtime error: slice bounds out of range [473357973:29412]

goroutine 17 [running, locked to thread]:
regexp.(*Regexp).Split(0x10c0000b2640, {0x10c0002a001f, 0x76dbc0}, 0xffffffffffffffff)
	regexp/regexp.go:1266 +0x617
github.com/google/gonids.(*Rule).option(0x10c0001b0180, {0x100c000280100, {0x10c0002a0016, 0xb}}, 0x10c0001ae040)
	github.com/google/gonids/parser.go:675 +0x36cf
github.com/google/gonids.parseRuleAux({0x10c0002a0000, 0x632000930800}, 0x0)
	github.com/google/gonids/parser.go:943 +0x6b3
github.com/google/gonids.ParseRule(...)
	github.com/google/gonids/parser.go:972
github.com/google/gonids.FuzzParseRule({0x632000930800, 0x0, 0x10c000000601})
	github.com/google/gonids/fuzz.go:20 +0x54
main.LLVMFuzzerTestOneInput(...)
	./main.1689543426.go:21

From rsc@:

The relevant code is processing the [][]int returned from regexp.(*Regexp).FindAllStringIndex.
That [][]int is prepared by repeated append:

func (re *Regexp) FindAllStringIndex(s string, n int) [][]int {
    if n < 0 {
        n = len(s) + 1
    }
    var result [][]int
    re.allMatches(s, nil, n, func(match []int) {
        if result == nil {
            result = make([][]int, 0, startSize)
        }
        result = append(result, match[0:2])
    })
    return result
}

Each of the match[0:2] being appended is prepared in regexp.(*Regexp).doExecute by:

dstCap = append(dstCap, m.matchcap...)

appending to a zero-length, non-nil slice to copy m.matchcap.

And each of the m.matchcap is associated with the *regexp.machine m, which is kept in a sync.Pool for reuse.

The specific corruption is that the integers in the [][]int are clearly not valid indexes (they look like pointers),
which suggests that either one of the appends is losing the reference accidentally during GC
or something in sync.Pool is wonky.
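As a quick check of the "like pointers" observation, the bad bound from the first panic above can be printed in hexadecimal and compared with the receiver address in the same trace. The helper names are made up, and comparing the upper 32 bits is only a rough heuristic assumed here, not a runtime facility.

```go
package main

import "fmt"

const (
	// badBound is the out-of-range slice bound from the first panic log.
	badBound uint64 = 18416820578376
	// regexpPtr is the *Regexp receiver address from the same trace.
	regexpPtr uint64 = 0x10c0000b2640
)

// hexOf formats v the way addresses appear in the tracebacks.
func hexOf(v uint64) string { return fmt.Sprintf("%#x", v) }

// sameHighBits reports whether a and b agree in their upper 32 bits,
// a rough hint that both fall in the same address region.
func sameHighBits(a, b uint64) bool { return a>>32 == b>>32 }

func main() {
	fmt.Println(hexOf(badBound))                   // 0x10c0000c6848
	fmt.Println(sameHighBits(badBound, regexpPtr)) // true
}
```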

This could also be something strange that OSS-Fuzz is doing, and doesn't necessarily represent a real-world use case.

/cc @golang/security

Activity

added the NeedsInvestigation label (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.) on Oct 19, 2021
randall77 (Contributor) commented on Oct 19, 2021

How often does this happen? Is there any way we could reproduce?
If reproducible, we could turn off the sync.Pool usage and see if anything changes.

added this to the Go1.18 milestone on Oct 19, 2021
rsc (Contributor) commented on Oct 20, 2021

@catenacyber says it is still happening multiple times a day on OSS-Fuzz.

Philippe, do you have any hints about an easy way to get a reproduction case on our own machines?
I looked briefly at the instructions for running oss-fuzz itself and they were a bit daunting.

catenacyber (Contributor) commented on Oct 20, 2021

> How often does this happen?

Around 50 times a day on OSS-Fuzz.

> Is there any way we could reproduce?

I did not manage to reproduce it myself, though I did not try very hard.

> I looked briefly at the instructions for running oss-fuzz itself and they were a bit daunting.

Well, the hard part is that this bug does not reproduce for a specific input.
But running it locally should be OK; see https://google.github.io/oss-fuzz/getting-started/new-project-guide/#testing-locally
That is:

  • install docker
  • cd /path/to/oss-fuzz
  • python infra/helper.py build_image gonids
  • python infra/helper.py build_fuzzers gonids
  • python infra/helper.py run_fuzzer --corpus-dir=<path-to-temp-corpus-dir> gonids fuzz_parserule

Then, I guess you need to wait an hour and relaunch the fuzzer if it did not trigger the bug, repeating until it does.

> we could turn off the sync.Pool usage and see if anything changes.

Is there some environment variable to do so?

catenacyber (Contributor) commented on Oct 20, 2021

Maybe oss-fuzz uses -fork=2 as an extra argument to run_fuzzer

randall77 (Contributor) commented on Oct 20, 2021

> we could turn off the sync.Pool usage and see if anything changes.
>
> Is there some environment variable to do so?

No, you'd have to edit the code to replace pool allocations with new or make.
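A rough sketch of what such an edit could look like is below. The machine type, the usePool switch, and the get/put helpers are hypothetical and are not the regexp package's actual code; the point is only that each pool Get becomes a fresh allocation and each Put becomes a no-op.

```go
package main

import (
	"fmt"
	"sync"
)

// machine stands in for the pooled per-match state (hypothetical type).
type machine struct {
	matchcap []int
}

// usePool is a hypothetical compile-time switch; set it to true to
// restore pooling.
const usePool = false

var machinePool = sync.Pool{
	New: func() interface{} { return new(machine) },
}

// getMachine returns a pooled machine, or a fresh one when pooling
// is turned off.
func getMachine() *machine {
	if usePool {
		return machinePool.Get().(*machine)
	}
	return new(machine)
}

// putMachine hands m back to the pool, or drops it for the garbage
// collector when pooling is turned off.
func putMachine(m *machine) {
	if usePool {
		machinePool.Put(m)
	}
}

func main() {
	m1 := getMachine()
	putMachine(m1)
	m2 := getMachine()
	fmt.Println(m1 != m2) // true when usePool is false
}
```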

catenacyber (Contributor) commented on Oct 20, 2021

> Is there some environment variable to do so?
>
> No, you'd have to edit the code to replace pool allocations with new or make.

So I would need to rebuild the standard library?

randall77 (Contributor) commented on Oct 20, 2021

Just edit it; the rebuild will be automatic.

catenacyber (Contributor) commented on Oct 20, 2021

So, I opened google/oss-fuzz#6623, which patches regexp.go so that it does not use the sync package.

rsc (Contributor) commented on Oct 20, 2021

google/oss-fuzz#6623 looks worth a shot. Thanks.

catenacyber (Contributor) commented on Oct 22, 2021

It looks like the bug is still happening but much less often.

The latest stack trace is:

panic: runtime error: slice bounds out of range [:107271103185152] with length 45246

goroutine 17 [running, locked to thread]:
regexp.(*Regexp).Split(0x10c0000b2640, {0x10c00010801f, 0x76dbc0}, 0xffffffffffffffff)
	regexp/regexp.go:1260 +0x61c
github.com/google/gonids.(*Rule).option(0x10c00034a180, {0x100c0007c4350, {0x10c000108016, 0x6}}, 0x10c000334080)
	github.com/google/gonids/parser.go:675 +0x36cf
github.com/google/gonids.parseRuleAux({0x10c000108000, 0x62e0000d8400}, 0x0)
	github.com/google/gonids/parser.go:943 +0x6b3
github.com/google/gonids.ParseRule(...)
	github.com/google/gonids/parser.go:972
github.com/google/gonids.FuzzParseRule({0x62e0000d8400, 0x0, 0x10c000000601})
	github.com/google/gonids/fuzz.go:20 +0x54
main.LLVMFuzzerTestOneInput(...)
	./main.3230035416.go:21
AddressSanitizer:DEADLYSIGNAL

The line number regexp.go:1260 (instead of 1266 in the earlier traces) seems to confirm that this is the modified regexp.go without sync, right?

Any more clues ?
Any more debug assertions to insert in regexp.go ?

josharian (Contributor) commented on Oct 22, 2021

> This started on August 16th, so is likely a Go 1.17 issue.

Could you bisect to a particular commit that introduced the corruption?

catenacyber (Contributor) commented on Oct 22, 2021

> Could you bisect to a particular commit that introduced the corruption?

OSS-Fuzz uses the latest Go release, so it switched from 1.16.x to 1.17 on August 16th, but we do not know exactly which commit in that release introduced the buggy behavior.


Metadata

Assignees: No one assigned

Labels: FrozenDueToAge, NeedsInvestigation, WaitingForInfo, compiler/runtime

Status: Done

Participants: @tmm1, @josharian, @rsc, @cristaloleg, @dgryski