Skip to content

x/tools/gopls: out-of-bounds slice panic in bug in frob.(*reader).bytes #71244

@adonovan

Description

@adonovan
#!stacks
"runtime.goPanicSliceAcap" && "frob.(*reader).bytes"

Issue created by stacks.

func (r *reader) bytes(n int) []byte {
	v := r.data[:n]
	r.data = r.data[n:] // <--- panic
	return v
}

This stack MZjt-Q was reported by telemetry:

golang.org/x/tools/gopls@v0.17.1 go1.23.3 linux/amd64 vscode (1)

Activity

added
goplsIssues related to the Go language server, gopls.
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
ToolsThis label describes issues relating to any tools in the x/tools repository.
on Jan 13, 2025
added this to the Unreleased milestone on Jan 13, 2025
adonovan

adonovan commented on Jan 13, 2025

@adonovan
MemberAuthor

If the line number is accurate, the r.data[n:] operation is implicated, but it cannot fail if the r.data[:n] operation on the previous line succeeded. So this makes me doubt the line number.

Looking at the disassembly, the check whose failure calls panicSliceACap is the very first thing the function does after its prolog:

golang.org/x/tools/gopls/internal/util/frob.(*reader).bytes STEXT size=176 args=0x10 locals=0x18 funcid=0x0 align=0x0
	0x0000 00000 	L0390	TEXT	golang.org/x/tools/gopls/internal/util/frob.(*reader).bytes(SB), ABIInternal, $32-16
	0x0000 00000 	L0390	MOVD	16(g), R16
	0x0004 00004 	L0390	PCDATA	$0, $-2
	0x0004 00004 	L0390	CMP	R16, RSP
	0x0008 00008 	L0390	BLS	148
	0x000c 00012 	L0390	PCDATA	$0, $-1
	0x000c 00012 	L0390	MOVD.W	R30, -32(RSP)
	0x0010 00016 	L0390	MOVD	R29, -8(RSP)
	0x0014 00020 	L0390	SUB	$8, RSP, R29
	0x0018 00024 	L0390	FUNCDATA	$0, gclocals·2NSbawKySWs0upw55xaGlw==(SB)
	0x0018 00024 	L0390	FUNCDATA	$1, gclocals·ISb46fRPFoZ9pIfykFK/kQ==(SB)
	0x0018 00024 	L0390	FUNCDATA	$5, golang.org/x/tools/gopls/internal/util/frob.(*reader).bytes.arginfo1(SB)
	0x0018 00024 	L0390	FUNCDATA	$6, golang.org/x/tools/gopls/internal/util/frob.(*reader).bytes.argliveinfo(SB)
	0x0018 00024 	L0390	PCDATA	$3, $1
	0x0018 00024 	L0391	MOVD	16(R0), R2   ; r2 = r.data.cap
	0x001c 00028 	L0391	CMP	R2, R1     ; n < r.data.cap
	0x0020 00032 	L0391	BHI	140   ;  if n >= cap(r.data) ...
...
	0x008c 00140 	L0391	CALL	runtime.panicSliceAcap(SB)

This implicates the first slice operation---and means the line number is indeed off by one.

In any case, the frob decoder is clearly being fed an illegal short message that doesn't even contain the 4-byte magic number.

adonovan

adonovan commented on Jan 13, 2025

@adonovan
MemberAuthor

This looks like a consequence of execution continuing after the failed bug.Reportf in typerefData:

// typerefData retrieves encoded typeref data from the filecache, or computes it on
// a cache miss.
func (s *Snapshot) typerefData(ctx context.Context, id PackageID, imports map[ImportPath]*metadata.Package, cgfs []file.Handle) ([]byte, error) {
	key := typerefsKey(id, imports, cgfs)
	if data, err := filecache.Get(typerefsKind, key); err == nil {
		return data, nil
	} else if err != filecache.ErrNotFound {
		bug.Reportf("internal error reading typerefs data: %v", err) /// <-- should return an error
	}
...

The bug.Reportf should be turned into return bug.Errorf.

[Edit: nope, the code here looks fine. An unexpected error is just a cache miss. The problem must come either from Get returning a tiny or empty file and no error.]

self-assigned this
on Jan 13, 2025
gopherbot

gopherbot commented on Jan 13, 2025

@gopherbot
Contributor

Change https://go.dev/cl/642435 mentions this issue: gopls/internal/cache: Snapshot.typerefData: add missing error return

added
BugReportIssues describing a possible bug in the Go implementation.
on Jan 13, 2025
adonovan

adonovan commented on Jan 13, 2025

@adonovan
MemberAuthor

This is a really weird one. What we know:

  • Snapshot.typerefData returns an empty or short (<4) data and no error.
  • This must have come from filecache.Get, or from typerefs.Encode.
  • typerefs.Encode cannot return a string that doesn't have the magic number as a prefix, so it cannot be that.
  • filecache.Get can succeed either from the LRU cache, or from the complete sequence of I/O steps. The LRU cache doesn't seem to have any opportunity to truncate good data; therefore the I/O sequence must have returned a short/empty file.
  • Get's I/O sequence fails if the digest of the file content doesn't match the file name; so the short/empty file must have been written as such. Faulty storage hardware or external user meddling (e.g. deletion, renaming, edits, or truncation) could not cause this situation.

The only explanations I can think of are:

  • a hardware fault corrupted the slice after loading it. Seems very unlikely.
  • someone went to some trouble to intentionally creating a corresponding index/CAS file pair. Who would do that?

What am I missing?

findleyr

findleyr commented on Jan 13, 2025

@findleyr
Member

Skimming filecache.gc, it doesn't look like we care to delete the casfile before the content file. Therefore, doesn't it seem possible that we could read the content file concurrent to a deletion?

EDIT: err, I suppose the order of deletion doesn't really matter: the fact is we could try to read a content file concurrent with another process GCing that file.

EDIT2: nevermind, we'd fail the hash comparison, of course. OK, I agree that I don't understand this.

findleyr

findleyr commented on Jan 15, 2025

@findleyr
Member

I don't think we can fix this for v0.18.0, nor do I think we should avoid it.

Moving this to the backlog. We can revisit if there are additional occurrences.

adonovan

adonovan commented on Jan 22, 2025

@adonovan
MemberAuthor

I've refined the failure (to an explicit panic). Not sure what more to do for now.

added
gopls/memory-corruption"can't happen" gopls crashes (races, unsafe, miscompile, runtime bugs, faulty HW)
on Apr 10, 2025
removed their assignment
on Apr 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugReportIssues describing a possible bug in the Go implementation.NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.ToolsThis label describes issues relating to any tools in the x/tools repository.goplsIssues related to the Go language server, gopls.gopls/memory-corruption"can't happen" gopls crashes (races, unsafe, miscompile, runtime bugs, faulty HW)gopls/telemetry-wins

    Type

    Projects

    No projects

    Development

    No branches or pull requests

      Participants

      @adonovan@gopherbot@findleyr@gabyhelp

      Issue actions

        x/tools/gopls: out-of-bounds slice panic in bug in frob.(*reader).bytes · Issue #71244 · golang/go