Skip to content

cmd/compile: msan cannot handle structs with padding #26167

Open
@benesch

Description

@benesch

What version of Go are you using (go version)?

go version devel +0dc814c Sat Jun 30 01:04:30 2018 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

linux/amd64

What did you do?

Attempted to dereference a C struct with padding from Go with the memory sanitizer enabled:

// msan.go

package main

/*
#include <stdlib.h>

struct s {
 	int i;
	char c;
};

struct s* mks(void) {
	struct s* s = malloc(sizeof(struct s));
	s->i = 0xdeadbeef;
	s->c = 'n';
	return s;
}
*/
import "C"
import "fmt"

func main() {
	s := *C.mks()
	fmt.Println(s.c)
}

I compiled with:

CC=clang-6.0 CXX=clang++-6.0 go build -msan -o msan msan.go

Upon execution, msan crashes spuriously:

~/go1.10/misc/cgo/testsanitizers/src/foo$ ./msan
Uninitialized bytes in __msan_check_mem_is_initialized at offset 5 inside [0x701000000000, 8)
==21637==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e8b08  (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08)

SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08) 
Exiting

The problem appears to be that the Go instrumentation is coarser than the C instrumentation. Only bytes 0-5 are marked as initialized by C (bytes 6-8 are padding), but Go asks msan to verify that all 8 bytes are initialized when it stores s.

In this particular example, there are two easy ways to soothe msan. The first is to remove the padding from the struct (e.g., struct s { int i; int c }). The second is to access fields within the struct without storing it into a temporary (e.g., fmt.Println(C.mks().c)). Neither of these "workarounds" are viable for real programs.

Activity

benesch

benesch commented on Jul 1, 2018

@benesch
ContributorAuthor

Well, this patch makes the test pass but causes a slew of problems on real programs:

--- a/src/cmd/compile/internal/gc/ssa.go
+++ b/src/cmd/compile/internal/gc/ssa.go
@@ -600,7 +600,13 @@ func (s *state) instrument(t *types.Type, addr *ssa.Value, wr bool) {
        var fn *obj.LSym
        needWidth := false
 
-       if flag_msan {
+       if flag_msan && t.Etype == TSTRUCT {
+               for _, f := range t.FieldSlice() {
+                       addr = s.newValue1I(ssa.OpOffPtr, types.NewPtr(t), f.Offset, addr)
+                       s.instrument(f.Type, addr, wr)
+               }
+               return
+       } else if flag_msan {
                fn = msanread
                if wr {
                        fn = msanwrite

Hmm.

added
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Jul 1, 2018
added this to the Go1.12 milestone on Jul 1, 2018
ALTree

ALTree commented on Jul 1, 2018

@ALTree
Member

I assume this also happens with go1.10 (so it's not a recent regression)?

benesch

benesch commented on Jul 1, 2018

@benesch
ContributorAuthor

Yep. Just verified that it happens with go1.10.3. Given that this case isn't tested by any of the sample programs in misc/cgo/testsanitizers I wouldn't be surprised if it's been broken since msan support was introduced.

benesch

benesch commented on Jul 1, 2018

@benesch
ContributorAuthor

Oh boy. This doesn't seem particularly easy to fix. LLVM's msan transformation (https://github.com/llvm-mirror/llvm/blob/6fed3d8070003bd65e0af3b70aaa04f7ca12260e/lib/Transforms/Instrumentation/MemorySanitizer.cpp#L1431) is more nuanced than gc's. Specifically, LLVM doesn't consider a load or a store of uninitialized memory to be problematic. It's what you do afterwards that counts. Your program won't blow up until e.g. you perform arithmetic using uninitialized memory or index into an array using uninitialized memory.

ianlancetaylor

ianlancetaylor commented on Jul 1, 2018

@ianlancetaylor
Contributor

Perhaps we could modify gc to only call msanread/msanwrite for non-padding bytes when copying struct types.

changed the title [-]runtime/msan: cannot handle structs with padding[/-] [+]cmd/compile: msan cannot handle structs with padding[/+] on Jul 1, 2018
benesch

benesch commented on Jul 1, 2018

@benesch
ContributorAuthor
ianlancetaylor

ianlancetaylor commented on Jul 1, 2018

@ianlancetaylor
Contributor

Sorry, I didn't even know the compiler added padding fields at all. In widstruct in cmd/compile/internal/gc/align.go it just sets offsets based on alignment, implicitly adding padding.

benesch

benesch commented on Jul 1, 2018

@benesch
ContributorAuthor

Oh! Perhaps it's cmd/cgo that's to blame. Thanks. I'll keep digging.

ianlancetaylor

ianlancetaylor commented on Jul 1, 2018

@ianlancetaylor
Contributor

Ah, yes, cmd/cgo does add padding fields to the Go version of C structs in (*typeconv).pad in cmd/cgo/gcc.go.

benesch

benesch commented on Jul 1, 2018

@benesch
ContributorAuthor

Ok, great. Excluding the blank identifiers that cmd/cgo writes from msan instrumentation lets Cockroach get even further in boot.

In other news, I've stumbled across an edge case that I'm not sure what to make of. Consider a program that uses an ostensibly-initialized C struct as a Go map key:

// msan.go

package main

/*
#include <stdlib.h>

struct s {
 	int i;
	char c;
};

struct s* mks(void) {
	struct s* s = malloc(sizeof(struct s));
	s->i = 0xdeadbeef;
	s->c = 'n';
	return s;
}
*/
import "C"

func main() {
	m := map[C.struct_s]struct{}{}
	m[*C.mks()] = struct{}{}
}

Go will read the uninitialized padding bytes to hash the struct for the map key. Yikes! Though perhaps a valid answer here is "don't do that."

josharian

josharian commented on Jul 1, 2018

@josharian
Contributor

I’m AFK but I believe Syms have an IsBlank method.

josharian

josharian commented on Jul 1, 2018

@josharian
Contributor

Go will read the uninitialized padding bytes to hash the struct for the map key.

The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.

benesch

benesch commented on Jul 2, 2018

@benesch
ContributorAuthor

The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.

Ah. Then perhaps mapassign and friends are simply reporting too coarsely to msan. Maybe here:

msanread(key, t.key.size)

benesch

benesch commented on Jul 2, 2018

@benesch
ContributorAuthor

I’m AFK but I believe Syms have an IsBlank method.

Yep, thanks. I've included the patch I've been experimenting with below. (I'm sure there's somewhere better to put this code.)

diff --git a/src/cmd/compile/internal/gc/ssa.go b/src/cmd/compile/internal/gc/ssa.go
index ff2b93d..234954d 100644
--- a/src/cmd/compile/internal/gc/ssa.go
+++ b/src/cmd/compile/internal/gc/ssa.go
@@ -600,6 +600,16 @@ func (s *state) instrument(t *types.Type, addr *ssa.Value, wr bool) {
        var fn *obj.LSym
        needWidth := false
 
+       if flag_msan && t.IsStruct() {
+               for _, f := range t.FieldSlice() {
+                       if !f.Sym.IsBlank() {
+                               faddr := s.newValue1I(ssa.OpOffPtr, f.Type.PtrTo(), f.Offset, addr)
+                               s.instrument(f.Type, faddr, wr)
+                       }
+               }
+               return
+       }
+
        if flag_msan {
                fn = msanread
                if wr {
josharian

josharian commented on Jul 2, 2018

@josharian
Contributor

Then perhaps mapassign and friends are simply reporting too coarsely to msan

Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.

Also, not that it matters for this conversation, but it appears to me (on my phone) that the msanread call there could be moved inside the “if map is nil” return path. Might improve perf.

The ideal way to share a patch is gerrit. Or a PR (which gets imported to gerrit).

benesch

benesch commented on Jul 2, 2018

@benesch
ContributorAuthor

Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.

Yep, definitely an option. Or the calls to msanread could be moved closer to the actual memory reads in the hash functions.

josharian

josharian commented on Jul 2, 2018

@josharian
Contributor

I believe the msanread call is there only to handle the case in which we don’t call the hash function, which should have its own instrumentation.

modified the milestones: Go1.12, Go1.13 on Feb 12, 2019
modified the milestones: Go1.13, Go1.14 on Jul 8, 2019
modified the milestones: Go1.14, Backlog on Oct 9, 2019
moved this to Triage Backlog in Go Compiler / Runtimeon Jul 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.compiler/runtimeIssues related to the Go compiler and/or runtime.

    Type

    No type

    Projects

    Status

    Triage Backlog

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @josharian@rsc@andybons@benesch@ianlancetaylor

        Issue actions

          cmd/compile: msan cannot handle structs with padding · Issue #26167 · golang/go