Description
What version of Go are you using (go version)?
go version devel +0dc814c Sat Jun 30 01:04:30 2018 +0000 linux/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
linux/amd64
What did you do?
Attempted to dereference a C struct with padding from Go with the memory sanitizer enabled:
// msan.go
package main

/*
#include <stdlib.h>

struct s {
    int i;
    char c;
};

struct s* mks(void) {
    struct s* s = malloc(sizeof(struct s));
    s->i = 0xdeadbeef;
    s->c = 'n';
    return s;
}
*/
import "C"

import "fmt"

func main() {
    s := *C.mks()
    fmt.Println(s.c)
}
I compiled with:
CC=clang-6.0 CXX=clang++-6.0 go build -msan -o msan msan.go
Upon execution, msan crashes spuriously:
~/go1.10/misc/cgo/testsanitizers/src/foo$ ./msan
Uninitialized bytes in __msan_check_mem_is_initialized at offset 5 inside [0x701000000000, 8)
==21637==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x4e8b08 (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08)
SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/benesch/go1.10/misc/cgo/testsanitizers/src/foo/msan+0x4e8b08)
Exiting
The problem appears to be that the Go instrumentation is coarser than the C instrumentation. Only bytes 0-4 are marked as initialized by C (bytes 5-7 are padding), but Go asks msan to verify that all 8 bytes are initialized when it stores s.
In this particular example, there are two easy ways to soothe msan. The first is to remove the padding from the struct (e.g., struct s { int i; int c; }). The second is to access fields within the struct without storing it into a temporary (e.g., fmt.Println(C.mks().c)). Neither of these "workarounds" is viable for real programs.
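The layout claim above can be checked from pure Go. The following is a minimal sketch, not part of the original report: the type s and the helper paddingRange are illustrative names, and the Go struct only mirrors the C struct under the assumption of a 4-byte int with 4-byte alignment (as on linux/amd64).

```go
package main

import (
	"fmt"
	"unsafe"
)

// s mirrors the C struct from the report, assuming a 4-byte int
// with 4-byte alignment (true on linux/amd64).
type s struct {
	i int32
	c byte
}

// paddingRange returns the half-open byte range [start, end) of the
// trailing padding in s. It is a hypothetical helper for illustration.
func paddingRange() (start, end uintptr) {
	var v s
	start = unsafe.Offsetof(v.c) + unsafe.Sizeof(v.c) // first byte after c
	end = unsafe.Sizeof(v)                            // total size including padding
	return start, end
}

func main() {
	start, end := paddingRange()
	fmt.Printf("size=%d, padding=[%d, %d)\n", unsafe.Sizeof(s{}), start, end)
}
```

The padding range starts at offset 5, which matches the offset msan reports in the output above.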
benesch commented on Jul 1, 2018
Well, this patch makes the test pass but causes a slew of problems on real programs:
Hmm.
ALTree commented on Jul 1, 2018
I assume this also happens with go1.10 (so it's not a recent regression)?
benesch commented on Jul 1, 2018
Yep. Just verified that it happens with go1.10.3. Given that this case isn't tested by any of the sample programs in misc/cgo/testsanitizers, I wouldn't be surprised if it's been broken since msan support was introduced.
benesch commented on Jul 1, 2018
Oh boy. This doesn't seem particularly easy to fix. LLVM's msan transformation (https://github.com/llvm-mirror/llvm/blob/6fed3d8070003bd65e0af3b70aaa04f7ca12260e/lib/Transforms/Instrumentation/MemorySanitizer.cpp#L1431) is more nuanced than gc's. Specifically, LLVM doesn't consider a load or a store of uninitialized memory to be problematic. It's what you do afterwards that counts. Your program won't blow up until e.g. you perform arithmetic using uninitialized memory or index into an array using uninitialized memory.
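To make the distinction in this comment concrete, here is a toy model in Go of the two policies. It is only a sketch: the cell type and the functions copyStrict, copyLazy, use, and newStruct are invented for illustration and do not correspond to LLVM's or the Go runtime's actual data structures.

```go
package main

import "fmt"

// cell is one byte of memory plus a shadow bit.
// Toy model only; not how LLVM or the Go runtime store shadow state.
type cell struct {
	val    byte
	poison bool // true means "uninitialized"
}

// copyStrict models the coarse policy: copying any poisoned byte is an error.
func copyStrict(dst, src []cell) error {
	for i, c := range src {
		if c.poison {
			return fmt.Errorf("uninitialized byte at offset %d", i)
		}
	}
	copy(dst, src)
	return nil
}

// copyLazy models shadow propagation: the copy always succeeds and
// the poison bits travel with the data.
func copyLazy(dst, src []cell) {
	copy(dst, src)
}

// use models an operation that depends on the value, such as
// arithmetic or array indexing. Only here is poison reported.
func use(c cell) (byte, error) {
	if c.poison {
		return 0, fmt.Errorf("use of uninitialized value")
	}
	return c.val, nil
}

// newStruct models struct s from the report: bytes 0-4 are
// initialized, bytes 5-7 are padding and remain poisoned.
func newStruct() []cell {
	m := make([]cell, 8)
	m[4].val = 'n'
	for i := 5; i < 8; i++ {
		m[i].poison = true
	}
	return m
}

func main() {
	src := newStruct()
	fmt.Println("strict copy:", copyStrict(make([]cell, 8), src))

	dst := make([]cell, 8)
	copyLazy(dst, src)
	v, err := use(dst[4])
	fmt.Printf("lazy copy, then use of field c: %c %v\n", v, err)
}
```

Under the strict policy the copy itself fails at offset 5, which mirrors the report above. Under the lazy policy the copy succeeds and reading field c is fine, because the padding is never used.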
ianlancetaylor commented on Jul 1, 2018
Perhaps we could modify gc to only call msanread/msanwrite for non-padding bytes when copying struct types.
Title changed from "runtime/msan: cannot handle structs with padding" to "cmd/compile: msan cannot handle structs with padding"
benesch commented on Jul 1, 2018
ianlancetaylor commented on Jul 1, 2018
Sorry, I didn't even know the compiler added padding fields at all. In widstruct in cmd/compile/internal/gc/align.go it just sets offsets based on alignment, implicitly adding padding.
benesch commented on Jul 1, 2018
Oh! Perhaps it's cmd/cgo that's to blame. Thanks. I'll keep digging.
ianlancetaylor commented on Jul 1, 2018
Ah, yes, cmd/cgo does add padding fields to the Go version of C structs in (*typeconv).pad in cmd/cgo/gcc.go.
benesch commented on Jul 1, 2018
Ok, great. Excluding the blank identifiers that cmd/cgo writes from msan instrumentation lets Cockroach get even further in boot.
In other news, I've stumbled across an edge case that I'm not sure what to make of. Consider a program that uses an ostensibly-initialized C struct as a Go map key:
Go will read the uninitialized padding bytes to hash the struct for the map key. Yikes! Though perhaps a valid answer here is "don't do that."
josharian commented on Jul 1, 2018
I’m AFK but I believe Syms have an IsBlank method.
josharian commented on Jul 1, 2018
The autogenerated hash and eq functions should ignore blank fields in structs. If they don’t, it’s a bug.
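The behavior described here can be observed without cgo. The sketch below is not the program from the earlier comment; it uses a hand-written struct with an explicit blank padding field (an assumption about roughly what cmd/cgo emits) and a hypothetical scribble helper that writes junk into the padding through unsafe, standing in for uninitialized bytes coming from C.

```go
package main

import (
	"fmt"
	"unsafe"
)

// key stands in for a cgo-generated struct with explicit blank padding.
// The exact shape cmd/cgo emits is an assumption here.
type key struct {
	i int32
	c byte
	_ [3]byte
}

// scribble overwrites the padding bytes (offsets 5-7) of k with junk,
// simulating padding that C never initialized. Hypothetical helper.
func scribble(k *key) {
	raw := (*[8]byte)(unsafe.Pointer(k)) // key is 8 bytes: 4 + 1 + 3
	for j := 5; j < len(raw); j++ {
		raw[j] = 0xff
	}
}

// demo reports whether two keys that differ only in padding compare
// equal, and whether one can be used to look up the other in a map.
func demo() (equal, found bool) {
	a := key{i: 1, c: 'n'}
	b := a
	scribble(&b)

	m := map[key]string{a: "hit"}
	_, found = m[b]
	return a == b, found
}

func main() {
	equal, found := demo()
	fmt.Println(equal, found)
}
```

The Go specification defines struct equality over non-blank fields only, so both results are true. That is consistent with the point that the generated hash and eq functions skip blank fields, which leaves the coarse msan reporting as the remaining suspect.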
benesch commented on Jul 2, 2018
Ah. Then perhaps mapassign and friends are simply reporting too coarsely to msan. Maybe here: go/src/runtime/map.go, line 447 in 28f9b88.
benesch commented on Jul 2, 2018
Yep, thanks. I've included the patch I've been experimenting with below. (I'm sure there's somewhere better to put this code.)
josharian commented on Jul 2, 2018
Looks like it. Perhaps there should be an msanreadtype that takes in type information and uses it to avoid reading blank fields.
Also, not that it matters for this conversation, but it appears to me (on my phone) that the msanread call there could be moved inside the “if map is nil” return path. Might improve perf.
The ideal way to share a patch is gerrit. Or a PR (which gets imported to gerrit).
benesch commented on Jul 2, 2018
Yep, definitely an option. Or the calls to msanread could be moved closer to the actual memory reads in the hash functions.
josharian commented on Jul 2, 2018
I believe the msanread call is there only to handle the case in which we don’t call the hash function, which should have its own instrumentation.