runtime: append to nil slice slower than make and copy #14718

nvanbenschoten · 2016-03-08T19:17:10Z

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?
go version go1.6 darwin/amd64
What operating system and processor architecture are you using (go env)?
GOARCH="amd64" GOOS="darwin"
What did you do?
(Use play.golang.org to provide a runnable example, if possible.)

package main

import (
        "testing"
)

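// c is the 22-byte source slice copied by both benchmarks.
var c = []byte("1789678900001234567890")

// BenchmarkBBytes copies c with the make and copy pattern.
func BenchmarkBBytes(b *testing.B) {
        for i := 0; i < b.N; i++ {
                d := make([]byte, len(c))
                copy(d, c)
        }
}

// BenchmarkBString copies c by appending to a nil slice.
func BenchmarkBString(b *testing.B) {
        for i := 0; i < b.N; i++ {
                _ = append([]byte(nil), c...)
        }
}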

What did you expect to see?

The two patterns would produce the same code and have the same performance.

What did you see instead?

Instead, the make and copy pattern (BenchmarkBBytes) seems to be faster by about 12 ns, or 16%. See cockroachdb/cockroach#4963.

BenchmarkBBytes-4   20000000            59.6 ns/op        32 B/op          1 allocs/op
BenchmarkBString-4  20000000            71.0 ns/op        32 B/op          1 allocs/op


bradfitz · 2016-03-08T19:20:46Z

/cc @randall77

cespare · 2016-03-08T19:43:30Z

On my machine, both benchmarks improve from 1.6 -> tip (SSA) but the gap between them remains.

name       old time/op  new time/op  delta
BBytes-4   82.1ns ± 1%  72.7ns ± 2%  -11.43%   (p=0.000 n=9+9)
BString-4  95.2ns ± 2%  90.7ns ± 2%   -4.76%  (p=0.000 n=9+10)

randall77 · 2016-03-08T23:00:09Z

The major reason is that when you do make([]byte, n), you get an allocation with len==cap==n. The runtime doesn't have to do much work to decide what to allocate and what to zero.

When you do append(nil, ..slice of n bytes..), you actually get more capacity. In this case, you get a slice with a length of 22 and a capacity of 32. The increased capacity is there to anticipate future appends. The runtime spends some additional time computing what the right capacity should be and initializing those extra bytes.
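The length and capacity difference described above can be observed directly. The following is a minimal sketch, not code from the runtime; the helper names `copyWithMake` and `copyWithAppend` are made up for illustration. The capacity printed for the append case depends on the Go version and its allocator size classes, so no particular value is assumed here beyond it being at least the length.

```go
package main

import "fmt"

// src mirrors the 22-byte input used by the benchmarks in the report.
var src = []byte("1789678900001234567890")

// copyWithMake (hypothetical helper) copies s with the make and copy pattern.
// The language spec guarantees len == cap == len(s) for the result.
func copyWithMake(s []byte) []byte {
	d := make([]byte, len(s))
	copy(d, s)
	return d
}

// copyWithAppend (hypothetical helper) copies s by appending to a nil slice.
// The runtime chooses the capacity, which may be larger than len(s).
func copyWithAppend(s []byte) []byte {
	return append([]byte(nil), s...)
}

func main() {
	a := copyWithMake(src)
	b := copyWithAppend(src)
	fmt.Printf("make+copy: len=%d cap=%d\n", len(a), cap(a))
	fmt.Printf("append:    len=%d cap=%d\n", len(b), cap(b))
}
```

If the extra capacity is never used, the make and copy form avoids both the capacity computation and the larger allocation.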

Also, the append path generally has more overhead, see #11419 for an example.

We should be able to fix some of this, but probably not all. For instance, there is a divide to compute how many extra elements we can fit when we round the allocation up to a size class (runtime/slice.go:92). That divide would be hard to get rid of, and probably accounts for a good chunk of the difference you're seeing.
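To make the role of the divide concrete, here is a rough sketch of the shape of that computation. It is an illustration only, not the code in runtime/slice.go: the size class table, `roundUpSize`, and `grownCap` are all hypothetical, and the real allocator's classes differ between Go versions.

```go
package main

import "fmt"

// hypotheticalSizeClasses is an illustrative table of allocation sizes in
// bytes. It is an assumption for this sketch, NOT the runtime's real table.
var hypotheticalSizeClasses = []int{8, 16, 32, 48, 64, 80, 96, 112, 128}

// roundUpSize returns the smallest hypothetical size class that can hold
// n bytes, or n itself if n is larger than every class in the table.
func roundUpSize(n int) int {
	for _, c := range hypotheticalSizeClasses {
		if n <= c {
			return c
		}
	}
	return n
}

// grownCap returns how many elements of elemSize bytes fit once a request
// for n elements is rounded up to a size class. The division here stands in
// for the divide mentioned above. elemSize is assumed to be greater than 0.
func grownCap(n, elemSize int) int {
	return roundUpSize(n*elemSize) / elemSize
}

func main() {
	fmt.Println(grownCap(22, 1)) // 22 bytes round up to 32, so 32 elements
	fmt.Println(grownCap(5, 8))  // 40 bytes round up to 48, so 6 elements
}
```

With make([]byte, n) the requested length is also the capacity, so no such division is needed to compute it.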

bradfitz · 2016-03-08T23:13:34Z

Closing this bug, then. Sounds like it's working as expected.

nvanbenschoten mentioned this issue Mar 8, 2016

roachpb: Use make and copy pattern for high profile byte slice copies cockroachdb/cockroach#4963

Merged

bradfitz added the Performance label Mar 8, 2016

bradfitz added this to the Unplanned milestone Mar 8, 2016

bradfitz closed this as completed Mar 8, 2016

martisch mentioned this issue Mar 10, 2016

cmd/compile: optimize append performance #14758

Closed

golang locked and limited conversation to collaborators Mar 13, 2017

gopherbot added the FrozenDueToAge label Mar 13, 2017
