runtime: append to nil slice slower than make and copy #14718

Closed
nvanbenschoten opened this issue Mar 8, 2016 · 4 comments
@nvanbenschoten

Please answer these questions before submitting your issue. Thanks!

  • What version of Go are you using (go version)?
    go version go1.6 darwin/amd64
  • What operating system and processor architecture are you using (go env)?
    GOARCH="amd64" GOOS="darwin"
  • What did you do?
    (Use play.golang.org to provide a runnable example, if possible.)
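package main

import (
        "testing"
)

// c is the 22-byte source slice that both benchmarks copy.
var c = []byte("1789678900001234567890")

// BenchmarkBBytes copies c with make followed by copy.
func BenchmarkBBytes(b *testing.B) {
        for i := 0; i < b.N; i++ {
                d := make([]byte, len(c))
                copy(d, c)
        }
}

// BenchmarkBString copies c by appending it to a nil slice.
func BenchmarkBString(b *testing.B) {
        for i := 0; i < b.N; i++ {
                _ = append([]byte(nil), c...)
        }
}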
  • What did you expect to see?

The two patterns would produce the same code and have the same performance.

  • What did you see instead?

Instead, the make and copy pattern (BenchmarkBBytes) seems to be faster by about 12 ns, or 16%. See cockroachdb/cockroach#4963.

BenchmarkBBytes-4   20000000            59.6 ns/op        32 B/op          1 allocs/op
BenchmarkBString-4  20000000            71.0 ns/op        32 B/op          1 allocs/op
@bradfitz

bradfitz commented Mar 8, 2016

/cc @randall77

@cespare

cespare commented Mar 8, 2016

On my machine, both benchmarks improve from 1.6 -> tip (SSA) but the gap between them remains.

name       old time/op  new time/op  delta
BBytes-4   82.1ns ± 1%  72.7ns ± 2%  -11.43%   (p=0.000 n=9+9)
BString-4  95.2ns ± 2%  90.7ns ± 2%   -4.76%  (p=0.000 n=9+10)

@randall77

The major reason is that when you do make([]byte, n), you get an allocation with len==cap==n. The runtime doesn't have to do much work to decide what to allocate and what to zero.

When you do append(nil, ..slice of n bytes..), you actually get more capacity. In this case, you get a slice with a length of 22 and a capacity of 32. The increased capacity is there to anticipate future appends. The runtime spends some additional time computing what the right capacity should be and initializing those extra bytes.
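The difference in capacity can be observed directly. The following is a minimal sketch, not part of the original report; the helper names viaMake and viaAppend are made up for illustration. The exact capacity that append returns depends on the Go version and its allocator size classes, so the program prints the values rather than assuming them.

```go
package main

import "fmt"

// src mirrors the 22-byte slice used in the benchmark above.
var src = []byte("1789678900001234567890")

// viaMake (hypothetical helper) copies src with make and copy.
// The result always has len == cap == len(src).
func viaMake(src []byte) []byte {
	dst := make([]byte, len(src))
	copy(dst, src)
	return dst
}

// viaAppend (hypothetical helper) copies src by appending to a nil slice.
// The runtime may round the capacity up past len(src).
func viaAppend(src []byte) []byte {
	return append([]byte(nil), src...)
}

func main() {
	m := viaMake(src)
	a := viaAppend(src)
	fmt.Printf("make+copy: len=%d cap=%d\n", len(m), cap(m))
	fmt.Printf("append:    len=%d cap=%d\n", len(a), cap(a))
}
```

Both results hold the same bytes; only the spare capacity, and the work needed to compute and initialize it, differs.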

Also, the append path generally has more overhead; see #11419 for an example.

We should be able to fix some of this, but probably not all. For instance, there is a divide to compute how many extra elements we can fit when we round the allocation up to a size class (runtime/slice.go:92). That divide would be hard to get rid of, and probably accounts for a good chunk of the difference you're seeing.
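To show where that divide comes from, here is a simplified sketch of the rounding step. It is not the runtime's code: the sizeClasses table is an abbreviated, assumed list, and roundUpSize and grownCap are hypothetical names. The idea is that the requested byte count is rounded up to an allocator size class, then divided by the element size to find how many elements actually fit.

```go
package main

import "fmt"

// sizeClasses is an assumed, abbreviated size-class table in bytes.
// The real runtime table is longer and varies between Go versions.
var sizeClasses = []uintptr{8, 16, 32, 48, 64, 80, 96, 112, 128}

// roundUpSize (hypothetical) returns the smallest size class that holds size,
// or size itself when it exceeds every class in this small table.
func roundUpSize(size uintptr) uintptr {
	for _, class := range sizeClasses {
		if size <= class {
			return class
		}
	}
	return size
}

// grownCap (hypothetical) returns how many elements of elemSize bytes fit in
// the size class chosen for want elements. elemSize must be greater than zero.
// The division here corresponds to the divide discussed above.
func grownCap(want, elemSize uintptr) uintptr {
	return roundUpSize(want*elemSize) / elemSize
}

func main() {
	fmt.Println(grownCap(22, 1)) // 22 bytes -> 32-byte class -> prints 32
	fmt.Println(grownCap(3, 8))  // 24 bytes -> 32-byte class -> prints 4
	fmt.Println(grownCap(5, 12)) // 60 bytes -> 64-byte class -> prints 5
}
```

With one-byte elements the division is trivial, but a general growth routine has to handle any element size, which is why the divide is hard to eliminate.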

@bradfitz

bradfitz commented Mar 8, 2016

Closing this bug, then. Sounds like it's working as expected.

@bradfitz bradfitz closed this as completed Mar 8, 2016
@golang golang locked and limited conversation to collaborators Mar 13, 2017