Skip to content

cmd/compile: teach the compiler that arguments passed to {strings|bytes}.{Index*|Count|Has*} don't escape #25864

Closed
@valyala

Description

@valyala

The issue

Go tip doesn't know that string / []byte arguments passed to functions like strings.IndexByte don't escape. This leads to unnecessary allocation and copy in the following code:

package stringsIndex_test

import (
        "strings"
        "testing"
)


func BenchmarkStringsIndexByte(b *testing.B) {
        bb := []byte("foobar baz")
        b.ReportAllocs()
        for i := 0; i < b.N; i++ {
                // string(bb) unnecessarily allocates new string and copies bb contents to it.
                n := strings.IndexByte(string(bb), ' ')
                Sink += n
        }
}

var Sink int

Benchmark results indicate an unnecessary allocation:

BenchmarkStringsIndexByte 	50000000	        30.5 ns/op	      16 B/op	       1 allocs/op

The solution

Mark strings and bytes functions accepting string / []byte as go:noescape in order to eliminate unnecessary allocations in the code above.

Activity

added
NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.
on Jun 13, 2018
added this to the Go1.12 milestone on Jun 13, 2018
ianlancetaylor

ianlancetaylor commented on Jun 13, 2018

@ianlancetaylor
Contributor

CC @dr2chase

I don't think we want to start using go:noescape in places like this. But perhaps the compiler can figure it out.

valyala

valyala commented on Jun 13, 2018

@valyala
ContributorAuthor

I don't think we want to start using go:noescape in places like this. But perhaps the compiler can figure it out.

I think it will be too difficult figuring out this for functions written in assembly. Probably, it would be OK adding go:noescape for assembly-written functions?

valyala

valyala commented on Jun 13, 2018

@valyala
ContributorAuthor

This change may have significant positive impact on standard packages, since then escape analysis may prove that arguments to many functions written in Go that call strings.Index* and co don't escape.

ianlancetaylor

ianlancetaylor commented on Jun 13, 2018

@ianlancetaylor
Contributor

Yes, sorry, it's fine to use go:noescape for functions written in assembly.

mundaym

mundaym commented on Jun 13, 2018

@mundaym
Member

IndexByte is already marked with go:noescape:

//go:noescape
func IndexByte(b []byte, c byte) int

I suspect this is just a case where the string cast could be cleverer.

randall77

randall77 commented on Jun 13, 2018

@randall77
Contributor

I guess we could fix this, but why are you using strings.IndexByte(string(bb), ' ') instead of bytes.IndexByte(bb, ' ')?

valyala

valyala commented on Jun 15, 2018

@valyala
ContributorAuthor

IndexByte is already marked with go:noescape

Then the []byte -> string conversion with a memory allocation and a copy for non-escaped argument looks like a bug.

why are you using strings.IndexByte(string(bb), ' ') instead of bytes.IndexByte(bb, ' ')

This is just an artificial example :)
The strings.Index(string(b), stringPrefix) is more real-life. It may be used interchangeably with bytes.Index(b, []byte(stringPrefix)), since there is no a preferred code here.

quasilyte

quasilyte commented on Sep 19, 2018

@quasilyte
Contributor

The non-escaping string(b) still can allocate if it doesn't fit the small tmp buffer.
I think #27148 (comment) can be relevant here.

bytes.Index(b, []byte(stringPrefix))

This is a preferred way for now, since prefix is usually smaller than haystack itself, and it can probably fit fixes 32-byte stack allocated buffer.

quasilyte

quasilyte commented on Sep 19, 2018

@quasilyte
Contributor

Here is a simple benchmarking code:

package foo

import (
	"bytes"
	"strings"
	"testing"
)

var haystack = bytes.Repeat([]byte("a"), 100)
var needle = "aaa"

func BenchmarkStringsIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = strings.Index(string(haystack), needle)
	}
}

func BenchmarkBytesIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = bytes.Index(haystack, []byte(needle))
	}
}

The results:

BenchmarkStringsIndex-8 20000000  73.3 ns/op 112 B/op 1 allocs/op
BenchmarkBytesIndex-8   100000000 15.1 ns/op   0 B/op 0 allocs/op

I've written a simple check that detects such calls that can be replaced with bytes.Index to aid in optimization sessions of big projects (https://go-critic.github.io/overview#indexAlloc-ref).

added a commit that references this issue on Sep 19, 2018
gopherbot

gopherbot commented on Oct 30, 2018

@gopherbot
Contributor

Change https://golang.org/cl/146018 mentions this issue: strings: declare Index as noescape

randall77

randall77 commented on Oct 30, 2018

@randall77
Contributor

The CL I just mailed fixes strings.Index. I think all the other methods mentioned in the issue title are already ok. Let me know if that doesn't mesh with your understanding.

locked and limited conversation to collaborators on Oct 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    FrozenDueToAgeNeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @valyala@ianlancetaylor@quasilyte@randall77@gopherbot

        Issue actions

          cmd/compile: teach the compiler that arguments passed to {strings|bytes}.{Index*|Count|Has*} don't escape · Issue #25864 · golang/go