Description
It would be nice to optimize Write([]byte(stringVal))
to not copy the string value. This is normally safe because most Write
methods do not modify the byte slice passed in. In fact, the documentation for io.Writer
requires that the Write
method not modify the byte slice, although this is not (and can not be) enforced.
Here is how we can do this.
For each function and method that takes a parameter of slice type, record whether the slice's underlying array is modified. This will require generating annotations similar to the ones we generate for escape analysis. Assembly functions will require an annotation similar to //go:noescape. (Naturally, a slice that escapes must be treated as modified.)
For each call of the form F([]byte(stringVal))
, where we know that F
does not modify the slice, we can pass the string value directly. This would do essentially the same thing as the existing optimization for map lookups in which a []byte
is converted to a string
. This fixes direct calls, but of course the interesting cases all involve calls to methods of values of type io.Writer
.
For any type with a Write
method that does modify the slice, generate at compile time an additional Write·2
method that makes a copy of the slice and then calls the Write
method. The method is named Write·2
to ensure that it does not conflict with any user written method.
When converting any type to io.Writer
(or any interface type that inherits from io.Writer
), check for the presence of a Write·2
method. If that method exists, add it to the interface as an additional entry in the itab.
For any call to an interface method Write([]byte(stringVal))
, modify the call to call a special function doWrite
in the io package, without copying the string. doWrite
will check for a Write·2
method, and call it if it exists; that will cause the string to be copied as happens today. If the Write·2
method does not exist, which will be the normal case, doWrite
will call the Write
method as usual, knowing that the Write
method does not modify the slice.
Generalizing this to other methods is left as an exercise for the reader.
Activity
bradfitz commentedon Jan 27, 2017
I feel like there was a dup bug with the same plan but rejected because as I recall: nobody should be working with such large strings and stay in bytes if they want performance.
Was this motivated by something concrete?
ianlancetaylor commentedon Jan 27, 2017
There was no specific motivation; it was just an idea I had.
philhofer commentedon Jan 27, 2017
Your compiler annotation for
Write
sounds a lot like the compiler secretly interpretingWrite([]byte)
to beWrite(immutable []byte)
; the same could be done for other pointer arguments that are provably unmodified.Such an (internal) annotation would have additional benefits:
__attribute__((const))
and__attribute__((pure))
.dsnet commentedon Jan 27, 2017
This would remove the need for my attempted optimization in #13848, which tried to optimize string writing at the standard library level.
dsnet commentedon Jan 27, 2017
How does this work for nested calls of
io.Writer
?You do not know if
T.Write
mutatesp
or not because it depends on whethert.wr.Write
mutatesp
. To be conservative, the compiler will need to assume that it is mutated and then it will emitT.Write·2
, removing any benefit from the proposed optimization.ianlancetaylor commentedon Jan 27, 2017
@dsnet For a case like that you turn
T.Write·2
into a copy of the entire method, that passesp
along uninterpreted for possibly copying byt.wr.Write·2
.dsnet commentedon Jan 27, 2017
@ianlancetaylor, Doesn't that approach fail for this example (that nobody should ever write):
So
T.Write·2
is a copy ofT.Write
but the call tot.wr.Write
is replaced bydoWrite(t.wr, p)
. Ift.wr.Write
does mutatep
, then we are guaranteed that the "hello" string is not accidentally mutated (correct). However, thefmt.Println(p[0])
will not be able to observe the mutation thatt.wr.Write
caused because of the copy (wrong).I guess that this is solvable if
doWrite
was changed such that the new copy was known to the caller. Something likefunc doWrite(w io.Writer, p *[]byte)
ianlancetaylor commentedon Jan 27, 2017
You're right: in general, if the method does something with the buffer after passing it to
Write
, it may not be possible to apply this optimization to callers of that method.minux commentedon Jan 28, 2017
ALTree commentedon Feb 1, 2017
Related to #2205 (cmd/compile: read-only escape analysis and avoiding string <-> []byte copies).
zigo101 commentedon Jan 16, 2020
I create a thread on #golang-dev to discuss some possible optimizations for this.