Skip to content

cmd/compile: recent allocs regression #16897

Closed
@josharian

Description

@josharian

Going from commit c85b77c (just before the introduction of a printer type to cmd/compile/internal/gc/fmt.go) to 0d23c28 (current tip), I see:

name       old time/op      new time/op      delta
Template        297ms ± 6%       296ms ± 5%    ~           (p=1.000 n=14+13)
Unicode         150ms ± 4%       155ms ±10%    ~           (p=0.068 n=13+14)
GoTypes         965ms ± 4%       986ms ± 1%  +2.16%        (p=0.017 n=15+13)

name       old alloc/op     new alloc/op     delta
Template       47.7MB ± 0%      47.2MB ± 0%  -1.11%        (p=0.000 n=15+15)
Unicode        37.9MB ± 0%      37.9MB ± 0%  +0.08%        (p=0.000 n=15+13)
GoTypes         143MB ± 0%       145MB ± 0%  +1.39%        (p=0.000 n=15+15)

name       old allocs/op    new allocs/op    delta
Template         450k ± 0%        463k ± 0%  +3.00%        (p=0.000 n=15+15)
Unicode          368k ± 0%        374k ± 0%  +1.63%        (p=0.000 n=15+15)
GoTypes         1.34M ± 0%       1.45M ± 0%  +8.58%        (p=0.000 n=15+15)

I haven't dug into why, but I suspect that it is because formatting routines that used to simply return a constant string now append that constant string to a byte slice and then convert that byte slice back into a string. If this guess is correct, it could probably be reversed by adding a string field to printer and doing some careful special-casing code in the printer methods.

cc @griesemer @mdempsky

Activity

added this to the Go1.8 milestone on Aug 27, 2016
griesemer

griesemer commented on Aug 27, 2016

@griesemer
Contributor

@josharian You're probably right but I suggest not jumping to fixing this right now. The fmt.go cleanup is not finished yet, it's just a start. Let's complete the transition and cleanups and then revisit.

josharian

josharian commented on Aug 27, 2016

@josharian
ContributorAuthor

Sounds good. Just wanted to flag it, since this is a known performance hot spot.

griesemer

griesemer commented on Aug 30, 2016

@griesemer
Contributor

@josharian How did you obtain the results above? I like to re-measure.

josharian

josharian commented on Aug 30, 2016

@josharian
ContributorAuthor
$ # Make sure you're using the tip compiler for this go get! It memorizes GOROOT.
$ go get -u rsc.io/compilebench
$ git checkout c85b77c
$ ./make.bash
$ compilebench -short -count 30 -alloc | tee before
$ git checkout master
$ ./make.bash
$ compilebench -short -count 30 -alloc | tee after
$ benchstat before after

I now get from that (with master at d6098e4):

name       old time/op      new time/op      delta
Template        312ms ±16%       291ms ± 9%   -6.57%        (p=0.000 n=30+28)
Unicode         163ms ±20%       157ms ± 9%     ~           (p=0.088 n=30+30)
GoTypes         1.05s ±21%       0.97s ± 5%   -7.24%        (p=0.000 n=30+26)
Compiler        5.67s ±10%       5.21s ± 7%   -8.12%        (p=0.000 n=30+26)

name       old user-ns/op   new user-ns/op   delta
Template   415user-ms ±13%  398user-ms ± 6%   -4.02%        (p=0.002 n=30+29)
Unicode    232user-ms ± 9%  228user-ms ± 3%     ~           (p=0.123 n=29+25)
GoTypes    1.37user-s ±16%  1.37user-s ± 3%   -0.04%        (p=0.030 n=30+27)
Compiler   7.55user-s ± 8%  7.22user-s ± 9%   -4.44%        (p=0.000 n=30+30)

name       old alloc/op     new alloc/op     delta
Template       46.8MB ± 0%      47.9MB ± 0%   +2.39%        (p=0.000 n=30+28)
Unicode        37.8MB ± 0%      38.2MB ± 0%   +1.00%        (p=0.000 n=29+30)
GoTypes         143MB ± 0%       147MB ± 0%   +3.13%        (p=0.000 n=30+30)
Compiler        681MB ± 0%       710MB ± 0%   +4.33%        (p=0.000 n=29+30)

name       old allocs/op    new allocs/op    delta
Template         444k ± 0%        500k ± 0%  +12.68%        (p=0.000 n=30+30)
Unicode          369k ± 0%        385k ± 0%   +4.55%        (p=0.000 n=30+29)
GoTypes         1.35M ± 0%       1.60M ± 0%  +18.43%        (p=0.000 n=30+30)
Compiler        5.63M ± 0%       6.49M ± 0%  +15.23%        (p=0.000 n=30+30)

The alloc numbers are the ones that I think are probably due to the gc/fmt changes. The CPU numbers may be because I'm running a browser, etc., but I hope they're real! :)

griesemer

griesemer commented on Aug 30, 2016

@griesemer
Contributor

@josharian Thanks. My pending change ( https://go-review.googlesource.com/28072 ) reduces the allocations again somewhat. Following the same script, I now get:

name       old time/op     new time/op     delta
Template       307ms ± 4%      315ms ± 4%   +2.55%        (p=0.000 n=29+29)
Unicode        164ms ± 4%      165ms ± 4%     ~           (p=0.057 n=30+30)
GoTypes        1.01s ± 3%      1.03s ± 3%   +1.72%        (p=0.000 n=30+30)
Compiler       5.49s ± 1%      5.62s ± 2%   +2.31%        (p=0.000 n=30+28)

name       old user-ns/op  new user-ns/op  delta
Template        397M ± 3%       406M ± 6%   +2.21%        (p=0.000 n=28+30)
Unicode         225M ± 4%       226M ± 3%     ~           (p=0.230 n=29+30)
GoTypes        1.31G ± 3%      1.34G ± 5%   +2.79%        (p=0.000 n=30+30)
Compiler       7.39G ± 2%      7.50G ± 2%   +1.43%        (p=0.000 n=30+29)

name       old alloc/op    new alloc/op    delta
Template      46.8MB ± 0%     47.5MB ± 0%   +1.48%        (p=0.000 n=29+28)
Unicode       37.8MB ± 0%     38.1MB ± 0%   +0.64%        (p=0.000 n=30+28)
GoTypes        143MB ± 0%      145MB ± 0%   +1.72%        (p=0.000 n=30+30)
Compiler       683MB ± 0%      706MB ± 0%   +3.31%        (p=0.000 n=30+29)

name       old allocs/op   new allocs/op   delta
Template        444k ± 0%       481k ± 0%   +8.38%        (p=0.000 n=30+30)
Unicode         369k ± 0%       379k ± 0%   +2.74%        (p=0.000 n=30+30)
GoTypes        1.35M ± 0%      1.50M ± 0%  +10.78%        (p=0.000 n=30+30)
Compiler       5.66M ± 0%      6.25M ± 0%  +10.31%        (p=0.000 n=29+29)
davecheney

davecheney commented on Aug 30, 2016

@davecheney
Contributor

Just as a data point. I've been keeping my benchmarks up to date and as of
this morning Go tip is slightly slower than 1.7, an average of no more than
2%.

On Wed, 31 Aug 2016, 04:05 Robert Griesemer notifications@github.com
wrote:

@josharian https://github.com/josharian Thanks. My pending change (
https://go-review.googlesource.com/28072 ) reduces the allocations again
somewhat. Following the same script, I now get:

name old time/op new time/op delta
Template 307ms ± 4% 315ms ± 4% +2.55% (p=0.000 n=29+29)
Unicode 164ms ± 4% 165ms ± 4% ~ (p=0.057 n=30+30)
GoTypes 1.01s ± 3% 1.03s ± 3% +1.72% (p=0.000 n=30+30)
Compiler 5.49s ± 1% 5.62s ± 2% +2.31% (p=0.000 n=30+28)

name old user-ns/op new user-ns/op delta
Template 397M ± 3% 406M ± 6% +2.21% (p=0.000 n=28+30)
Unicode 225M ± 4% 226M ± 3% ~ (p=0.230 n=29+30)
GoTypes 1.31G ± 3% 1.34G ± 5% +2.79% (p=0.000 n=30+30)
Compiler 7.39G ± 2% 7.50G ± 2% +1.43% (p=0.000 n=30+29)

name old alloc/op new alloc/op delta
Template 46.8MB ± 0% 47.5MB ± 0% +1.48% (p=0.000 n=29+28)
Unicode 37.8MB ± 0% 38.1MB ± 0% +0.64% (p=0.000 n=30+28)
GoTypes 143MB ± 0% 145MB ± 0% +1.72% (p=0.000 n=30+30)
Compiler 683MB ± 0% 706MB ± 0% +3.31% (p=0.000 n=30+29)

name old allocs/op new allocs/op delta
Template 444k ± 0% 481k ± 0% +8.38% (p=0.000 n=30+30)
Unicode 369k ± 0% 379k ± 0% +2.74% (p=0.000 n=30+30)
GoTypes 1.35M ± 0% 1.50M ± 0% +10.78% (p=0.000 n=30+30)
Compiler 5.66M ± 0% 6.25M ± 0% +10.31% (p=0.000 n=29+29)


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#16897 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAcA4wBVxDV-IYGlha2D4Q0y53FbR7Oks5qlHAdgaJpZM4JunmF
.

griesemer

griesemer commented on Aug 30, 2016

@griesemer
Contributor

@davecheney I wouldn't be surprised if some of it is due to these changes. I'm playing a bit more with it but if I cannot make it net-negative, I'll undo all the fmt changes that didn't explicitly remove code.

davecheney

davecheney commented on Aug 30, 2016

@davecheney
Contributor

@robert I don't think the formatting changes had any effect for or against,
since I started looking a week or so ago go tip has been ~2% slower in my
compile benchmarks than 1.7.

On Wed, Aug 31, 2016 at 9:12 AM, Robert Griesemer notifications@github.com
wrote:

@davecheney https://github.com/davecheney I wouldn't be surprised if
some of it is due to these changes. I'm playing a bit more with it but if I
cannot make it net-negative, I'll undo all the fmt changes that didn't
explicitly remove code.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#16897 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAcA4oqCSfQt8FStKvz0qKTh_VhCF5Sks5qlLjwgaJpZM4JunmF
.

josharian

josharian commented on Aug 30, 2016

@josharian
ContributorAuthor

If you give up on it, Robert, please let me know before reverting everything.

Dave, if you could identify when the 2% regression started, that'd be useful. But let's leave this bug specifically for the gc/fmt changes.

davecheney

davecheney commented on Aug 30, 2016

@davecheney
Contributor

Dave, if you could identify when the 2% regression started, that'd be
useful. But let's leave this bug specifically for the gc/fmt changes.

I'm going to assume it was when the dev.ssa branch was merged in. To be
clear, we're talking about this much

go version go1.7 linux/amd64

real 1m2.428s
user 3m5.858s
sys 0m11.943s

go version devel +b2e0e96 Tue Aug 30 23:10:43 2016 +0000 linux/amd64

real 1m4.310s
user 3m13.133s
sys 0m12.396s

I don't think it's worth alerting the national guard at this early stage of
the game.

On Wed, Aug 31, 2016 at 9:23 AM, Josh Bleecher Snyder <
notifications@github.com> wrote:

If you give up on it, Robert, please let me know before reverting
everything.

Dave, if you could identify when the 2% regression started, that'd be
useful. But let's leave this bug specifically for the gc/fmt changes.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#16897 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAcA9EALy8TmF__dzd2vwnwxlLpllbWks5qlLtYgaJpZM4JunmF
.

griesemer

griesemer commented on Aug 30, 2016

@griesemer
Contributor

@josharian I'm trying to use fmt.Formatter instead of the local printer - that would be nicer and more flexible. So far it looks good.

griesemer

griesemer commented on Aug 31, 2016

@griesemer
Contributor

@josharian I have now implemented fmt.Formatters for all types handled by fmt.go and also converted all source code to use formatted printing. For instance, an sconv(s, FmtSign) for the format %v becomes just s with the format %+v (I'm going to make these formats %+S for better readability).

Running the above benchmark above against commit c85b77c and current tip (commit cd0ba4c, excl. my new changes) results in:

name       old time/op     new time/op     delta
Template       307ms ± 4%      301ms ± 5%   -1.84%        (p=0.003 n=29+29)
Unicode        164ms ± 4%      160ms ± 4%   -2.18%        (p=0.000 n=30+29)
GoTypes        1.01s ± 3%      1.01s ± 3%     ~           (p=0.398 n=30+30)
Compiler       5.49s ± 1%      5.47s ± 2%     ~           (p=0.181 n=30+29)

name       old alloc/op    new alloc/op    delta
Template      46.8MB ± 0%     47.2MB ± 0%   +0.98%        (p=0.000 n=29+30)
Unicode       37.8MB ± 0%     38.0MB ± 0%   +0.55%        (p=0.000 n=30+29)
GoTypes        143MB ± 0%      145MB ± 0%   +1.59%        (p=0.000 n=30+30)
Compiler       683MB ± 0%      700MB ± 0%   +2.42%        (p=0.000 n=30+30)

name       old allocs/op   new allocs/op   delta
Template        444k ± 0%       479k ± 0%   +7.86%        (p=0.000 n=30+30)
Unicode         369k ± 0%       378k ± 0%   +2.69%        (p=0.000 n=30+30)
GoTypes        1.35M ± 0%      1.50M ± 0%  +10.68%        (p=0.000 n=30+30)
Compiler       5.66M ± 0%      6.21M ± 0%   +9.72%        (p=0.000 n=29+30)

This is about what we had yesterday.

Running the above benchmark above against commit c85b77c and and my local tip (incl. my new changes) results in:

name       old time/op     new time/op     delta
Template       307ms ± 4%      304ms ± 3%    ~           (p=0.053 n=29+29)
Unicode        164ms ± 4%      160ms ± 8%  -2.34%        (p=0.000 n=30+30)
GoTypes        1.01s ± 3%      1.05s ± 3%  +3.60%        (p=0.000 n=30+29)
Compiler       5.49s ± 1%      5.52s ± 1%  +0.42%        (p=0.009 n=30+29)

name       old alloc/op    new alloc/op    delta
Template      46.8MB ± 0%     46.8MB ± 0%  +0.14%        (p=0.000 n=29+30)
Unicode       37.8MB ± 0%     37.9MB ± 0%  +0.24%        (p=0.000 n=30+30)
GoTypes        143MB ± 0%      144MB ± 0%  +0.50%        (p=0.000 n=30+30)
Compiler       683MB ± 0%      688MB ± 0%  +0.69%        (p=0.000 n=30+30)

name       old allocs/op   new allocs/op   delta
Template        444k ± 0%       471k ± 0%  +6.05%        (p=0.000 n=30+30)
Unicode         369k ± 0%       376k ± 0%  +2.00%        (p=0.000 n=30+30)
GoTypes        1.35M ± 0%      1.47M ± 0%  +8.53%        (p=0.000 n=30+30)
Compiler       5.66M ± 0%      6.07M ± 0%  +7.22%        (p=0.000 n=29+29)

This shows an improvement - albeit not as much as I hoped. But the changes should have removed many function calls to the various Xconv routines all over the place. There should also much fewer strings be allocated and concatenated.

For final comparison: Current master tip as above against my local tip:

name       old time/op     new time/op     delta
Template       301ms ± 5%      304ms ± 3%    ~           (p=0.110 n=29+29)
Unicode        160ms ± 4%      160ms ± 8%    ~           (p=0.769 n=29+30)
GoTypes        1.01s ± 3%      1.05s ± 3%  +3.82%        (p=0.000 n=30+29)
Compiler       5.47s ± 2%      5.52s ± 1%  +0.82%        (p=0.000 n=29+29)

name       old alloc/op    new alloc/op    delta
Template      47.2MB ± 0%     46.8MB ± 0%  -0.83%        (p=0.000 n=30+30)
Unicode       38.0MB ± 0%     37.9MB ± 0%  -0.31%        (p=0.000 n=29+30)
GoTypes        145MB ± 0%      144MB ± 0%  -1.07%        (p=0.000 n=30+30)
Compiler       700MB ± 0%      688MB ± 0%  -1.69%        (p=0.000 n=30+30)

name       old allocs/op   new allocs/op   delta
Template        479k ± 0%       471k ± 0%  -1.67%        (p=0.000 n=30+30)
Unicode         378k ± 0%       376k ± 0%  -0.68%        (p=0.000 n=30+30)
GoTypes        1.50M ± 0%      1.47M ± 0%  -1.94%        (p=0.000 n=30+30)
Compiler       6.21M ± 0%      6.07M ± 0%  -2.28%        (p=0.000 n=30+29)

I'm going to send out a series of CLs (about one dozen in total) with the respective changes.

gopherbot

gopherbot commented on Sep 12, 2016

@gopherbot
Contributor

CL https://golang.org/cl/29087 mentions this issue.

griesemer

griesemer commented on Sep 13, 2016

@griesemer
Contributor

For the record: With (pending) https://go-review.googlesource.com/29089, we are now at or below the original levels:

name       old time/op     new time/op     delta
Template       297ms ± 5%      284ms ± 3%  -4.53%  (p=0.000 n=27+29)
Unicode        159ms ± 5%      151ms ± 5%  -4.91%  (p=0.000 n=28+30)
GoTypes        985ms ± 5%      935ms ± 2%  -5.13%  (p=0.000 n=28+29)

name       old alloc/op    new alloc/op    delta
Template      46.8MB ± 0%     45.7MB ± 0%  -2.37%  (p=0.000 n=30+30)
Unicode       37.8MB ± 0%     37.9MB ± 0%  +0.29%  (p=0.000 n=29+30)
GoTypes        143MB ± 0%      138MB ± 0%  -3.64%  (p=0.000 n=29+30)

name       old allocs/op   new allocs/op   delta
Template        444k ± 0%       440k ± 0%  -0.94%  (p=0.000 n=30+30)
Unicode         369k ± 0%       369k ± 0%  +0.19%  (p=0.000 n=29+30)
GoTypes        1.35M ± 0%      1.34M ± 0%  -1.24%  (p=0.000 n=30+30)

(Most of the benefits came from https://go-review.googlesource.com/29087.)

Considering this particular issue fully resolved at this point.

josharian

josharian commented on Sep 13, 2016

@josharian
ContributorAuthor

Thanks, Robert.

gopherbot

gopherbot commented on Sep 13, 2016

@gopherbot
Contributor

CL https://golang.org/cl/29089 mentions this issue.

added a commit that references this issue on Sep 13, 2016
locked and limited conversation to collaborators on Sep 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @davecheney@josharian@griesemer@gopherbot

        Issue actions

          cmd/compile: recent allocs regression · Issue #16897 · golang/go