Group reference types in binary format. Fixes #4773 #4774

kripken · 2022-07-06T20:36:58Z

See #4773 for details.

Existing tests have enough coverage for this, see the very last changed
output.

aheejin · 2022-07-06T23:13:42Z

Is putting reference types before MVP types better than the other way around? It is good for binary size if we place frequently used locals in lower numbers, and I kind of assumed MVP locals are accessed more frequently than the reference type ones. Is my assumption incorrect?

Speaking of the frequency, I think I remember we used to have a routine to sort locals based on the frequency with which they appear within the function, but I can't seem to find that anymore. Did we remove it?

kripken · 2022-07-07T00:00:30Z

Is putting reference types before MVP types better than the other way around?

Hmm, good question. I'd guess that in a wasm GC program the reference types are more common than anything else. At least that's the case for Java. That's probably why it seemed natural to me to do so in this PR 😄 But in a C++ program that uses reference types a little (like in the future hypothetical C++/GC integration) probably it would be the opposite. Might also be the case in Rust's current reference types integration (I think they add a little reference types late in wasm-bindgen).

It might be best to just count and see which is more common, and put that one first. I pushed that now, how does that sound?

Speaking of the frequency, I think I remember we used to have a routine to sort locals based on the frequency with which they appear within the function, but I can't seem to find that anymore.

I think that's this:

https://github.com/WebAssembly/binaryen/blob/main/src/passes/ReorderLocals.cpp

It is still used. The binary format grouping overrides it, though, so the first sort only matters within each type.

aheejin · 2022-07-07T00:29:03Z

Is putting reference types before MVP types better than the other way around?

Hmm, good question. I'd guess that in a wasm GC program the reference types are more common than anything else. At least that's the case for Java. That's probably why it seemed natural to me to do so in this PR 😄 But in a C++ program that uses reference types a little (like in the future hypothetical C++/GC integration) probably it would be the opposite. Might also be the case in Rust's current reference types integration (I think they add a little reference types late in wasm-bindgen).

It might be best to just count and see which is more common, and put that one first. I pushed that now, how does that sound?

Not sure if the number of reference type locals is a good heuristic for if each of them is used frequently.. It can be like there are a large number of reference types and each of them is used a handful of times.

I'm actually quite surprised that reference types are used more in local.set/get in Java programs... Do you mean the average number of usage of a reftype local is greater than the average number of usage of a MVP type local? I believe it is certainly possible that the sum of the number of usage of all reftype locals can be greater than that of all MVP locals, but I would be quite surprised if a single reftype local is used more than a single MVP local in average, because the number of reftypes will be much greater than the number of MVP types.

It's not that I'm suggesting we scan the whole function, which is likely to be overkill.. I'm mostly curious, and I think it may not be a big deal either way.

Speaking of the frequency, I think I remember we used to have a routine to sort locals based on the frequency with which they appear within the function, but I can't seem to find that anymore.

I think that's this:

https://github.com/WebAssembly/binaryen/blob/main/src/passes/ReorderLocals.cpp

It is still used. The binary format grouping overrides it, though, so the first sort only matters within each type.

Ah right.. I think I confused that with something else.

tlively

Code LGTM, but I don't understand why this is likely to be beneficial since I haven't caught up on the relevant discussions yet.

tlively · 2022-07-07T15:53:07Z

test/typed-function-references.wast

@@ -41,4 +41,31 @@
      )
    )
  )
+  (func $more-ref-types


Can we port this test to lit while we're modifying it?

I think it's better to do such porting separately. Porting it in this PR would make the diff less obvious.

I could create a new lit test for the new functions here I guess, though when we port this we'd want to merge them... but I don't feel strongly on that?

+1 on porting separately. Actually it's not trivial to port all files in test/ because it tests not only wasm-opt but also roundtripping through binary and dwarf info. I think it'd be better to be done separately.

Porting in separate PRs sounds good, but it would still be nice to port tests as we touch them (where it's not too difficult to do so) so that we can keep making progress here.

It's not trivial to port these tests because these tests test not a single thing but many things, including roundtripping.

Also I think it's better for tests of a PR to have changes showing the what that PR does. If we port the whole test in the same PR, the whole test will be shown as the diff, and it is hard to figure out which part of the test has changed in the PR.

I do agree we should find a way to make progress on porting these tests. Perhaps it could be part of the sheriff rotation? If the sheriff isn't busy with breakages perhaps they could port some tests.

The various test configurations aren't a big issue IMO because you can add multiple RUN lines (e.g. one with --roundtrip and one without) and have the auto-update script take care of the rest. Personally I would be happier with gradual porting as we touch files than with having the sheriff work on porting because the gradual porting requires fewer context switches from other ongoing work.

I agree it's good to put more efforts in gradual porting. But even when doing gradual porting I personally prefer to make it a separate PR, unless that's not too difficult. I think that's easier for review too, because we can review only the test changes the functionality PR, and when we review the porting PR we are aware that the PR does not change any contents, so we can only check the RUN lines and a few other things.

Right, ideally I would like a separate preliminary or follow-on PR porting the test whenever one of these tests is modified.

kripken · 2022-07-07T17:38:03Z

@aheejin

Not sure if the number of reference type locals is a good heuristic for if each of them is used frequently.. It can be like there are a large number of reference types and each of them is used a handful of times.

Yeah, you're right... Ok, after thinking some more, given that the pass will have sorted locals by frequency of use, I think we can just look at the type of the first local. If it's a reference, we keep references first. That would definitely keep the most frequently used local at the very top, and I think would probably work well in most cases. I pushed that now, thoughts?

I'm actually quite surprised that reference types are used more in local.set/get in Java programs... Do you mean the average number of usage of a reftype local is greater than the average number of usage of a MVP type local? I believe it is certainly possible that the sum of the number of usage of all reftype locals can be greater than that of all MVP locals, but I would be quite surprised if a single reftype local is used more than a single MVP local in average, because the number of reftypes will be much greater than the number of MVP types.

I meant how many locals are declared. I'm less sure about how many times they are used.

In Java code at least most locals are references to Java objects it seems. Sometimes we load an MVP type from them, but that seems more rare (and often we load the value and use it without storing in a local).

kripken · 2022-07-07T17:56:46Z

Just to be safe I measured code size on GC code from Dart and Java (on code that doesn't use reference types this has no effect). This very very slightly decreases size on Dart and very very slightly increases it on Java - the effects are 0.01% (a hundredth of one percent) or less, which is basically noise. So it seems worth the potential benefit to baseline compilers as mentioned in the link at the top.

kripken added 2 commits July 6, 2022 13:16

Group reference types in binary format. Fixes #4773

0810be7

link

ed3b2ae

kripken requested review from tlively and aheejin July 6, 2022 20:36

kripken added 3 commits July 6, 2022 16:50

pick direction

e2e80b2

test

d6394b0

comment

ffc457a

tlively reviewed Jul 7, 2022

View reviewed changes

better

639f526

aheejin approved these changes Jul 7, 2022

View reviewed changes

kripken merged commit 9831b36 into main Jul 7, 2022

kripken deleted the sortref branch July 7, 2022 19:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Group reference types in binary format. Fixes #4773 #4774

Group reference types in binary format. Fixes #4773 #4774

Uh oh!

kripken commented Jul 6, 2022

Uh oh!

aheejin commented Jul 6, 2022 •

edited

Loading

Uh oh!

kripken commented Jul 7, 2022

Uh oh!

aheejin commented Jul 7, 2022 •

edited

Loading

Uh oh!

tlively left a comment

Uh oh!

tlively Jul 7, 2022

Uh oh!

kripken Jul 7, 2022

Uh oh!

aheejin Jul 7, 2022

Uh oh!

tlively Jul 7, 2022

Uh oh!

aheejin Jul 7, 2022 •

edited

Loading

Uh oh!

kripken Jul 7, 2022

Uh oh!

tlively Jul 7, 2022

Uh oh!

aheejin Jul 8, 2022 •

edited

Loading

Uh oh!

tlively Jul 8, 2022

Uh oh!

kripken commented Jul 7, 2022

Uh oh!

kripken commented Jul 7, 2022

Uh oh!

Uh oh!

Group reference types in binary format. Fixes #4773 #4774

Group reference types in binary format. Fixes #4773 #4774

Uh oh!

Conversation

kripken commented Jul 6, 2022

Uh oh!

aheejin commented Jul 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kripken commented Jul 7, 2022

Uh oh!

aheejin commented Jul 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tlively left a comment

Choose a reason for hiding this comment

Uh oh!

tlively Jul 7, 2022

Choose a reason for hiding this comment

Uh oh!

kripken Jul 7, 2022

Choose a reason for hiding this comment

Uh oh!

aheejin Jul 7, 2022

Choose a reason for hiding this comment

Uh oh!

tlively Jul 7, 2022

Choose a reason for hiding this comment

Uh oh!

aheejin Jul 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kripken Jul 7, 2022

Choose a reason for hiding this comment

Uh oh!

tlively Jul 7, 2022

Choose a reason for hiding this comment

Uh oh!

aheejin Jul 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tlively Jul 8, 2022

Choose a reason for hiding this comment

Uh oh!

kripken commented Jul 7, 2022

Uh oh!

kripken commented Jul 7, 2022

Uh oh!

Uh oh!

aheejin commented Jul 6, 2022 •

edited

Loading

aheejin commented Jul 7, 2022 •

edited

Loading

aheejin Jul 7, 2022 •

edited

Loading

aheejin Jul 8, 2022 •

edited

Loading