Future of wasm2js #1929

Open
kripken opened this issue Feb 28, 2019 · 33 comments

@kripken
Member

kripken commented Feb 28, 2019

I'd like to do some work on wasm2js. Specifically, the use case I have in mind is to integrate it with Emscripten + the LLVM wasm backend. Then we can use that, instead of the current fastcomp asm.js backend, as a solution for emitting non-wasm output. This would have several advantages:

  • Better code: Can benefit from LLVM backend opts, newer LLVM IR opts (since upstream is more up to date), Binaryen opts, and Emscripten wasm-specific opts (metadce). Will still be able to benefit from Emscripten asm.js opts, since those are in passes that run after the backend.
  • Faster build times: can optionally use wasm object files for fast linking.
  • No more separate bitcode or object files for two different targets.
  • Get rid of fastcomp and all the support code for it.

This doesn't need to emit valid asm.js since in practice almost all browsers with asm.js AOT also have wasm anyhow (in fact chrome shipped wasm before asm.js AOT; and firefox did have some releases with just asm.js, but even LTS has had wasm for a while now). So wasm2js is an option here.

Specific work I'd like to do:

  • Integrate wasm2js with emscripten.
  • Benchmarking and performance tuning. Should be no slower than current fastcomp asm.js.
  • Testing (emscripten test suite, almost all features should be supported) and fuzzing.

This may involve changes to the JS emitted by wasm2js, so I wanted to ask how much current wasm2js users care about the form of the output? I know the Rust people have been using wasm2js, but I heard recently they have plans to write something new (which made me sad to hear, but on the other hand fewer users may mean more flexibility in terms of how we evolve wasm2js). cc @fitzgen

@dschuff
Member

dschuff commented Feb 28, 2019

+cc @juj

@kripken
Member Author

kripken commented Mar 1, 2019

Some previous relevant discussion: emscripten-core/emscripten#8085

@kripken
Member Author

kripken commented Mar 7, 2019

ccing wasm2js/wasm2asm authors for more visibility: @yurydelendik @alexcrichton @dcodeIO @froydnj @tlively

@dcodeIO
Contributor

dcodeIO commented Mar 7, 2019

Does this imply that a compiler that doesn't use LLVM/Emscripten won't be able to output a JS version (we don't care about valid asm.js, just something JS) with just Binaryen / BinaryenModulePrintAsmjs anymore? That'd be sad.

@tlively
Member

tlively commented Mar 7, 2019

I strongly support this direction for wasm2js. Replacing Fastcomp was exactly my goal when I worked on this as an intern in summer 2017, with the motivation of simplifying Rust's dependency on Emscripten.

@tlively
Member

tlively commented Mar 7, 2019

@dcodeIO, if I understand correctly, that will still work, but the JS you get may be different than what it is today. In fact the JS will be better because wasm2js will be feature-complete enough to support all emscripten tests.

@kripken
Member Author

kripken commented Mar 7, 2019

Yeah, we definitely don't want to remove use cases people care about - @dcodeIO, thanks for mentioning that you use this code path.

How much do you care about the external API of the JS code emitted? I might want to change it in minor ways. (Aside from that, my plan is to just improve the quality of the JS emitted.)

@dcodeIO
Contributor

dcodeIO commented Mar 7, 2019

Not really bound to the external API, as long as it can either be run directly or easily postprocessed. A WebAssembly-ish API, similar to how Wasm works in browsers, would be great though :)

@kripken
Member Author

kripken commented Mar 8, 2019

Yeah, a wasm-ish API - almost like a polyfill - is what I was thinking too, heh.

@alexcrichton
Contributor

This all sounds like a great idea to me! FWIW the idea that we might implement wasm2js in Rust was primarily motivated by the fact that its maintainership here seemed to be waning. If it picks up though we're happy to help.

I'd personally agree that asm.js isn't too important at this point for the reasons mentioned, and the only desire we'd have is that wasm2js emits an ES module (as it does today) for inclusion into apps. Eventually we'd like to include this at least as a default option (if not on by default) in pipelines like Webpack.

@kripken
Member Author

kripken commented Mar 8, 2019

Thanks @alexcrichton, good to know! Yeah, I intend to make this a major focus for myself personally, and it will definitely be a high priority once emscripten depends on it as a fastcomp replacement.

I'll have to investigate the JS output format issue - for emscripten and AssemblyScript it seems like a "wasm polyfill" approach is better, and for you an ES6 module is. Probably we can implement one in terms of the other or something like that.

@bvibber
Contributor

bvibber commented Mar 8, 2019

I have a couple concerns about the "wasm polyfill" approach:

  • Will this work with multiple modules? In ogv.js I load separate modules for each file type, and so may have multiple instances of different demuxers running in the same JS context. They need to not stomp on each other if they each load a polyfill for the WebAssembly namespace.
  • If the polyfill replaces the global WebAssembly object, other code that's loaded into the web app later may think WebAssembly is available and try to compile and run its own modules, which would presumably fail.

These can probably be resolved by allowing the polyfill to use a custom namespace, and maybe also emitting it as a separate file which can be loaded once.
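
A rough sketch of the custom-namespace idea, for illustration only. All names here (`createWasmLikeNamespace`, `register`, `instantiate`, `fakeDemuxerFactory`, `OgvWasm`) are hypothetical and are not part of wasm2js or its output; the factory is a stand-in for whatever JS the tool emits for one module. The point is just that each instance gets its own state and the global `WebAssembly` object is never touched.

```javascript
// Hypothetical sketch: modules are exposed through a caller-owned namespace
// object instead of being installed on globalThis.WebAssembly.
function createWasmLikeNamespace() {
  const registry = new Map(); // module name -> factory function

  return {
    // Register the JS generated for one module (hypothetical helper).
    register(name, factory) {
      registry.set(name, factory);
    },
    // Loosely mirrors `new WebAssembly.Instance(module, imports)`:
    // returns an object with an `exports` property.
    instantiate(name, imports = {}) {
      const factory = registry.get(name);
      if (!factory) {
        throw new Error('unknown module: ' + name);
      }
      // Every call runs the factory again, so instances do not share state.
      return { exports: factory(imports) };
    },
  };
}

// Stand-in for the generated code of a demuxer; real output looks different.
function fakeDemuxerFactory(imports) {
  let packets = 0;
  return {
    push() { packets += 1; return packets; },
    count() { return packets; },
  };
}

const OgvWasm = createWasmLikeNamespace();
OgvWasm.register('demuxer', fakeDemuxerFactory);
```

With this shape, two demuxer instances created via `OgvWasm.instantiate('demuxer')` keep separate counters, and code loaded later that feature-tests `WebAssembly` sees whatever the browser really provides.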

Note also for my case, JS output mostly targets IE 11 and old versions of Safari and Edge, so I need to avoid ES6 modules.

@alexcrichton
Contributor

@kripken that sounds great! And yeah definitely agreed that the output format isn't too too important in that we can translate one way or the other as necessary. I think the polyfill approach is probably more flexible because it's how wasm is always used at the fundamental level today!

@kripken
Member Author

kripken commented Mar 11, 2019

Thanks @Brion, very good points. Yeah, the "polyfill" approach does want to affect the global scope. So it seems like building the polyfill as an extra optional layer on top of the other approach is the better way to go.

@kripken
Member Author

kripken commented Apr 9, 2019

I'm starting on this work now. Anyone interested to review the patches?

@kripken
Member Author

kripken commented Apr 9, 2019

Looks like the current output is close to an ES6 module (import, export, etc.). For a JS fallback though, we can't assume the VM is new enough to have ES6 module support (and if it does, it likely has wasm anyway). Can I de-ES6 that, or am I missing something?

@dcodeIO
Contributor

dcodeIO commented Apr 9, 2019

According to these docs ES6 modules are still experimental in node, so I guess de-ES6-ing is reasonable.

@kripken
Member Author

kripken commented Apr 10, 2019

@alexcrichton it looks like you added the ES6 module output stuff for wasm2js - do you remember why? I'd like to replace it for the reasons 2 comments back.

@kripken
Member Author

kripken commented Apr 10, 2019

Are people using the C API call BinaryenModulePrintAsmjs? (@dcodeIO?)

@dcodeIO
Contributor

dcodeIO commented Apr 10, 2019

Yes, that's behind --asmjsFile, -a currently. I don't know of anyone relying on it, though, except some of our own tests / the n-body benchmark for comparison.

@kripken
Member Author

kripken commented Apr 11, 2019

Thanks @dcodeIO - I'll keep it working then, shouldn't be a problem.

@dcodeIO
Contributor

dcodeIO commented Apr 11, 2019

I'm totally fine with renaming it or otherwise changing the API ofc, as long as it's still there :)

@alexcrichton
Contributor

IIRC the ES6 output was added to align with the esm-integration proposal which defines how to view a wasm module as an ES module. I also figured it's really the only common unit of compatibility in the JS ecosystem, where if ES6 isn't used it's some invented module format which ES6 can compile down to.

Basically it was an attempt at being forward compatible with tooling, while also acknowledging that the output only really works in bundlers today and would require some form of external tooling to process it to be compatible with Node.js

@kripken
Member Author

kripken commented Apr 11, 2019

I see, thanks @alexcrichton. Ok, if this is needed for bundlers then I guess we should keep it around. I'll add an option to emit another variant of the glue (will be easier to do that after my current refactoring).

kripken added a commit that referenced this issue Apr 11, 2019
Early work for #1929

* Leave core wasm module - the "asm.js function" - to Wasm2JSBuilder, and add Wasm2JSGlue which emits the code before and after that. Currently that's some ES6 code, but we may want to change that later.
* Add an AssertionEmitter class for the sole purpose of emitting modules + assertions for testing. This avoids some hacks from before like starting from index 1 (assuming the module at first position was already parsed and printed) and printing of the f32Equal etc. functions not at the very top (which was due to technical limitations before).

Logic-wise, there should be no visible change, except some whitespace and reordering, and that I made the exceptions print out the source of the assertion that failed from the wast:

-if (!check2()) fail2();
+if (!check2()) throw 'assertion failed: ( assert_return ( call add ( i32.const 1 ) ( i32.const 1 ) ) ( i32.const 2 ) )';

(fail2 etc. did not exist, and seem to just have given a unique number for each assertion?)
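
To make the new failure mode concrete, here is a small sketch of the pattern in the diff above. The `assertReturn` helper and the `add` stand-in are hypothetical and are not taken from the commit; only the shape of the thrown string ('assertion failed: ' followed by the wast source) comes from the diff.

```javascript
// Hypothetical helper: run a check and, if it fails, throw a string that
// carries the wast source of the assertion, as in the diff above.
function assertReturn(check, wastSource) {
  if (!check()) throw 'assertion failed: ' + wastSource;
}

// Stand-in for an exported i32 add function.
function add(a, b) {
  return (a + b) | 0;
}

// Passes silently; a failing check would throw the message with the source.
assertReturn(
  () => add(1, 1) === 2,
  '( assert_return ( call add ( i32.const 1 ) ( i32.const 1 ) ) ( i32.const 2 ) )'
);
```

Compared with a numbered `fail2()`, the thrown text points straight back at the assertion in the wast file.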
@kripken
Member Author

kripken commented Apr 23, 2019

Ok, I'm practically done with correctness here - wasm2js passes the emscripten test suite at all opt levels, and the fuzzer didn't find anything overnight. Looking at optimizations now.

@kripken
Member Author

kripken commented Jun 6, 2019

Ok, I'm basically done with wasm2js. It passes almost all tests (see exceptions below), and looks good on code size and perf - it's actually nicely smaller than emscripten's asm.js output in many cases!

Unhandled issues, that may be done as followups if there is need:

  • Massive switches lead to massively-nested blocks, which JS engines end up hitting parsing limits on. It's quite hard to optimize this, I did some work to pattern-match switches, but there are many variations. I'm not sure how common this is in real-world code, since the wasm backend does break up huge switches if it thinks it should (I only see breakage in the artificial test_bigswitch/test_biggerswitch tests).
  • Function name mangling: the wasm backend will lower the _ZN* mangling into human-readable names, but those then get mangled into JS, which makes them unreadable again (but in a different way). Options here might be to add an option to stop the backend from doing this, or to actually implement a parser from the human-readable form here in binaryen into the _ZN* form (which sounds... bad).
  • No source map support. For debugging, wasm should be mostly ok, as this is just on dev machines.
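
For readers unfamiliar with the switch issue in the first bullet, a hand-written illustration follows. It is not actual wasm2js output, and the function and label names are made up; it only shows why a branch table lowered block by block nests one level deeper per target, while the pattern-matched form stays flat.

```javascript
// Block-by-block lowering: one labeled block per branch target, each nested
// inside the next. With thousands of cases the nesting gets that deep too.
function nestedBlocks(x) {
  let result;
  done: {
    fallback: {
      target1: {
        target0: {
          switch (x) {
            case 0: break target0;
            case 1: break target1;
            default: break fallback;
          }
        }
        // Reached after `break target0`.
        result = 'zero';
        break done;
      }
      // Reached after `break target1`.
      result = 'one';
      break done;
    }
    // Reached after `break fallback`.
    result = 'other';
  }
  return result;
}

// What a pattern-matched switch could look like: same behavior, no nesting.
function flatSwitch(x) {
  switch (x) {
    case 0: return 'zero';
    case 1: return 'one';
    default: return 'other';
  }
}
```

Both functions return the same value for every input; only the nesting depth differs, and that depth is what runs into parser limits.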

@CryZe

CryZe commented Jun 7, 2019

It looks like wasm2js now generates 114 MiB instead of 155 MiB of JS for my 2 MiB wasm file. If I run uglifyjs on it, it gets minified down to 12 MiB. That's still kind of unfortunate compared to the 4 MiB JS file that got emitted by emscripten. It's quite an improvement over earlier versions of wasm2js / wasm2asm though. However I was also not able to run uglifyjs with either --compress or --mangle as it completely crashes then on a stack overflow for the file.

I think most of the remaining unoptimized JS code is the fact that pretty long variable names are used. Maybe there is an option in wasm2js that I missed?

@kripken
Member Author

kripken commented Jun 7, 2019

@CryZe wasm2js does accept optimization flags like other tools, so it's important to run it with something like -O3 or -Os, that can make a huge difference.

Aside from that, it's still good to run a normal JS minifier on it; emscripten uses its own, and optionally closure: https://github.com/emscripten-core/emscripten/blob/incoming/tools/shared.py#L2644 (Both of those minifiers can scale up to massively large amounts of JS.)

Are you just running wasm2js directly yourself, and not using it from emscripten? Maybe we should improve the docs for that?

kripken added a commit to emscripten-core/emscripten that referenced this issue Jun 10, 2019
* disable the big switch and debug info tests there, see WebAssembly/binaryen#1929 (comment)

* error on wasm2js + source maps. fixes #8743
@hummeleBop

I think asm.js is still relevant, there are several use cases.
The KaiOS platform is only able to run asm.js optimizations and cannot compile/run WebAssembly bytecode.
Also, it's a low-spec platform and asm.js would be useful to enhance the user experience.
Wasm2js should be able to produce asm.js compliant javascript from the MVP webassembly.

@kripken
Member Author

kripken commented Jul 11, 2019

@hummeleBop it would be good to know more about KaiOS's status and plans, specifically when they intend to upgrade their JS VM. If you know, or you know someone that does, that would be very useful!

@ibaryshnikov

ibaryshnikov commented Jul 16, 2019

@kripken some feedback about wasm2js in our app, tested in chrome

Speed
155ms handwritten js
120ms wasm2js latest (c7e9271, version_87)
120ms wasm2js latest, -O3
185ms wasm2js before (d8bcf64, v1.38.29)

Code size
346k latest
229k latest, -O3
517k v1.38.29

Details
webpack dev build (release build is somehow complicated, but I can send it later if there's a need)
rustc 1.36.0
wasm-bindgen 0.2.48
chrome 75.0.3770.100
wasm-pack 0.8.1

Additionally
wasm2js v1.38.29 prints error Unknown option '-O3' as well as Switching to "almost asm" mode, reason: grow_memory op
Pure wasm time is 55ms (note that js in dev mode and wasm in release mode)
If I use dynamic import to load the tested function, it seems slower (again, it may be related to a dev build, need to do a release one)

@kripken
Member Author

kripken commented Jul 16, 2019

Interesting, thanks @ibaryshnikov! Overall the results look good I think.

Is anything used to minify the JS after wasm2js? A standard minifier like terser can improve it a lot (wasm2js doesn't focus on simple stuff normal minifiers do anyhow).

I think older wasm2js didn't have optimization flags yet, which is why there is Unknown option '-O3' there.

@ibaryshnikov

@kripken without additional tools, just wasm2js. They'll go to webpack and get minified later

belraquib pushed a commit to belraquib/emscripten that referenced this issue Dec 23, 2020
* disable the big switch and debug info tests there, see WebAssembly/binaryen#1929 (comment)

* error on wasm2js + source maps. fixes emscripten-core#8743

9 participants