
Collaborator

> Introduce a new codegen mode, MINIMAL_JS perhaps, which we would design from scratch to be minimal

I think before we do this, there's a lot we can optimize with feature-specific flags and other methods. Not too long ago I wrote a tiny chess game in GLES2, at https://github.com/juj/tiny_chess, and looking at its build output, in one evening I was able to remove about two thirds of the runtime boilerplate as unnecessary (from 150KB uncompressed down to 50KB uncompressed). There is a lot of low-hanging fruit here, and I think it would be best to start splicing off these types of items on a feature-by-feature basis, so that it won't be a wholesale "old runtime" or "new runtime" question. Longjmp support is one such example. Another thing is that I'd like to start slimming down individual items in the Module object, which don't DCE too well.

Member, Author

Good points. Thinking on this some more, maybe we should clarify the target goal. Is it

1. Reduce the current JS code by a factor of 2, 3, 4..? That would probably still leave a multiple of 10K (uncompressed).
2. Have an easy way to get a truly minimal amount of JS code, really just code to load the wasm and provide minimal support, leading to something like a multiple of 1K.

I think the first goal is reachable with an incremental improvement approach, but to reach the second, starting "from scratch" in a mode where we add only necessary JS seems more likely to succeed.

jgravelle-google

Contributor

Crazy thought: if the goal is to have a minimal wasm hello world, can we change users' expectations around what that looks like? As a strawman:

#include <js_console.h>

int main() {
  console_log("Hello world!");
  return 0;
}

can generate

(import "env" "console_log" (func $console_log ...))
(data (i32.const 1024) "Hello world!\00")
(func $main (result i32)
  (call $console_log
    (i32.const 1024)
  )
  (i32.const 0)
)

Then the JS glue would just be Pointer_stringify and a wrapper around console.log.
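To make that concrete, here is a minimal sketch of such glue in JavaScript. It is not Emscripten's actual code: `readCString` and `makeImports` are hypothetical names, and the sketch assumes that strings in linear memory are NUL-terminated UTF-8 and that `TextDecoder` is available.

```javascript
// Hypothetical glue for the strawman above; none of these names are Emscripten APIs.

// Read a NUL-terminated UTF-8 string that starts at byte offset `ptr`.
function readCString(heapU8, ptr) {
  let end = ptr;
  while (end < heapU8.length && heapU8[end] !== 0) end++;
  return new TextDecoder('utf-8').decode(heapU8.subarray(ptr, end));
}

// Build the import object for the module. `getHeapU8` is a function so the
// glue still sees the right view if the memory buffer is replaced later.
// `sink` defaults to console.log and is only a parameter to ease testing.
function makeImports(getHeapU8, sink = console.log) {
  return {
    env: {
      console_log: (ptr) => sink(readCString(getHeapU8(), ptr)),
    },
  };
}
```

The object returned by `makeImports` would be passed as the import object when instantiating the module, so the whole JS side stays at roughly a dozen lines.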

The idea being, if people want minimal wasm, they're either doing something web-first, and/or they're experimenting with the wasm format, and C just happens to compile to that today. Supporting printf alone is a huge part of what makes our current hello world so large, between the libc functions it winds up including in the wasm, and the posix/syscall support in JS. If what people really want is "the ability to write to the console from C," then exposing the JS API as directly as possible is going to be the cheapest option.

Member, Author

Yeah, that's very relevant here - people that want minimal output should avoid printf etc. We do already have stuff for them, like emscripten_log, but it takes a simple printf-like format string for convenience (which brings in a few K of support code). Could be nice to add to the HTML5 API something that just gets a string, and so just wraps console.log, like emscripten_console_log as you suggest.

jgravelle-google

Contributor

I wonder if we could save code size by making a C++ API that does string conversion.

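// Sketch only: add_to_inner_buffer, itoa (single-argument form), console_log,
// inner_buffer and clear_inner_buffer are assumed to be declared elsewhere.
class Console {
public:
  // Append a string argument, then handle the remaining arguments.
  template <typename... Rest>
  void log(const char* str, Rest... rest) {
    add_to_inner_buffer(str);
    log(rest...);
  }

  // Convert an integer argument to text, then handle the remaining arguments.
  template <typename... Rest>
  void log(int x, Rest... rest) {
    add_to_inner_buffer(itoa(x));
    log(rest...);
  }

  // Base case: nothing left to append, so flush the buffer to the JS import.
  void log() {
    console_log(inner_buffer);
    clear_inner_buffer();
  }
} console;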

// Later
console.log("Foo = ", foo, ", bar = ", bar);

Anyway all that is to say that I think -s MINIMAL_WASM is worth investigating (I also think we should keep the wasm as small as possible as well as the js). It's a use case that we're seeing people want, that we don't have a great support story for.

curiousdannii

Contributor

If you call printf, that's going to take a lot of code. I wouldn't consider that overhead though, it's the true cost of a powerful C function. Of course it would be good to make it easier to use cheaper logging functions.

I don't think long term it would be helpful to have two modes. If a minimal JS mode is started it should eventually be the default and then the only mode.

Changes I think could help:

Don't pull in the whole of Browser when only some of it is required (Separate Browser.mainLoop from the browser API functions #5355) (potential 30 KB uncompressed saving)
Make file loading less accommodating. Use one set of helper functions for everything. Stick with only one method (fetch?). The user can add polyfills if needed. Be more opinionated about how files are loaded, i.e., a file is either inline or loaded from one set location. If you want to differ from that, you need to manually set an option to add the code to support it.
Similarly, rely on TextDecoder being present; the user can add polyfills if needed.
Maybe use native promises rather than the runtime dependencies system? I imagine that a promise chain would be more straightforward and more concise.
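As a rough sketch of the promise idea (not how the runtime currently works), startup could be expressed so that independent preload tasks run in parallel and main runs once all of them have resolved. `startRuntime`, `preloadTasks`, and `callMain` are illustrative names only.

```javascript
// Hypothetical startup helper. Each preload task is a function returning a
// promise; together they stand in for what the run-dependency counter tracks.
function startRuntime(preloadTasks, callMain) {
  return Promise.all(preloadTasks.map((task) => task()))
    .then((results) => callMain(results));
}

// Example usage with stand-in tasks (a real build would fetch files here).
const started = startRuntime(
  [
    () => Promise.resolve('wasm bytes'),
    () => Promise.resolve('preloaded file'),
  ],
  (results) => results.length
);
```

A failed task rejects the whole chain, so error handling becomes a single `.catch` instead of per-dependency bookkeeping.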

Contributor

I did some experiments on this. The result is pretty long, so I put in a repo. Please see https://github.com/rongjiecomputer/minimal-emscripten

Demo is GameBoy emulator by @binji.

binji

Contributor

@rongjiecomputer Yeah, I specifically made sure that binjgb didn't use much host functionality (GL, audio, etc.) so it was easier to port. I'm not sure many developers will want to (or even be able to) do that. But it's worth exploring that direction -- I really wish I could have a minimal bindings layer like the one that you generated!

Perhaps it would be good to explore which use cases you want to support, and have that point you toward the goals. I can think of a few:

Someone who wants minimal JS glue, and will do everything themselves (this is me)
Someone porting an existing, non-graphical C/C++ library and wants to expose a convenient JS API for the web
Same as previous, but wants to expose the library for node.js
Someone writing new code that wants to use WebAssembly. They want to write mostly normal C++ but are OK using web specific libraries and avoiding POSIX cruft
Someone with an existing portable application, that just wants an easy way to port to the web (emscripten works pretty well for them currently)
Someone who is working on their own language, and wants to use WebAssembly as a backend. They need to write their own runtime that works with their language, but they don't want to redo all the work done in library_*.js files.
Someone who has an existing web application and has pinpointed a hotspot in their code and wants to optimize it. They want an easy way to port this code from JavaScript and drop it in.
Someone who isn't developing for the web, and just wants to use WebAssembly as a portable native executable format. They don't want any JavaScript at all, just the necessary information to create their own host-specific glue code.

And I'm sure there are many more! Thoughts?

Contributor

My experiment is basically option 2 mentioned by @kripken, that is, a new mode starts from nothing and slowly adds more things when needed.

One breaking change that is not user-facing is the use of ES6 template strings instead of the hacky C preprocessor. This allows more complex operations in the template. The downside is that library_*.js will have to be rewritten, and porting all library_*.js files to the new template format might be infeasible. There would be two versions of library_*.js, and I am not sure Emscripten will want that, so this experiment might never make it into Emscripten.

> Someone who wants minimal JS glue, and will do everything themselves (this is me)

This is me as well. In my opinion, this offers the best performance due to fewer wrapper layers and temporary JS objects.

> Someone porting an existing, non-graphical C/C++ library and wants to expose a convenient JS API for the web. Same as previous, but wants to expose the library for node.js.

> Someone who has an existing web application and has pinpointed a hotspot in their code and wants to optimize it. They want an easy way to port this code from JavaScript and drop it in.

I want to support this, but the user is expected to write their own JS code to handle the filesystem as well. Most big C/C++ projects like protobuf do use the filesystem in some functions. The user has to do extra work to make sure these functions are not compiled into the .wasm.

Currently C++ exceptions are not supported, as I have not implemented __cxa_allocate_exception etc. yet. Even the simplest C++ code might not work because of this. I am undecided whether to implement the JS version now (which might be hard to implement, only to be abandoned when WebAssembly exception handling is released) or just wait for WebAssembly exceptions to be implemented in Chrome Canary (need to wait until 2018).

Currently my plan is to let the user call gen.js. gen.js will check whether non-default values of GLOBAL_BASE, TOTAL_MEMORY and TOTAL_STACK are set, pass extra arguments to emcc.py, read the last line of the generated .wast, and then generate the JS code.

I have a final exam to prepare, so I am going to submerge myself for a while. I might still have time to participate in the discussion, but will only have time to work on this some time in December.

Collaborator

> people that want minimal output should avoid printf etc. We do already have stuff for them, like emscripten_log

A nice way to do simple prints from C is

#include <emscripten.h>
int main()
{
  int foo = 42;
  EM_ASM(console.log('hello ' + $0), foo);
}

that is also size efficient (although it could be even smaller if the ASM_CONSTS array was optimized away when not needed).

> Thinking on this some more, maybe we should clarify the target goal. Is it
>
> 1. Reduce the current JS code by a factor of 2, 3, 4..? That would probably still leave a multiple of 10K (uncompressed).
> 2. Have an easy way to get a truly minimal amount of JS code, really just code to load the wasm and provide minimal support, leading to something like a multiple of 1K.

> if the goal is to have a minimal wasm hello world, can we change users' expectations around what that looks like? As a strawman:

I think we could do both, but what I'm saying is that we should have an extremely strong bias toward doing the first item first, until we run out of things to optimize, because refactoring >>> rewriting (the usual Joel on Software and Coding Horror articles etc.). Jumping straight into a rewrite here would feel like a bad engineering call. I don't like the idea of having two runtimes, and scenarios where people end up asking "how do I do this in the new vs. old runtime?", "is this thing X still compatible with the new runtime?", or "will this thing X ever work in the new runtime?". There is nothing fundamentally incompatible or impossible that would prevent the current runtime from being DCEd one item at a time; the only cost is the engineering time to start looking at opportunities to optimize.

Different directions are raising this kind of "hands up" reaction of "what if we just started over with the runtime?", and those thoughts come mainly from not understanding why the undesired lines exist in the runtime, when they are needed, what problems they are there to solve, and what the path would be to optimizing them away. Sure, nobody likes having to deal with overhead from other developers' problems; that is understandable. As first-party developers of all these features that go into the runtime, we do have that knowledge, so we should be able to cut those items down one at a time.

Taking a peek at building the above code example and looking at the output, there is a lot of content there that we have an opportunity to implement better DCE machinery for. If we "started over" but with no better DCE machinery than we currently have, we would still lack the capability to DCE, and would eventually end up in a similar situation, where we don't have the means to DCE undesired code away and then start to implement the exact DCE methods that should probably have been built in the first place. Having flags like -s SUPPORT_ENVIRONMENT_IS_WORKER=0/1 as manual DCE methods does not sound like a bad idea, as long as they are understandable and feature-based (rather than -s NEW_RUNTIME=0/1). Putting preamble.js on a diet, migrating functions from there to --js-libraries, and using perhaps smaller feature-focused slices such as -s POSIX_PROCESS=1 to control whether to emit process argc, argv and exit() support, or -s ENABLE_CCALL=1 for emitting cwrap()/ccall() and similar, would be great. That way developers will have a chance to tune the existing output to the features they need, and we won't create a rift between developers wanting to access features in old vs. new. The var Runtime object also has some fields that look quite archaic and can be removed.

Then after we have slimmed down all that we have any chance to do, we can look at what remains and figure out why those lines are so fundamentally difficult that they could not become manually exportable in some fashion.

I could be wrong, in that the engineering effort to do the above might prove to be too much and we won't be able to pull it off. Still, I think that would be the path most likely to succeed while allowing all features to be used: compartmentalize different features in boxes, then either have the compiler automatically choose which boxes are used, or if that is not possible, do it manually, so that developers pay only for the features they use.
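To illustrate the "boxes" idea in the smallest possible form, here is a sketch in which each feature flag guards one slice of runtime source and only enabled slices are emitted. POSIX_PROCESS and ENABLE_CCALL are the flag names floated above, not existing settings; `FEATURE_BOXES` and `emitRuntime` are made up for this example, and the snippet bodies are placeholders.

```javascript
// Hypothetical feature boxes: one entry per flag-guarded slice of the runtime.
const FEATURE_BOXES = [
  { flag: 'POSIX_PROCESS', source: 'function exit(status) { /* ... */ }' },
  { flag: 'ENABLE_CCALL', source: 'function ccall(name, args) { /* ... */ }' },
];

// Emit only the slices whose flag is truthy in the build settings.
function emitRuntime(settings) {
  return FEATURE_BOXES
    .filter((box) => settings[box.flag])
    .map((box) => box.source)
    .join('\n');
}

// A build that wants ccall() but no process support.
const minimalRuntime = emitRuntime({ POSIX_PROCESS: 0, ENABLE_CCALL: 1 });
```

Whether the flags are set by the user or inferred by the compiler, the emitting side stays the same, which is what makes the incremental route compatible with later automation.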

Collaborator

> Then the JS glue would just be Pointer_stringify

Oh, on this specific note, I want to kill Pointer_stringify in favor of the more explicit UTF8ToString.

binji

Contributor

@juj I think your suggestion of modularizing the current runtime is a good one! I agree that starting over from scratch is probably the wrong way to go. But I'm also a bit concerned about the idea of adding more -s flags to make it happen. IMO the thing that is both nice and annoying about using emscripten is that it tries to figure out what you want and provides opt-out compiler flags otherwise. It means that the runtime is provided privileged information, so it is difficult for a developer to craft their own. I'd much rather see the layers that the standard runtime provides use a well-defined interface that can be easily replaced by the developer. This "keeps you honest" in terms of what is possible, and provides the opportunity for all of the users I listed in my comment above to be able to use emscripten the way they want.

Member, Author

@rongjiecomputer - thanks for sharing that experiment! Very interesting and helpful to think about this.

I think that experiment shows it is possible to offer a "minimal JS" runtime option. The details are tricky, though - I would suggest different stuff be enabled in that mode than in the experiment ;) - which perhaps proves one of @juj's points.

@binji - what type of interfaces do you mean here? And what type of use cases do you have in mind for developers replacing parts of the runtime?

binji

Contributor

> what type of interfaces do you mean here?

Basically just imports/exports to each layer. The compiler provides an initial layer, and each additional layer depends on a previous one. Should be analogous to a standard module system, though probably won't be exactly that. So something like cwrap depends on very little, and something like the filesystem layer depends on more, and the IndexedDB layer even more, etc. It seems like you have a lot of this behavior already implemented w/ the --js-library stuff, perhaps it can just extend to more of the runtime, and maybe it can be more granular.
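One way to picture such layering, purely as a sketch: each layer names the layers it imports and returns what it exports, and only the layers actually requested (plus their dependencies) get built. `buildRuntime` and the `core`/`strings`/`fs` layers below are invented for illustration and do not correspond to existing Emscripten modules.

```javascript
// Hypothetical layer resolver: builds the requested layers and, recursively,
// whatever they depend on. Anything not reachable is never created.
function buildRuntime(layers, wanted) {
  const built = {};
  const build = (name, stack = []) => {
    if (built[name]) return built[name];
    const layer = layers[name];
    if (!layer) throw new Error(`unknown layer: ${name}`);
    if (stack.includes(name)) throw new Error(`cyclic layer dependency: ${name}`);
    const deps = {};
    for (const dep of layer.requires) deps[dep] = build(dep, [...stack, name]);
    built[name] = layer.create(deps);
    return built[name];
  };
  wanted.forEach((name) => build(name));
  return built;
}

const exampleLayers = {
  core: { requires: [], create: () => ({ heap: new Uint8Array(64) }) },
  strings: {
    requires: ['core'],
    create: ({ core }) => ({
      // Copy UTF-8 bytes into the heap and return how many were written.
      write(ptr, text) {
        const bytes = new TextEncoder().encode(text);
        core.heap.set(bytes, ptr);
        return bytes.length;
      },
    }),
  },
  fs: { requires: ['core', 'strings'], create: () => ({ files: new Map() }) },
};

// Asking only for `strings` pulls in `core` but leaves `fs` out entirely.
const exampleRuntime = buildRuntime(exampleLayers, ['strings']);
```

Because a layer only sees what its declared imports give it, a developer could swap in their own `strings` or `fs` without the rest of the runtime needing privileged knowledge.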

> And what type of use cases do you have in mind for developers replacing parts of the runtime?

Well, I gave some examples of use cases above, but personally I don't need a lot of the features, so all I want is enough to get the C/C++ code off the ground (set up linear memory, run static initializers, etc.) and then allow me to call into the module. I typically won't use many C library or POSIX features (probably just printf) and will instead plumb through my own functions.

Contributor

@juj I also agree that starting over from scratch may be bad, because we will end up supporting two runtime versions, adding to the already heavy maintenance burden.

Due to the lack of DCE and the fact that everything is exposed globally, whatever we do here will most likely break someone's code.

> But I'm also a bit concerned about the idea of adding more -s flags to make it happen.

I share the same concern. New users won't know all these flags and will be completely puzzled about why their hello world is large even though they don't use ccall etc.

Any thoughts about the possibility of using ES6 template string as template engine instead of C preprocessor and {{{ }}}?

Advantage: Python script will only need to provide Settings.* to Node.js script and let Node.js do the actual template generation. We might be able to do more DCE.
Disadvantage: Massive refactoring of Python and Node.js scripts. library_*.js also needs to be rewritten. Can't be done in a single commit.
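For a sense of what that could look like, here is a small sketch in which a plain JavaScript function receives the settings and renders a piece of the preamble with a template string. `renderPreamble` is a made-up name and the emitted lines are simplified placeholders, not the real preamble.js contents.

```javascript
// Hypothetical template function: the Python side would only hand over the
// settings object, and Node.js would do the actual text generation.
function renderPreamble(settings) {
  // Optional block, emitted only when ASSERTIONS is enabled.
  const assertBlock = settings.ASSERTIONS
    ? `function assert(cond, msg) { if (!cond) throw new Error(msg); }`
    : '';
  return `var TOTAL_MEMORY = ${settings.TOTAL_MEMORY};
var TOTAL_STACK = ${settings.TOTAL_STACK};
${assertBlock}`.trim();
}

const releasePreamble = renderPreamble({
  TOTAL_MEMORY: 16777216,
  TOTAL_STACK: 5242880,
  ASSERTIONS: 0,
});
```

Since the template is ordinary code, conditions and loops need no special preprocessor syntax, which is the main appeal compared with #if blocks and {{{ }}} substitutions.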

Some of the things I want to kill:

STACK_ALIGN, QUANTUM_SIZE.
A lot of things from Runtime.
loadWebAssemblyModule, integrateWasmJS (we should just provide importObj and let users load/compile/instantiate/run WebAssembly themselves at their chosen time).
Module.{preRun,postRun,preMain,callMain,run,addOnPreRun,addOnExit,...} (since users will be running code themselves)
Module["asm"] (ditto, users get functions directly from instance.exports)
addRunDependency etc.
All Math polyfills.
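As a sketch of what "provide importObj and let users instantiate themselves" might look like from the user's side, the example below uses only the standard WebAssembly JS API. The byte array is a tiny hand-assembled module made up for this illustration (it imports `env.log` taking one i32, and exports `main`, which calls `log(42)` and returns 0); `makeImportObj` and `runMain` are hypothetical names.

```javascript
// Hand-assembled demo module, an assumption for illustration only.
const TINY_WASM = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x09, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x01, 0x7f, // types: (i32)->(), ()->i32
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import env.log
  0x03, 0x02, 0x01, 0x01, // one function, of type 1
  0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x01, // export "main" (func 1)
  0x0a, 0x0a, 0x01, 0x08, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x41, 0x00, 0x0b, // body of main
]);

// The generated glue would hand the user something like this object.
function makeImportObj(log) {
  return { env: { log } };
}

// The user decides when to compile, instantiate, and call into the module.
async function runMain(importObj) {
  const { instance } = await WebAssembly.instantiate(TINY_WASM, importObj);
  return instance.exports.main();
}
```

With this shape there is no Module["asm"], run(), or dependency counter involved: the exports come straight from `instance.exports`.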


mentioned this

on Nov 22, 2017

Investigate ES6 modules as our internal components #5828

Member, Author

After some offline discussion with @lukewagner I opened #5828 for discussion of the modularization approach.

For this issue, I think we should make sure that what we decide makes sense with the long-term goal of having ES6-module-based output, as mentioned there:

Option 1 (incremental shrinking, putting stuff behind flags, etc.) is mostly orthogonal to that goal. It might help in that while putting things behind a flag they are more separable, so easier to put into a literal ES6 module later.
Option 2 (new codegen mode for minimal JS) might be more tied to modularization, since the new codegen mode could be a step towards ES6 modules. So far the discussion into that option hasn't gotten into details related to that, though, so it's also currently orthogonal.

Contributor

For now, I can accept the plan to just guard more components with -s, starting from the easier ones like the Math polyfills, TypedArray detection, etc.

I like the idea of --export=ccall --export=cwrap (or --export=ccall,cwrap,preRun) instead of -s EXTRA_RUNTIME_FUNCTIONS_TO_EXPORT=['ccall'].

> > Any thoughts about the possibility of using ES6 template string as template engine instead of C preprocessor and {{{ }}}?
>
> I think "in addition to", rather than "instead", unless this can be proven to be superior in practice? We will migrate to the latest Node.js LTS in emsdk in the next tagged release, so this will at least be available out of the box.

"In addition to" sounds good. runtime.js definitely can use some cleanup with ES6 template string, then more possible cleanups can be explored.

Collaborator

> There will be some breakage. We can mitigate so that it is minimally surprising, but users may need to change some things. E.g. we may export fewer things by default, like ccall/cwrap in the example above.
> How much breakage we accept will depend on code size wins. I think we should set a target for the size JS we emit (for, say, hello world), and that should be a factor in deciding if a particular breakage is worth it. (I don't know what the size target should be.)

I think rather than "how much code size it wins", I see it as "how does one deal with the breakage?". There are a lot of different types of breakage, and this is something we analyze constantly when communicating with external parties. When there's an easy "you'll start to get a compilation/runtime error about X, so then do Y" model to follow, it's not much of an issue to disrupt users, since the path to action is clear. Even if you don't save too many characters with such a change, it's easy to justify if it simplifies things, since the action to resolve it is easy.

A more difficult one is when there's a bidirectional breakage: old code will not compile/work on a newer compiler version, and the fixed code will not compile/work on an older compiler version. We had this with the Wasm debug table formats change. This is much more annoying than the above, because one can't write any kind of "one ideal form of code" that would be compatible with different versions. We want to avoid bidirectional breakages whenever possible, since this impacts distribution update paths in ecosystems. If it's not possible, then we should aim to be diligent to identify where such bidirectional breakages lie, so that those will be easy to discover.

Removing the Module.{preRun,postRun,preMain,callMain,run,addOnPreRun,addOnExit,...} architecture is probably next on the list in terms of difficulty. Completely doable, and as a feature I think well "blocked out" into its own box to refactor. However this will need explainers, migration guides, or similar that instruct developers how to proceed. There is no love lost if we deleted these altogether at some point, for example in favor of the now newer Promise.then mechanisms, but we need to make sure to help people have a path to migrating to this.

Then there are the changes where we know something will break, but do not want to think about how it will manifest, which existing features are (in)compatible with the changes, how to migrate, or debug. These are in the red flag zone - users will get angry if they will need to research new breakages after someone else's PR lands and if they discover the breakage was not an accident but intended.

Shrinking code size and ES6 modularization are very orthogonal; we can certainly add ES6 module structure even without putting any effort into shrinking code at all. Both of these features can be worked on in parallel as well.

Member, Author

@juj - good points, agreed. I'd add that I think many of the changes here could either have compile-time error messages, or if not, then at least run-time errors, something like this:

An existing thing (ccall, dynCall_*, etc.) is no longer exported by default on Module (so our DCE can remove it automatically from the final JS).
When building with ASSERTIONS, we emit a stub for the thing we removed (on Module, in these examples), basically an assertion that it is never called, saying you need to export it if you want to use it. This increases code size in ASSERTIONS mode, but that's ok. And we already encourage people to use that mode when things don't work well, so there is a reasonable path for users to get a clear error message telling them what to do.
We add tests checking that exporting it works, and that using it without an export in ASSERTIONS mode shows the message.
As mentioned in the previous comment, we mention these changes on the mailing list and in the docs, etc.

I think doing that (+ compile-time errors when possible etc.) would reasonably mitigate the breaking changes we are proposing here.
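A minimal sketch of the stub idea described above, assuming a plain `Module` object and a list of names that are no longer exported by default. `addMissingExportStubs` is a hypothetical helper, and the error text is only an example of the kind of message an ASSERTIONS build could show.

```javascript
// Hypothetical helper: in ASSERTIONS builds, install a throwing stub for each
// runtime method that was removed from the default exports.
function addMissingExportStubs(Module, notExported, assertions) {
  if (!assertions) return Module; // non-ASSERTIONS builds pay no size cost
  for (const name of notExported) {
    if (!(name in Module)) {
      Module[name] = function () {
        throw new Error(
          `'${name}' was not exported. Add it to the list of exported runtime methods when building.`
        );
      };
    }
  }
  return Module;
}
```

Calling `Module.ccall(...)` in such a build then fails immediately with an actionable message instead of an opaque "undefined is not a function".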

mentioned this

on Nov 22, 2017

JS and wasm shrinking (testcase #1) #5836

Member, Author

Opened #5836 with some data on the JS we emit for one testcase, and a list of tasks for it.

mentioned this

on Nov 27, 2017

Run multiple iterations of JSDCE, as more may be removable #5855

floooh

Collaborator

Interesting thread! I share the concern about adding more -s flags. Main reasons:

it leaks out into (and basically requires) a build system, which in general is the 'accepted' way of things, but adds a layer of complexity. A good example is marking a function inline as EMSCRIPTEN_KEEPALIVE vs. the -s EXPORTED_FUNCTIONS option: the EMSCRIPTEN_KEEPALIVE way is infinitely better. A similar example is Visual Studio's #pragma comment(lib, "xxx") versus adding a linker command line option.
especially smaller demos often don't even need a build system. If header-only libs are used (like the examples here: https://github.com/floooh/sokol-samples/tree/master/html5), even non-trivial programs can be built directly with emcc bla.c -o bla.html -O3. Adding a lot of options to get the smallest possible build would make this much less appealing.

I think the ability to build without a build system is especially important for beginners (in the sense of "people new to WASM/asm.js"), or when building WASM modules for an app that's primarily written in JS.

So from my point of view, a way to move some of these configuration options into the source code would be highly appreciated (although I have no good answers for how to achieve this, EMSCRIPTEN_KEEPALIVE and the custom #pragmas point in the right direction).

mentioned this

on Nov 30, 2017

NO_EXIT_RUNTIME by default #5878