Description
We've historically focused on full support of existing C/C++ code by default, which leads to us emitting a bunch of things that increase the default JS code size, like
- longjmp and exceptions support (dynCall/invoke glue)
- atexit support
- Filesystem support
- Various POSIX things (like `/dev/random`).
- Ability to run on the web, node.js, various js shells
- ccall, string stuff, etc. utilities for JS convenience
- etc.
As a result the JS emitted for "hello world" is not tiny.
In a medium or large project the compiled code is the large bulk anyhow and we have very good code size optimization there. But people do notice the JS size on small programs, and with wasm increasing the interest in compiling to the web, this has been coming up.
I've been thinking that dynamic linking might get us there, as we emit no JS for a standalone wasm dynamic library (SIDE_MODULE). However, dynamic libraries add relocation (unnecessary for many things) and we don't necessarily want 0 JS, we want "minimal" JS. So dynamic libraries are not the solution here.
Two other possible paths we could go down:
- Audit the current JS output and fix things one by one. That is, look at what we emit, and starting from the larger things, find out if we can't optimize it out or put it behind a flag, etc. To really make progress that way, we may need breaking changes, but e.g. putting longjmp support behind a flag could be ok as long as we issue a clear message for developers (`your program uses longjmp, you need to compile with -s LONGJMP_SUPPORT=1`).
- Introduce a new codegen mode, `MINIMAL_JS` perhaps, which we would design from scratch to be minimal - we'd start from nothing and add just the things we want in that mode. It wouldn't support things like a filesystem or POSIX or atexit etc. (and we'd need to decide on exceptions and longjmp, asm.js or just wasm, etc.). We'd point people to the "normal" non-minimal JS for those things.
Thoughts?
Activity
juj commented on Nov 16, 2017
I think before we do this, there's a lot we can optimize with feature-specific flags and other methods. Not too long ago I wrote a tiny chess game in GLES2, at https://github.com/juj/tiny_chess, and looking at its build output, in one evening I was able to remove about 2/3rds of the runtime boilerplate as unnecessary (from uncompressed 150KB down to uncompressed 50KB). There does exist a lot of low hanging fruit here, and I think it would be best to start splicing off these types of items on a feature basis, that way it won't be a wholesale "old runtime" or "new runtime" question. Longjmp support is one such example. Another thing is that I'd like to start slimming down individual items in the `Module` object, which don't DCE too well.
kripken commented on Nov 17, 2017
Good points. Thinking on this some more, maybe we should clarify the target goal. Is it
I think the first goal is reachable with an incremental improvement approach, but to reach the second, starting "from scratch" in a mode where we add only necessary JS seems more likely to succeed.
jgravelle-google commented on Nov 17, 2017
Crazy thought: if the goal is to have a minimal wasm hello world, can we change users' expectations around what that looks like? As a strawman:
can generate
Then the JS glue would just be `Pointer_stringify` and a wrapper around `console.log`.
The idea being, if people want minimal wasm, they're either doing something web-first, and/or they're experimenting with the wasm format, and C just happens to compile to that today. Supporting `printf` alone is a huge part of what makes our current hello world so large, between the libc functions it winds up including in the wasm, and the posix/syscall support in JS. If what people really want is "the ability to write to the console from C," then exposing the JS API as directly as possible is going to be the cheapest option.
kripken commented on Nov 17, 2017
Yeah, that's very relevant here - people that want minimal output should avoid printf etc. We do already have stuff for them, like `emscripten_log`, but it takes a simple printf-like format string for convenience (which brings in a few K of support code). Could be nice to add to the HTML5 API something that just gets a string, and so just wraps `console.log`, like `emscripten_console_log` as you suggest.
jgravelle-google commented on Nov 17, 2017
I wonder if we could save code size by making a C++ API that does string conversion.
Anyway all that is to say that I think `-s MINIMAL_WASM` is worth investigating (I also think we should keep the wasm as small as possible as well as the js). It's a use case that we're seeing people want, that we don't have a great support story for.
curiousdannii commented on Nov 18, 2017
If you call printf, that's going to take a lot of code. I wouldn't consider that overhead though, it's the true cost of a powerful C function. Of course it would be good to make it easier to use cheaper logging functions.
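A cheaper logging path of the kind discussed above could be sketched roughly as follows. This is only an illustration, not Emscripten's actual glue: the names `readCString` and `makeConsoleLog` are hypothetical, and the sketch assumes the C side passes a pointer to a NUL-terminated UTF-8 string in linear memory.

```javascript
// Hypothetical helper: decode a NUL-terminated UTF-8 string that starts at
// byte offset `ptr` inside `heap` (a Uint8Array view of the linear memory).
function readCString(heap, ptr) {
  let end = ptr;
  while (end < heap.length && heap[end] !== 0) end++;
  return new TextDecoder("utf-8").decode(heap.subarray(ptr, end));
}

// Hypothetical import factory: returns the function a wasm module could
// import to log a C string. `sink` defaults to console.log.
function makeConsoleLog(heap, sink = console.log) {
  return (ptr) => sink(readCString(heap, ptr));
}

// Usage with a plain Uint8Array standing in for wasm memory.
const heap = new Uint8Array(64);
heap.set(new TextEncoder().encode("hello, world\0"), 8);
makeConsoleLog(heap)(8); // logs the string stored at offset 8
```

The whole JS side is a string decoder plus a one-line wrapper, which is the point being made in the thread: no format-string parsing and no syscall layer are involved.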
I don't think long term it would be helpful to have two modes. If a minimal JS mode is started it should eventually be the default and then the only mode.
Changes I think could help:
rongjiecomputer commented on Nov 19, 2017
I did some experiments on this. The result is pretty long, so I put in a repo. Please see https://github.com/rongjiecomputer/minimal-emscripten
Demo is GameBoy emulator by @binji.
binji commented on Nov 19, 2017
@rongjiecomputer Yeah, I specifically made sure that binjgb didn't use much host functionality (GL, audio, etc.) so it was easier to port. I'm not sure many developers will want to (or even be able to) do that. But it's worth exploring that direction -- I really wish I could have a minimal bindings layer like the one that you generated!
Perhaps it would be good to explore which use cases you want to support, and have that point you toward the goals. I can think of a few:
- […] `library_*.js` files.

And I'm sure there are many more! Thoughts?
rongjiecomputer commented on Nov 20, 2017
My experiment is basically option 2 mentioned by @kripken, that is, a new mode starts from nothing and slowly adds more things when needed.
A non-user-facing breaking change is the use of ES6 template strings instead of the hacky C preprocessor. This allows more complex operations in the template. The downside is that `library_*.js` will have to be rewritten, and porting all `library_*.js` to the new template format might be infeasible. There will be two versions of `library_*.js`, and I am not sure Emscripten will want that, so this experiment might never make it into Emscripten.
This is me as well. In my opinion, this offers the best performance due to fewer wrapper layers and temporary JS objects.
I want to support this, but the user is expected to write their own JS code to handle the filesystem as well. Most big C/C++ projects like `protobuf` do use the filesystem in some functions. The user has to do extra work to make sure these functions are not compiled into `.wasm`.
Currently C++ exceptions are not supported as I have not implemented `__cxa_allocate_exception` etc. yet. Even the simplest C++ code might not work due to this. I am hesitating whether to implement the JS version now (which might be hard to implement, only to be abandoned when WebAssembly exception is released) or just wait for WebAssembly exception to be implemented in Chrome Canary (need to wait until 2018).
Currently my plan is to let the user call `gen.js`; `gen.js` will check whether non-default values of `GLOBAL_BASE`, `TOTAL_MEMORY` and `TOTAL_STACK` are set, pass extra arguments to `emcc.py`, read the last line of the generated `.wast`, then generate the JS code.
I have a final exam to prepare, so I am going to submerge myself for a while. I might still have time to participate in the discussion, but will only have time to work on this some time in December.
juj commented on Nov 20, 2017
A nice way to do simple prints from C is
that is also size efficient (although could be even smaller if `ASM_CONSTS` array was optimized away when not needed).
I think we could do both; but what I'm saying is that we should have an extremely strong bias to doing the first item first until we run out of items to optimize, because refactoring >>> rewriting (the usual Joel on Software and Coding Horror articles etc.). Jumping on first foot forward to rewrite here would feel like a bad engineering call. I don't like the idea of having two runtimes and scenarios where we would end up with people considering "how do I do this in the new vs old runtime?", or "is this thing X still compatible with the new runtime?", or "will this thing X ever work in new runtime?". There is nothing fundamentally incompatible or impossible why the current runtime could not be DCEd one item at a time, except the engineering time to start looking at opportunities to optimize.
Different directions are raising these kind of "hands up" looks with "what if we just started over with the runtime?", and those kinds of thoughts come mainly from not understanding why the undesired lines exist in the runtime, when they are needed, what problems they are there for to solve, and what the path would be to optimizing them away. Sure, nobody likes having to deal with overhead from other developers' problems, that is understandable. As first party developers of all these features that go into the runtime, we do have that knowledge, so we should be able to cut those items down one at a time.
Taking a peek at building the above code example and looking at the output, there is a lot of content there that we have an opportunity to implement better DCE machinery for. If we "started over" but with no better DCE machinery than we currently have, we would still be lacking the capability to DCE, and would end up in a similar situation eventually, where we don't have the means to DCE undesired code away, and will then start to implement the exact DCE methods that should probably have been built in the first place. Having flags `-s SUPPORT_ENVIRONMENT_IS_WORKER=0/1` as manual DCE methods does not sound like a bad idea as long as they are understandable feature-based (rather than `-s NEW_RUNTIME=0/1`); putting `preamble.js` on diet, migrating functions from there to --js-libraries, and using perhaps smaller feature-focused slices such as `-s POSIX_PROCESS=1` to control whether to emit process `argc`, `argv` and `exit()` support, `-s ENABLE_CCALL=1` for emitting `cwrap()/ccall()` and similar would be great. That way developers will have a chance to tune the existing output to which features they will need, and we won't create a rift between developers wanting to access features in old vs new. The `var Runtime` object has some fields that look quite archaic that can be removed.
Then after we have slimmed down all that we have any chance to do, we can look at what remains and figure out why those lines are so fundamentally difficult that they could not become manually exportable in some fashion.
I could be wrong in that the engineering effort to do the above might prove to be too much, that we won't be able to pull it off. Though I think that would be the path to best success that would allow all features to be used - compartmentalize different features in boxes, then either automatically have the compiler choose which boxes are used, or if not possible, do it manually, so that developers have the options to exactly pay only for which features they use.
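The "boxes" idea above can be illustrated with a small sketch. Everything here is assumed for illustration: the setting names are the ones proposed in this comment rather than existing Emscripten flags, the byte counts are made up, and `selectFeatures` is a hypothetical helper, not part of the compiler.

```javascript
// Hypothetical "boxes": each runtime feature, the setting that enables it,
// and a made-up size so the effect of a flag is visible.
const RUNTIME_FEATURES = [
  { name: "core", setting: null, bytes: 2000 },
  { name: "ccall/cwrap", setting: "ENABLE_CCALL", bytes: 1500 },
  { name: "argc/argv/exit", setting: "POSIX_PROCESS", bytes: 1200 },
  { name: "worker detection", setting: "SUPPORT_ENVIRONMENT_IS_WORKER", bytes: 800 },
];

// Keep a feature when it has no controlling setting or its setting is 1.
function selectFeatures(settings) {
  return RUNTIME_FEATURES.filter(
    (f) => f.setting === null || settings[f.setting] === 1
  );
}

// Sum the (made-up) sizes of the selected features.
function totalBytes(features) {
  return features.reduce((sum, f) => sum + f.bytes, 0);
}

const slim = selectFeatures({ ENABLE_CCALL: 1 });
console.log(slim.map((f) => f.name), totalBytes(slim));
```

Each flag maps to one understandable feature, so a developer pays only for the boxes they switch on, and there is no separate "new runtime" to reason about.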
juj commented on Nov 20, 2017
Oh, on this specific note, I want to kill `Pointer_stringify` in favor of the more explicit `UTF8ToString`.
binji commented on Nov 20, 2017
@juj I think your suggestion of modularizing the current runtime is a good one! I agree that starting over from scratch is probably the wrong way to go. But I'm also a bit concerned about the idea of adding more `-s` flags to make it happen. IMO the thing that is both nice and annoying about using emscripten is that it tries to figure out what you want and provides opt-out compiler flags otherwise. It means that the runtime is provided privileged information, so it is difficult for a developer to craft their own. I'd much rather see the layers that the standard runtime provides use a well-defined interface that can be easily replaced by the developer. This "keeps you honest" in terms of what is possible, and provides the opportunity for all of the users I listed in my comment above to be able to use emscripten the way they want.
kripken commented on Nov 20, 2017
@rongjiecomputer - thanks for sharing that experiment! Very interesting and helpful to think about this.
I think that experiment shows it is possible to offer a "minimal JS" runtime option. The details are tricky, though - I would suggest different stuff be enabled in that mode than in the experiment ;) - which perhaps proves one of @juj's points.
@binji - what type of interfaces do you mean here? And what type of use cases do you have in mind for developers replacing parts of the runtime?
binji commented on Nov 21, 2017
Basically just imports/exports to each layer. The compiler provides an initial layer, and each additional layer depends on a previous one. Should be analogous to a standard module system, though probably won't be exactly that. So something like cwrap depends on very little, and something like the filesystem layer depends on more, and the IndexedDB layer even more, etc. It seems like you have a lot of this behavior already implemented w/ the `--js-library` stuff, perhaps it can just extend to more of the runtime, and maybe it can be more granular.
Well, I gave some examples about some use cases above, but personally I don't need a lot of the features, so all I want is enough to get the C/C++ code off the ground (setup linear memory, run static initializers, etc) then allow me to call into the module. I typically won't use many C library or POSIX features (probably just printf) and will instead plumb through my own functions.
rongjiecomputer commented on Nov 21, 2017
@juj I also agree that starting over from scratch may be bad because we will end up supporting two runtime versions, adding to the already heavy maintenance burden.
Due to lack of DCE and the fact that everything is exposed in global, whatever we do here will most likely break someone's code.
I share the same concern. New users won't know all these flags and will be completely puzzled why their hello world is large even though they don't use `ccall` etc.
Any thoughts about the possibility of using ES6 template string as template engine instead of C preprocessor and `{{{ }}}`? `library_*.js` also needs to be rewritten. Can't be done in a single commit.
Some of the things I want to kill:
- `STACK_ALIGN`, `QUANTUM_SIZE`.
- `Runtime`, `loadWebAssemblyModule`, `integrateWasmJS` (We should just provide `importObj` and let users load/compile/instantiate/run WebAssembly themselves at their chosen time).
- `Module.{preRun,postRun,preMain,callMain,run,addOnPreRun,addOnExit,...}` (since users will be running code themselves)
- `Module["asm"]` (ditto, users get functions directly from `instance.exports`)
- `addRunDependency` etc.
- `Math` polyfills.
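To make the `importObj` / `instance.exports` suggestion concrete, here is a minimal sketch using only the standard WebAssembly JS API. The module bytes are hand-assembled for this example (an assumption, not compiler output): the module imports `env.log(i32)` and exports `run()`, which calls `log(42)`.

```javascript
// Hand-assembled module (an assumption made for this sketch): it imports
// env.log(i32) and exports run(), which calls log(42).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import env.log
  0x03, 0x02, 0x01, 0x01, // one function, of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run"
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body of run
]);

// The user supplies the import object and decides when to instantiate.
const logged = [];
const importObj = { env: { log: (value) => logged.push(value) } };

const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule, importObj);

// Exports are called directly; no Module["asm"] indirection is involved.
instance.exports.run();
console.log(logged);
```

Synchronous compilation is used only to keep the sketch short; for larger modules on the web, `WebAssembly.instantiate` or `WebAssembly.instantiateStreaming` would be the usual choice.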
kripken commented on Nov 22, 2017
After some offline discussion with @lukewagner I opened #5828 for discussion of the modularization approach.
For this issue, I think we should make sure that what we decide makes sense with the long-term goal of having ES6-module-based output, as mentioned there:
rongjiecomputer commented on Nov 22, 2017
For now, I can accept the plan to just guard more components with `-s`, starting from the easier ones like Math polyfill, TypedArray detecting etc.
I like the idea of `--export=ccall --export=cwrap` (or `--export=ccall,cwrap,preRun`) instead of `-s EXTRA_RUNTIME_FUNCTIONS_TO_EXPORT=['ccall']`.
"In addition to" sounds good.
`runtime.js` definitely can use some cleanup with ES6 template string, then more possible cleanups can be explored.
juj commented on Nov 22, 2017
I think rather than "how much code size it wins", I see it as "how does one deal with the breakage?". There are a lot of different types of breakage, and something we analyze constantly when communicating to external parties. When there's an easy "you'll start to get a compilation/runtime error about X, so then do Y" model to follow, it's not much of an issue to disrupt users, since the path to action is clear. Even if you don't save too many characters with this, it's easy to justify if it simplifies, since the action to resolve is easy.
A more difficult one is when there's a bidirectional breakage: old code will not compile/work on a newer compiler version, and the fixed code will not compile/work on an older compiler version. We had this with the Wasm debug table formats change. This is much more annoying than the above, because one can't write any kind of "one ideal form of code" that would be compatible with different versions. We want to avoid bidirectional breakages whenever possible, since this impacts distribution update paths in ecosystems. If it's not possible, then we should aim to be diligent to identify where such bidirectional breakages lie, so that those will be easy to discover.
Removing the `Module.{preRun,postRun,preMain,callMain,run,addOnPreRun,addOnExit,...}` architecture is probably next on the list in terms of difficulty. Completely doable, and as a feature I think well "blocked out" into its own box to refactor. However this will need explainers, migration guides, or similar that instruct developers how to proceed. There is no love lost if we deleted these altogether at some point, for example in favor of the now newer `Promise.then` mechanisms, but we need to make sure to help people have a path to migrating to this.
Then there are the changes where we know something will break, but do not want to think about how it will manifest, which existing features are (in)compatible with the changes, how to migrate, or debug. These are in the red flag zone - users will get angry if they will need to research new breakages after someone else's PR lands and if they discover the breakage was not an accident but intended.
Shrinking code size and ES6 modularization are very orthogonal, we can certainly add ES6 module structure even without putting any effort in to shrinking code at all. Both of these features can be worked in parallel as well.
kripken commented on Nov 22, 2017
@juj - good points, agreed. I'd add that I think many of the changes here could either have compile-time error messages, or if not, then at least run-time errors, something like this:
- […] (`ccall`, `dynCall_*`, etc.) is no longer exported by default on `Module` (so our DCE can remove it automatically from the final JS).
- […] `ASSERTIONS`, we emit a stub for the thing we removed (on `Module`, in these examples), basically an assertion that it is never called, saying you need to export it if you want to use it. This increases code size in `ASSERTIONS` mode, but that's ok. And we already encourage people to use that mode when things don't work well, so there is a reasonable path for users to get a clear error message telling them what to do.
- […] `ASSERTIONS` mode shows the message.

I think doing that (+ compile-time errors when possible etc.) would reasonably mitigate the breaking changes we are proposing here.
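The stub mechanism described above might look something like the following. This is a sketch, not the emitted runtime: `attachRuntimeFunctions` is a hypothetical name, the stand-in `ccall`/`cwrap` bodies are placeholders, and the wording of the error message is invented for illustration.

```javascript
// Hypothetical build step: attach to `Module` either the real function (when
// exported) or, in ASSERTIONS mode, a stub that explains how to export it.
function attachRuntimeFunctions(Module, available, exportedNames, assertions) {
  for (const [name, fn] of Object.entries(available)) {
    if (exportedNames.includes(name)) {
      Module[name] = fn;
    } else if (assertions) {
      Module[name] = () => {
        throw new Error(
          `'${name}' was not exported. Add it to the exported runtime ` +
            `functions when compiling if you want to use it.`
        );
      };
    }
    // Without ASSERTIONS nothing is attached, so the function can be dropped.
  }
  return Module;
}

// Placeholder implementations standing in for the real runtime functions.
const available = { ccall: () => "called ccall", cwrap: () => "called cwrap" };
const Module = attachRuntimeFunctions({}, available, ["ccall"], true);

console.log(Module.ccall());
try {
  Module.cwrap();
} catch (e) {
  console.log(e.message);
}
```

The stub costs a little code size in `ASSERTIONS` builds only, in exchange for an error that tells the developer exactly what to change.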
kripken commented on Nov 22, 2017
Opened #5836 with some data on the JS we emit for one testcase, and a list of tasks for it.
floooh commented on Nov 28, 2017
Interesting thread! I share the concern about adding more -s flags. Main reasons:
- […] `EMSCRIPTEN_KEEPALIVE` vs the `-s EXPORTED_FUNCTIONS`, the EMSCRIPTEN_KEEPALIVE way is infinitely better. A similar example is Visual Studio's `#pragma comment(lib, "xxx")` versus adding a linker command line option
- […] `emcc bla.c -o bla.html -O3`, adding a lot of options to get the smallest possible build would make this much less appealing

I think the ability to build without a build system is especially important for beginners (in the sense of "people new to WASM/asm.js", or when building WASM modules for an app that's primarily written in JS).
So from my point of view, a way to move some of these configuration options into the source code would be highly appreciated (although I have no good answers for how to achieve this, EMSCRIPTEN_KEEPALIVE or the custom #pragmas point in the right direction).
NO_EXIT_RUNTIME by default (#5878)
stale commented on Sep 19, 2019
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.