Generate WebAssembly code from C/C++ code [wasm JIT] #7082
Comments
This is definitely an important feature. You can do this today, but with some overhead - in time, wasm is expected to add features to make this better.
The way to do this now is to create another wasm module at runtime, passing it the same Memory and Table as the original. Then that new module can add its functions to that Table, and then the original code can call them (indirectly, using a function pointer) and vice versa.
To put it another way, Emscripten already supports loading dynamic libraries containing wasm, and they use the same mechanism - new wasm loaded at runtime, sharing the Memory and Table, and calls between modules are done by function pointers. (Dynamically loaded libraries can also import functions from the original code directly, avoiding the cost of an indirect call.) So basically in the previous paragraph I was describing creating a tiny dynamic library at runtime in a JIT manner.
Practically speaking, to do this you need some sort of library you can run on the client that can emit a wasm file (that is, handle all the binary encoding details). One option here is Binaryen, which can also convert basic blocks + branches to wasm (which has structured control flow), so it can function as a compiler backend in a sense.
I've actually been hoping to find time to do a real-world example of this (perhaps on PyPy), but haven't gotten around to it, yet. If you explore this area I'd be very interested to help out.
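For readers who want to see the mechanism in isolation, here is a minimal sketch in plain JavaScript, not taken from Emscripten or Binaryen. It hand-encodes a one-function wasm module that imports `env.memory` and `env.table`, instantiates it against an existing Memory and Table, and calls the new function through a table slot. The helper names (`uleb`, `str`, `vec`, `section`, `funcType`, `buildAdderModule`) and the chosen slot number are assumptions made up for this example; a real JIT would more likely use a library such as Binaryen to emit the binary.

```javascript
// Unsigned LEB128, used for sizes and indices in the wasm binary format.
function uleb(n) {
  const out = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80;
    out.push(byte);
  } while (n !== 0);
  return out;
}
// Length-prefixed ASCII name, length-prefixed vector, and section wrapper.
const str = (s) => [...uleb(s.length), ...[...s].map((c) => c.charCodeAt(0))];
const vec = (items) => [...uleb(items.length), ...items.flat()];
const section = (id, body) => [id, ...uleb(body.length), ...body];

const I32 = 0x7f;
const funcType = (params, results) =>
  [0x60, ...vec(params.map((p) => [p])), ...vec(results.map((r) => [r]))];

// Hypothetical "JIT output": a module with a single function add(a, b) = a + b.
function buildAdderModule() {
  const body = [0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b]; // no locals; local.get 0; local.get 1; i32.add; end
  return new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
    ...section(1, vec([funcType([I32, I32], [I32])])), // type 0: (i32, i32) -> i32
    ...section(2, vec([
      [...str('env'), ...str('memory'), 0x02, 0x00, 0x01],      // memory, min 1 page
      [...str('env'), ...str('table'), 0x01, 0x70, 0x00, 0x01], // funcref table, min 1 slot
    ])),
    ...section(3, vec([[0x00]])),                       // one function, of type 0
    ...section(7, vec([[...str('add'), 0x00, 0x00]])),  // export function 0 as "add"
    ...section(10, vec([[...uleb(body.length), ...body]])),
  ]);
}

// Stand-ins for the Memory and Table owned by the original (AOT) module.
const memory = new WebAssembly.Memory({ initial: 1 });
const table = new WebAssembly.Table({ initial: 4, element: 'anyfunc' });

const jitModule = new WebAssembly.Module(buildAdderModule());
const jitInstance = new WebAssembly.Instance(jitModule, { env: { memory, table } });

// Publish the new function in a free table slot; that slot number is the "function pointer".
const slot = 2;
table.set(slot, jitInstance.exports.add);
const result = table.get(slot)(3, 4);
console.log('call through table slot', slot, '->', result);
```

In an Emscripten build the Memory and Table would be the runtime's own objects rather than fresh ones, and the slot would be allocated by the runtime instead of being picked by hand.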
Considering that my intended use case [1] involves dynamically generating tens of thousands of functions, having each of them as a separate module is probably not a good idea. In that case I'll wait for the WASM specification to include features that improve JIT compilation. Once those are available I'd be happy to help integrate them into Emscripten.
Also, thanks for pointing out Binaryen, it's really useful. I'll experiment with it a bit.
[1] My plan was to add a WASM backend to QEMU's binary translator, in order to use Unicorn.js at (hopefully) near-native speeds, avoiding the current 1000x slowdown.
I actually don't think even 10,000 modules is that bad :) Each would just contain 1 function plus an import for the table and an import for the memory, so compiling it is almost the same as just compiling the function. The main downside I can think of is that the VM would likely not use multiple CPU cores (which VMs can do if many functions are in a single module).
Over the weekend I actually did a little proof of concept of this: I got the pypy.js JIT to emit wasm. It seems to work as expected (however, it was just a quick hack, see details there).
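To make the "one function per module" shape concrete, here is a rough sketch that generates many tiny modules in a loop, each importing the shared memory and table and each appended to the table as it is created. Everything in it is an assumption for illustration: the helper names, the module count, and the generated function itself (`f(x) = x + k`). It makes no claim about compile-time cost in any particular engine.

```javascript
// Unsigned and signed LEB128 encoders for the wasm binary format.
function uleb(n) {
  const out = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80;
    out.push(byte);
  } while (n !== 0);
  return out;
}
function sleb(n) {
  const out = [];
  for (;;) {
    const byte = n & 0x7f;
    n >>= 7;
    const done = (n === 0 && (byte & 0x40) === 0) || (n === -1 && (byte & 0x40) !== 0);
    if (done) { out.push(byte); return out; }
    out.push(byte | 0x80);
  }
}
const str = (s) => [...uleb(s.length), ...[...s].map((c) => c.charCodeAt(0))];
const vec = (items) => [...uleb(items.length), ...items.flat()];
const section = (id, body) => [id, ...uleb(body.length), ...body];
const I32 = 0x7f;

// One module = one function f(x) = x + k, plus the two imports.
function buildAddConstModule(k) {
  const body = [0x00, 0x20, 0x00, 0x41, ...sleb(k), 0x6a, 0x0b]; // local.get 0; i32.const k; i32.add; end
  return new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
    ...section(1, vec([[0x60, ...vec([[I32]]), ...vec([[I32]])]])), // type 0: (i32) -> i32
    ...section(2, vec([
      [...str('env'), ...str('memory'), 0x02, 0x00, 0x01],      // memory, min 1 page
      [...str('env'), ...str('table'), 0x01, 0x70, 0x00, 0x00], // funcref table, min 0 slots
    ])),
    ...section(3, vec([[0x00]])),
    ...section(7, vec([[...str('f'), 0x00, 0x00]])),
    ...section(10, vec([[...uleb(body.length), ...body]])),
  ]);
}

const memory = new WebAssembly.Memory({ initial: 1 });
const table = new WebAssembly.Table({ initial: 0, element: 'anyfunc' });
const COUNT = 1000; // arbitrary number of "JIT-compiled" functions

for (let k = 0; k < COUNT; k++) {
  const module = new WebAssembly.Module(buildAddConstModule(k));
  const instance = new WebAssembly.Instance(module, { env: { memory, table } });
  const index = table.grow(1); // grow() returns the previous length, i.e. the new slot
  table.set(index, instance.exports.f);
}
console.log('table now holds', table.length, 'functions');
```

Each iteration compiles synchronously on the calling thread, which matches the single-core downside mentioned above.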
@kripken Thanks for your proof-of-concept. After taking a break from this issue over the past months, I've gone back to it these past days and attempted to create a minimal self-contained example to illustrate your suggestions above:
The current implementation looks like: https://gist.github.com/AlexAltea/daf4819856a3f47a58e2a1588dbb1ed5 However there have been two issues so far:
Judging by your initial explanation, there's nothing in the snippet that strikes me as wrong. Is there anything that I might have overlooked? Thank you. PS: Note that for the sake of simplicity, I've decided to go with
Why not, what error does it hit? (same as in the second point?) You might be hitting a bug in a VM - worth testing in multiple VMs, and latest versions, as bugs may have been fixed. If that's not it, then perhaps you do have the signature wrong? (Calling from JS is more permissive as it will add/remove params, etc.) If that's not it either, if you create a full testcase I can take a look at it.
@kripken Nope, the 1st issue (AOT-to-JIT indirect calls) is that it seems to call a "different" function than it's supposed to (in earlier tests I also got a signature mismatch, IIRC). I've updated the test to show this issue: https://gist.github.com/AlexAltea/daf4819856a3f47a58e2a1588dbb1ed5.
// Link module
uint32_t adder_wrapper_index = EM_ASM_INT({
    // View the JIT-generated binary inside the AOT module's linear memory
    var jit_binary = new Uint8Array(wasmMemory.buffer, $0, $1);
    var jit_module = new WebAssembly.Module(jit_binary);
    // Instantiate it against the same Memory and Table as the AOT module
    var jit_instance = new WebAssembly.Instance(jit_module, {
        env: {
            memory: wasmMemory,
            table: wasmTable,
        }
    });
    console.log('WASM Table length: ' + wasmTable.length);
    var adder_wrapper = jit_instance.exports["adder_wrapper"];
    // Register the export in the table; the returned index acts as a function pointer
    var adder_wrapper_index = addWasmFunction(adder_wrapper);
    console.log('WASM Table length: ' + wasmTable.length);
    return adder_wrapper_index;
}, result.binary, result.binaryBytes);
printf("adder_wrapper_index: %u\n", adder_wrapper_index);
uint32_t res0 = ((uint32_t(*)(uint32_t,uint32_t))adder_wrapper_index)(3, 4); // This calls the wrong function!
printf("result #0: %u\n", res0); // Should be 7 (but is 4, why?)
This JIT-module imports env.memory and env.table, and contains just:
(module
  (type $iii (func (param i32 i32) (result i32)))
  (export "adder_wrapper" (func $adder_wrapper))
  (func $adder_wrapper (; 0 ;) (; has Stack IR ;) (type $iii) (param $0 i32) (param $1 i32) (result i32)
    (call_indirect (type $iii)
      (local.get $0)
      (local.get $1)
      (i32.const 133) ;; function pointer to `adder`, defined in the AOT-module as:
                      ;; uint32_t adder(uint32_t a, uint32_t b) { return a + b; }
    )
  )
)
The output is:
This is the case in the latest versions of both Chrome and Firefox, so I doubt it's a browser bug, but I cannot spot any potential API misuse either.
@kripken Regarding the 2nd issue (JIT-to-AOT indirect calls), the signature seems to be correct, the AOT function is defined as:
and the type passed to BinaryenCallIndirect is defined as:
I don't understand what might be failing here. If you don't see any other reason for this, I'm afraid I'll need to attach a debugger to my browser's WASM engine...
Calling a different function and calling with the wrong signature are probably the same issue. How are you compiling that source file? I'd expect problems with asm.js function tables if it's using asm2wasm - since asm.js function tables add offsets to pointers. In that case, building with
@kripken I'm not using asm2wasm, but just Binaryen to generate the JIT-module on the fly (see the snippet). The AOT-module that generates it is just 80 lines of C, built with Emscripten. I've built V8/D8 and found the root cause for the issues with JIT-to-AOT indirect calls:
Here, This issue was reported in WebAssembly/design#452, and fixed in WebAssembly/design#682, which requires indirect call signature checks to be structural rather than nominal. Browsers still seem to do nominal checks. Note that while I'm using V8's WebAssembly interpreter (it makes debugging easier), I also replicated this behavior on Firefox and Chrome.
@kripken What's your take on this? Is this indeed a browser bug to report? Thanks a lot for your time. :-)
PS: Sorry for using the Emscripten issue tracker for this, I didn't expect the rabbit hole to go this far.
Oh, I saw Yeah, those checks should be structural I believe. Perhaps the best thing is to open an issue on the wasm design repo, if you see that browsers do not actually obey this. Might be worth looking in the wasm spec test suite first to see if this is tested or not, and reporting that in the issue - if not, creating a small testcase for them would be good.
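A small testcase for the structural rule could look roughly like the sketch below. It is not the testcase from this thread. It builds two hand-encoded modules: a stand-in "AOT" module that declares `(i32, i32) -> i32` as type 0, and a stand-in "JIT" module that declares the same signature as type 1 (after a dummy type) and calls through the shared table with `call_indirect`. If the engine compares signatures structurally, the call should succeed even though the type indices differ, and it should trap when the slot holds a function with a different signature. All helper names and the slot number are assumptions for this example.

```javascript
function uleb(n) {
  const out = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80;
    out.push(byte);
  } while (n !== 0);
  return out;
}
const str = (s) => [...uleb(s.length), ...[...s].map((c) => c.charCodeAt(0))];
const vec = (items) => [...uleb(items.length), ...items.flat()];
const section = (id, body) => [id, ...uleb(body.length), ...body];
const HEADER = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
const I32 = 0x7f;
const funcType = (params, results) =>
  [0x60, ...vec(params.map((p) => [p])), ...vec(results.map((r) => [r]))];
const code = (body) => [...uleb(body.length), ...body];
const exportFunc = (name, index) => [...str(name), 0x00, ...uleb(index)];

const SLOT = 5; // hypothetical "function pointer"; below 64 so it fits one LEB128 byte

// "AOT" module: adder is type 0 = (i32, i32) -> i32, ident is type 1 = (i32) -> i32.
const aotBytes = new Uint8Array([
  ...HEADER,
  ...section(1, vec([funcType([I32, I32], [I32]), funcType([I32], [I32])])),
  ...section(3, vec([[0x00], [0x01]])),
  ...section(7, vec([exportFunc('adder', 0), exportFunc('ident', 1)])),
  ...section(10, vec([
    code([0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b]), // local.get 0; local.get 1; i32.add; end
    code([0x00, 0x20, 0x00, 0x0b]),                   // local.get 0; end
  ])),
]);

// "JIT" module: the same (i32, i32) -> i32 signature sits at type index 1 here.
const jitBytes = new Uint8Array([
  ...HEADER,
  ...section(1, vec([funcType([], []), funcType([I32, I32], [I32])])),
  ...section(2, vec([[...str('env'), ...str('table'), 0x01, 0x70, 0x00, 0x01]])),
  ...section(3, vec([[0x01]])),
  ...section(7, vec([exportFunc('adder_wrapper', 0)])),
  ...section(10, vec([
    // local.get 0; local.get 1; i32.const SLOT; call_indirect (type 1) (table 0); end
    code([0x00, 0x20, 0x00, 0x20, 0x01, 0x41, SLOT, 0x11, 0x01, 0x00, 0x0b]),
  ])),
]);

const table = new WebAssembly.Table({ initial: 8, element: 'anyfunc' });
const aot = new WebAssembly.Instance(new WebAssembly.Module(aotBytes));
const jit = new WebAssembly.Instance(new WebAssembly.Module(jitBytes), { env: { table } });

// Matching signature at a different type index: allowed under structural checks.
table.set(SLOT, aot.exports.adder);
const sum = jit.exports.adder_wrapper(3, 4);

// Different signature in the same slot: call_indirect is expected to trap.
table.set(SLOT, aot.exports.ident);
let trapped = false;
try {
  jit.exports.adder_wrapper(3, 4);
} catch (e) {
  trapped = e instanceof WebAssembly.RuntimeError;
}
console.log({ sum, trapped });
```

Note that the arguments are pushed first and the table index last, which is the operand order `call_indirect` expects.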
@kripken Thank you, as you mentioned, the The minimal working version of my proof-of-concept is here: Can be compiled via:
PS: Feel free to close this issue if you want! (Not sure how far proper WASM JIT features are from making it into the specification.)
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.
This is a question or feature request rather than an actual issue. I've made sure to search thoroughly for similar issues beforehand, both on GitHub and Google, but I have found no relevant results.
I'm interested in generating WebAssembly from C/C++ code and executing it in a seamless manner, i.e. using just function pointers. Let me explain this with an example:
On native targets it is possible to JIT-compile code like this:
Question is: Is there something similar we could use from C/C++ code when compiling to WebAssembly?
I'm thinking of something like this:
The reason I'm asking for this is that the only binary translators that can be compiled with Emscripten are interpreters, which are known for heavy performance penalties (ranging from 20x to 100x). By having an interface to create WebAssembly functions and seamlessly execute them, we could add a WebAssembly backend to many binary translators and have near-native performance.
As a further example: This is already possible with JavaScript (i.e. eval'ing strings) and some projects use that for performance reasons, e.g. see jslm32. I wonder whether the WebAssembly specification is ready for such a feature. To be honest I'm not that familiar with it (maybe @kripken can comment about it?).
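As a rough illustration of the eval-style approach in JavaScript (this is not how jslm32 itself is implemented, and the toy instruction set is invented for the example), a translator can turn a guest program into JavaScript source once and hand it to the engine's own JIT via `new Function`, instead of dispatching on every instruction at run time:

```javascript
// Hypothetical guest program: [opcode, immediate] pairs acting on one accumulator.
const program = [['add', 5], ['mul', 3], ['sub', 1]];

// Baseline: a plain interpreter that dispatches on every instruction.
function interpret(prog, acc) {
  for (const [op, n] of prog) {
    if (op === 'add') acc = (acc + n) | 0;
    else if (op === 'mul') acc = (acc * n) | 0;
    else if (op === 'sub') acc = (acc - n) | 0;
    else throw new Error('unknown opcode ' + op);
  }
  return acc;
}

// "JIT": translate the program to JavaScript source and compile it once.
const OPERATORS = new Map([['add', '+'], ['mul', '*'], ['sub', '-']]);
function translate(prog) {
  const lines = prog.map(([op, n]) => {
    const operator = OPERATORS.get(op);
    // Only emit code for known opcodes and integer immediates.
    if (operator === undefined || !Number.isInteger(n)) {
      throw new Error('cannot translate ' + op + ' ' + n);
    }
    return `acc = (acc ${operator} ${n}) | 0;`;
  });
  return new Function('acc', `${lines.join('\n')}\nreturn acc;`);
}

const compiled = translate(program);
console.log(interpret(program, 2), compiled(2));
```

The request in this issue is for the equivalent capability with WebAssembly as the generated code instead of JavaScript source.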