[js-api] Allow JS functions to be directly added to tables via table.set
#1408
Comments
It's a good question. The last time I thought carefully about the implementation, it seemed like there were a few hard tradeoffs: The easiest implementation I can think of is for each The alternative is to try to do everything in a generic JS thunk that can be called by So it would be a question for JS engines as to whether these tradeoffs were worth the benefit. |
When you say it "adds extra code to every call_indirect", you don't mean adding a check that would run for every call_indirect, right? The "am I calling untyped JS?" check would only happen in the case where the normal signature check fails, right? (The case that currently just traps.) |
@sbc100 Yes, good clarification. |
Given this should have basically zero performance overhead for wasm-to-wasm indirect calls I think it seems like a reasonable change. Making this change would eliminate the horrible hack that emscripten has to do today as well as being more flexible and simple and more compact than the proposed |
I'm not sure I follow. What concrete Wasm type would be assigned to such a function? Where would you derive it from? When supplying a JS function for an import, that type is derived from the import description. But no equivalent exists for table.set, because tables are just defined to be generic funcref. It almost sounds like you are envisioning that we extend core Wasm with a new kind of function value that has no Wasm type -- i.e., we would bring untyped functions into Wasm itself, which behave polymorphically in call_indirect. But it's not clear how such a function reference would behave in other contexts than call_indirect (for example, with casts as under the GC proposal). This looks like a serious can of worms to me. I'm rather skeptical that the benefit over WA.Function is large enough to justify opening it. Having untyped values fundamentally violates the spirit of Wasm being typed and may have all sorts of nasty consequences downstream. |
Yes, I was imagining as you describe: a table slot containing a polymorphic JS function that would never result in a run-time type check failure when called indirectly. It's useful for dynamic linking where the signature of the final wasm function that will live in the slot is not known up front. A JS shim function can trigger the loading of a shared library and propagate all arguments to the loaded wasm function (and replace itself in the table). Without this the (lazy loading) dynamic linker needs to know not just names but also function signatures. Today this works fine for directly imported functions (we can supply a polymorphic shim for all function imports) but it does not work for functions imported by address only (table slots). |
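The self-replacing shim described above could look roughly like the following sketch. Everything here is hypothetical: `MockTable` is a plain-JS stand-in (a real `WebAssembly.Table` rejects JS functions today, which is the point of this issue), and `loadSymbol` and `installLazySlot` are made-up names used only for illustration.

```javascript
// Stand-in for a table whose slots may hold arbitrary callables.
// This is NOT WebAssembly.Table; it only models the proposed behaviour.
class MockTable {
  constructor(size) {
    this.slots = new Array(size).fill(null);
  }
  get(index) {
    return this.slots[index];
  }
  set(index, fn) {
    this.slots[index] = fn;
  }
  // Models call_indirect on a slot that never fails the signature check.
  callIndirect(index, ...args) {
    return this.slots[index](...args);
  }
}

let loads = 0;

// Hypothetical loader: pretends to load a shared library and look up a symbol.
function loadSymbol(name) {
  loads += 1;
  const library = { add: (a, b) => a + b };
  return library[name];
}

// Installs a shim that loads the real target on first call,
// replaces itself in the table, and forwards all arguments.
function installLazySlot(table, index, name) {
  table.set(index, (...args) => {
    const real = loadSymbol(name);
    table.set(index, real);
    return real(...args);
  });
}

const table = new MockTable(1);
installLazySlot(table, 0, "add");
const first = table.callIndirect(0, 2, 3); // goes through the shim
const second = table.callIndirect(0, 10, 20); // hits the loaded function directly
console.log(first, second, loads);
```

Note that the shim never needs to know the signature of `add`; it only forwards whatever arguments it receives.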
That's true, but given the js-types proposal, can't the linker read it off directly from the import descriptions along with the names? And then just apply WA.Function to the shim? The problem with what you propose is that it is not just an extension to the JS API. It is observable from within Wasm if there exists a function object that can be successfully called with different types. So this would be a notable extension to core Wasm itself that requires a spec change and affects non-JS embeddings. We'd be leaking a JS-ism into Wasm's semantics, something we tried to avoid so far. And there are consequences. In terms of the GC proposal, call_indirect is really just the optimised composition of table.get, cast, and call_ref. But what would happen if you used these operations separately? For example,
Imagine somebody stuffs a raw JS function into table slot 0, then calls func1 then func2. To be coherent with call_indirect and such a function's nature, this should succeed. But how? Either every call_ref would have to make an extra case distinction for untyped functions (which defeats the purpose of typed function references), or a cast would have to allocate wrapper functions (but a cast is not supposed to change the identity of a reference). Or untyped functions would match concrete types only in call_indirect, not anywhere else, but that would be rather odd from a type system perspective -- worse, it would make the dynamic linking mechanism incompatible with forming typed references inside the module. |
So SpiderMonkey already does not implement |
The problem occurs when a module only imports the address of a function as an edit: I could work around this by adding an otherwise-unused import of the function itself, but that import would (currently at least) be DCE'd by binaryen's optimizer. |
Following up, we ran experiments in a language with the same interop challenge posed here. We compared our implementation using SpiderMonkey's callee-side approach for On the other hand, in the GC proposal we are observing that run-time casts of objects have quite noticeable overhead, which suggests that V8's caller-side approach to Now, with the callee-side approach, @lukewagner notes that you could have a single stub that would handle all So I believe we should be able to support this functionality with little-to-no overhead; in fact, there may not even be a performance advantage to using |
This would be super helpful if this was supported in For example, rather than an API that has a bunch of back-and-forth calls to and from wasm to iteratively build up a tuple, say, we could have a single host function like:
const makeTuple = new WebAssembly.Global(
{ value: "anyfunc" },
// JS function that makes the "tuple"
(...values) => values,
);
const instance = new WebAssembly.Instance(module, {
host: {
makeTuple,
},
});
Wasm functions then could call such a host function with whatever parameter count/types they want using
(module
(global $makeTuple (import "host" "makeTuple") funcref)
(table 1 funcref)
(elem (i32.const 0) (global.get $makeTuple))
;; type for making empty tuples
(type $make0Tuple (func (result externref)))
;; type for making 3-element tuple
(type $make3Tuple (func (param i32) (param i32) (param i32) (result externref)))
;; any types work fine
(type $makePair (func (param i32) (param f32) (result externref)))
(func (export "makeEmptyTuple") (result externref)
(call_indirect (type $make0Tuple) (i32.const 0))
)
(func (export "make123Tuple") (result externref)
(call_indirect (type $make3Tuple)
(i32.const 1)
(i32.const 2)
(i32.const 3)
(i32.const 0) ;; index in table of makeTuple
)
)
(func (export "makePair") (result externref)
(call_indirect (type $makePair)
(i32.const 42)
(f32.const 9999)
(i32.const 0) ;; index in table of makeTuple
)
)
)
This pattern would generalize to things like initializing records and such:
(module
;; ...initialize tables, global imports, etc similar to previous example
;; creates a record { x: 3, y: 5 }
(func (export "makePointRecord") (result externref)
(call_indirect (type $make2Record) ;; type of calling host.makeRecord with two pairs
(call_indirect (type $make2Tuple) ;; type of calling host.makeTuple with 2 items
(call_indirect (type $make1String) ;; type of calling host.makeString with 1 char
(i32.const 120) ;; "x"
(i32.const MAKE_STRING_INDEX)
)
(i32.const 3)
(i32.const MAKE_PAIR_INDEX)
)
(call_indirect (type $makePair)
(call_indirect (type $make1String)
(i32.const 121) ;; "y"
(i32.const MAKE_STRING_INDEX)
)
(i32.const 5)
(i32.const MAKE_PAIR_INDEX)
)
(i32.const MAKE_RECORD_INDEX)
)
)
)
where we provide:
const instance = new WebAssembly.Instance(module, {
host: {
makeRecord: (...pairs) => {
const record = { __proto__: null };
for (const [key, value] of pairs) {
record[key] = value;
}
return record;
},
makeTuple: (...values) => values,
makeString: (...codePoints) => String.fromCodePoint(...codePoints),
},
});
Obviously GC proposal (and new |
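As a rough check of what the variadic host functions above would produce, the nested calls from the record example can be replayed in plain JS without any Wasm involved. This is only a sketch: the `host` object mirrors the import object from the comment (using `String.fromCodePoint`), and the call order is an assumption about how the nested `call_indirect`s would evaluate.

```javascript
// Variadic host functions, as in the import object above.
const host = {
  makeRecord: (...pairs) => {
    const record = { __proto__: null };
    for (const [key, value] of pairs) {
      record[key] = value;
    }
    return record;
  },
  makeTuple: (...values) => values,
  makeString: (...codePoints) => String.fromCodePoint(...codePoints),
};

// Same shape as the "makePointRecord" function: two (key, value) pairs.
const point = host.makeRecord(
  host.makeTuple(host.makeString(120), 3), // 120 is the code point for "x"
  host.makeTuple(host.makeString(121), 5), // 121 is the code point for "y"
);
console.log(point.x, point.y);
```

Each host function accepts any number of arguments, which is what would let a single table slot serve call sites with different signatures.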
Rather than the somewhat "scary" idea of directly calling JS Functions, what if the JS API The cons I see to this approach would be the extra branching overhead per
const fn1 = console.log;
my_table.set(0, fn1);
const fn2 = my_table.get(0);
// fails assertion
assert(fn1 === fn2);
Regardless, this has the benefit of the JS consumer not needing to know the Wasm types ahead of time, while maintaining that engines get to add their JS<->Wasm conversion wrappers. Does this suit your use case? |
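The identity concern can be illustrated with a small plain-JS model. This is a sketch under assumptions: `WrappingTable` is a hypothetical stand-in whose `set` wraps every function, not the real `WebAssembly.Table`, and the wrapper here does no actual JS<->Wasm conversion.

```javascript
// Hypothetical table whose set() wraps plain JS functions.
class WrappingTable {
  constructor(size) {
    this.slots = new Array(size).fill(null);
  }
  set(index, fn) {
    // The wrapper is where an engine could place its JS<->Wasm conversions.
    this.slots[index] = (...args) => fn(...args);
  }
  get(index) {
    return this.slots[index];
  }
}

const myTable = new WrappingTable(1);
const fn1 = (x) => x * 2;
myTable.set(0, fn1);
const fn2 = myTable.get(0);

const sameIdentity = fn1 === fn2; // the wrapper is a different object
const sameBehaviour = fn1(21) === fn2(21); // but it forwards calls unchanged
console.log(sameIdentity, sameBehaviour);
```

In this model the function read back from the table behaves like the original but is not the same object, which is the trade-off the comment points out.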
Sadly this doesn't work since table slots today are Also, I'm not sure I agree it's "scary" to directly call JS functions... it's something we already do all the time when JS supplies imports to the wasm module. |
Update: ah, I just saw that in WebAssembly/js-types#16 after this issue... |
Even if we had a different table for every possible signature, the problem would then be more like: given a JS function, which table should I try to put it in? The point of this issue is that I want the JS function to act as a polymorphic anyfunc that will never trap when called using |
Understood. I didn't quite get at first that that was the intention. |
As of today JS functions can be directly supplied as imports, but they cannot be directly added to a table. table.set only accepts native WebAssembly functions. As of today, there is no way to convert JS functions to WebAssembly functions. An API for creating WebAssembly functions is proposed in https://github.com/WebAssembly/js-types, but that requires the caller to know the function signature ahead of time.

Rather than converting JS functions to WebAssembly functions for the purposes of adding them to tables, could we not simply allow JS functions to be added directly?
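The current restriction is easy to observe from JS. The following is a minimal sketch, assuming a runtime with the standard `WebAssembly` JS API (for example Node.js or a browser); `jsHandler` is just an illustrative name.

```javascript
// A funcref table with one slot ("anyfunc" is the legacy name for this element type).
const table = new WebAssembly.Table({ element: "anyfunc", initial: 1 });

// A plain JS function, not an exported WebAssembly function.
const jsHandler = (...args) => args.length;

let rejected = null;
try {
  table.set(0, jsHandler);
} catch (err) {
  // The JS API requires null or an exported WebAssembly function here.
  rejected = err;
}
console.log(rejected instanceof TypeError);
```

The same function would be accepted without complaint as an import, which is the asymmetry this issue is about.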
When supplied as imports, JS functions have universal polymorphic behaviour in that one can supply any JS function to any import, and indeed to all imports. No signature checking is done, and the provider of the function doesn't need to know the signature ahead of time. The number of arguments doesn't even need to match. This is a nice property to have in dynamic languages and in particular it makes lazy binding and dynamic linking easier.
For example, this property means we can use a Proxy object to resolve imports without being aware of the signature of an import:
https://github.com/emscripten-core/emscripten/blob/main/src/library_dylink.js#L470
A simplified version of this code allows us to use a single function to resolve symbols dynamically at runtime:
While these universal (I guess you could call them variadic?) functions work fine as function imports, they are not permitted by table.set. This means that when we do dynamic linking in emscripten today it's easy to do lazy binding of function imports, but lazy binding of function address imports is not possible, at least not without also knowing the signature of the function. I can't take the result of the makeHandler function and pass it to table.set.

To work around this limitation we are currently considering adding extra signature information in a custom section so that table addresses can be dynamically assigned before all modules in the graph are loaded.
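The lazy-binding pattern for function imports can be sketched as follows. This is not the emscripten code linked above, only an illustration under assumptions: the module bytes are hand-assembled for the example, and `lateSymbols` and the body of `makeHandler` are hypothetical.

```javascript
// Hand-assembled module, equivalent to:
//   (module
//     (import "env" "foo" (func $foo (param i32 i32) (result i32)))
//     (func (export "run") (result i32)
//       (call $foo (i32.const 1) (i32.const 2))))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x0b, 0x02, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, 0x60, 0x00, 0x01, 0x7f, // types
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x66, 0x6f, 0x6f, 0x00, 0x00, // import env.foo
  0x03, 0x02, 0x01, 0x01, // function section: "run" uses type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = func 1
  0x0a, 0x0a, 0x01, 0x08, 0x00, 0x41, 0x01, 0x41, 0x02, 0x10, 0x00, 0x0b, // code
]);

// Hypothetical symbol registry, filled in later to mimic a late-loaded library.
const lateSymbols = {};
const resolved = [];

// Returns a variadic stub; the real target is looked up on first call.
function makeHandler(name) {
  let target = null;
  return (...args) => {
    if (target === null) {
      target = lateSymbols[name];
      resolved.push(name);
    }
    return target(...args);
  };
}

// The Proxy hands out a stub for whatever name the module imports;
// no signature information is needed on the JS side.
const env = new Proxy({}, { get: (_target, name) => makeHandler(name) });
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes), { env });

lateSymbols.foo = (a, b) => a + b; // the "library" only becomes available now
const result = instance.exports.run();
console.log(result, resolved);
```

The stub returned by `makeHandler` is exactly the kind of function that works as an import but cannot currently be stored in a table slot.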
Is there any fundamental reason why we can't just do table.set(myHandler) and have that handler universally usable by call_indirect? It might even mean that the call_indirect could be slightly more efficient, since the signature check could be skipped (since JS functions can't/don't do signature checks, IIUC).