call_indirect #488

Closed
ghost opened this issue Dec 1, 2015 · 9 comments
@ghost

ghost commented Dec 1, 2015

There appears to be a deficiency in the current thinking around call_indirect:

A call_indirect specifies the expected signature of the target function with
an index into a signature table defined by the module. An indirect call to a
function with a mismatched signature causes a trap.
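
For readers who want something concrete, the rule quoted above can be modeled roughly as follows. This is only an illustrative sketch: `Trap`, `Instance`, and its `call_indirect` method are hypothetical names for this model, not a WebAssembly API. Signatures are modeled as tuples of type names, and treating an out-of-range table index as a trap is an assumption of the sketch.

```python
class Trap(Exception):
    """Models a WebAssembly trap (hypothetical name for this sketch)."""


class Instance:
    def __init__(self, signatures, table):
        # signatures: list of (param_types, result_type) tuples.
        # table: list of (signature_index, callable) entries.
        self.signatures = signatures
        self.table = table

    def call_indirect(self, expected_sig, table_index, *args):
        # Assumption of this sketch: an out-of-range index traps.
        if not 0 <= table_index < len(self.table):
            raise Trap("table index out of bounds")
        actual_sig, func = self.table[table_index]
        # The call site names the signature it expects; a mismatch traps.
        if self.signatures[actual_sig] != self.signatures[expected_sig]:
            raise Trap("indirect call signature mismatch")
        return func(*args)


SIGS = [(("i32",), "i32"), (("f64", "f64"), "f64")]
inst = Instance(SIGS, [(0, lambda x: x + 1), (1, lambda a, b: a * b)])
```

Calling `inst.call_indirect(0, 0, 41)` goes through, while `inst.call_indirect(0, 1, 1)` raises `Trap` because table entry 1 was declared with the other signature.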

The problems here are three-fold (at least):

  1. A higher-order function (where the arguments to the function can contain a function, or where the function returns a function) cannot be invoked. The reason is that a call_indirect assumes that the function to be called is in the same module as the caller. A higher-order function cannot know where its argument functions are located.
  2. A higher-order function will have a type signature that does not correspond with int32->int32 style types. Its arguments and/or its return type will not be a normal value.
  3. In a language with generics (which includes C++), the actual type of a function may not correspond literally with the call site and yet still be perfectly valid. (e.g., a function with type X->X can be invoked with integer arguments, float arguments, even function arguments).

BTW, this applies to C/C++ as well as to functional languages.

Given that the current function table approach is fixed, call_indirect must be extended to allow inter-module calls where the module of the callee is dynamically determined.

An alternative is to invent something along the lines of the MethodHandle from the JVM. (Except that a MethodHandle also embeds 'free variables', which is another issue with call_indirect.)

@ghost
Author

ghost commented Dec 2, 2015

Wasm is a primitive language and does not deal with higher-level language objects and abstractions directly, just as hardware (x86, ARM, SPARC, PPC, etc.) generally does not. For a dynamic language, the call will need to pass arguments on a stack implemented in the linear memory and pass the stack pointer(s) in the wasm arguments. The higher-level language compiler might be able to map some known fixed arguments to wasm fixed arguments.
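
A rough sketch of that calling convention, with a Python `bytearray` standing in for the linear memory. Everything here is an assumption made for illustration: the `STACK_BASE` address, the helper names `push_i32` and `sum_args`, and the choice of an upward-growing stack of little-endian 32-bit integers.

```python
import struct

memory = bytearray(64 * 1024)   # stand-in for wasm linear memory
STACK_BASE = 1024               # hypothetical address where the language's stack starts


def push_i32(sp, value):
    # Store a little-endian 32-bit integer at sp and return the new stack pointer.
    struct.pack_into("<i", memory, sp, value)
    return sp + 4


def sum_args(sp, argc):
    # Uniform signature (i32, i32) -> i32: the callee reads its real
    # arguments from the in-memory stack, just below the stack pointer.
    total = 0
    for i in range(argc):
        (value,) = struct.unpack_from("<i", memory, sp - 4 * (i + 1))
        total += value
    return total


sp = STACK_BASE
for v in (10, 20, 12):
    sp = push_i32(sp, v)
result = sum_args(sp, 3)
```

The point of the sketch is that every such function can share one wasm-level signature (stack pointer and argument count in, one value out), whatever its source-level arity.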

@sunfishcode
Member

Concerning 1., What WebAssembly calls modules may differ from what "modules" mean in other contexts. A WebAssembly module roughly corresponds with a fully linked executable file, including any static libraries linked into it.

Also, the scope of a call_indirect is the "instance", which is created by instantiating a module, and on top of that, dynamic linking is expected to be defined as importing additional modules into an existing instance. Consequently, functions anywhere in the main executable or in dynamic libraries would be callable from call_indirect. Many programs will only utilize a single instance.
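
As a sketch of what this means for point 1, here is a toy model in which linking a further module simply appends its functions to the one table owned by the instance. The `Instance` class and its `link` method are hypothetical names for this model only, and signature checking is left out to keep it short.

```python
class Instance:
    def __init__(self):
        self.table = []   # one indirect-call table shared by the whole instance

    def link(self, functions):
        # Append a module's functions to the shared table and return
        # the table index assigned to each function name.
        indices = {}
        for name, func in functions.items():
            indices[name] = len(self.table)
            self.table.append(func)
        return indices

    def call_indirect(self, index, *args):
        return self.table[index](*args)


inst = Instance()
# "Main executable": a higher-order function that takes a table index f.
main = inst.link({"apply": lambda f, x: inst.call_indirect(f, x)})
# "Dynamic library" linked into the same instance later.
lib = inst.link({"double": lambda x: 2 * x})
result = inst.call_indirect(main["apply"], lib["double"], 21)
```

Under this model, `apply` does not need to know which module its argument came from; it only needs an index into the instance's table.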

Lastly, there is also an open design issue about whether the semantic distinction of calling a function in another instance should be dropped: #421

Concerning 2., Higher-level language types will need to be lowered when translated into WebAssembly.

Concerning 3., C++ compilers fully resolve templates at compile time before producing wasm, so when wasm code is produced, instantiated function templates just look like regular statically resolved functions with statically typed arguments. Generics in languages that use dynamic resolution will need to be lowered into primitive operations and types in wasm.
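
To illustrate the compile-time case, here is a small model of template instantiation. It is a sketch, not how any particular compiler works: `instantiate_identity` is a hypothetical helper, and the mapping of C++ `int` and `double` to `i32` and `f64` is assumed for the example.

```python
def instantiate_identity(wasm_type):
    # Model of compile-time template instantiation: produce one concrete
    # function and one concrete signature per type the program uses.
    def identity(x):
        return x

    identity.__name__ = f"identity_{wasm_type}"
    signature = ((wasm_type,), wasm_type)
    return signature, identity


# The "compiler" sees identity<int> and identity<double> used in the source
# and emits two unrelated, fully typed functions.
module_functions = [instantiate_identity(t) for t in ("i32", "f64")]
signatures = [sig for sig, _ in module_functions]
names = [fn.__name__ for _, fn in module_functions]
```

By the time wasm is produced, the generic type X->X no longer exists; only the two concrete signatures do, so each call site can name the exact one it expects.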

@ghost
Author

ghost commented Dec 2, 2015

Well, this

A WebAssembly module roughly corresponds with a fully linked executable file, including any static libraries linked into it.

raises other issues. Nearly every web site has at least (!) one copy of jQuery in it. The browser mitigates the effect of this by caching. However, if a webasm program is expected to be fully linked prior to loading, then there is likely to be a lot of redundant loading, together with reduced caching effectiveness.

@ghost
Author

ghost commented Dec 2, 2015

Currently, the spec of a function type signature does not permit the arguments to be properly typed. A type signature which states integer or blob does not support verification of code. Given that programs are going to be executed by innocent users I would have thought that code verification should be high on the list of priorities for this group.

@ghost
Author

ghost commented Dec 2, 2015

@fmccabe Dynamic linking seems to be a matter still under discussion. The wasm function arguments are 'properly typed' and verified from the perspective that is currently intended. It's not intended to 'support verification of code' at the level of implicitly checking ranges on integers or even declaring they are signed or unsigned.

A higher-level language could emit explicit range checks to add such checking. I have been exploring taking advantage of this by supporting type derivation in the deployment language, but there has been no support for this, not even for improving it in JS for asm.js! I would support adding operations that improve the encoding efficiency and readability of range checks that a higher-level language might emit for its own safety; this could be justified in terms of improving the encoding efficiency of very common patterns. At the text source code level, these might be cleaner to add as declared types for function arguments, and perhaps a function result type, and types for expression values and function and block local variables.
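
A minimal sketch of the kind of explicit check a compiler for a higher-level language might emit. The names `Trap`, `check_range`, and `set_percent`, and the 0..100 range, are made up for the example.

```python
class Trap(Exception):
    """Models a trap raised by an emitted check (hypothetical name)."""


def check_range(value, lo, hi):
    # Emitted by the source-language compiler, not implied by the wasm type:
    # at the wasm level the argument is just an i32.
    if not lo <= value <= hi:
        raise Trap(f"value {value} outside {lo}..{hi}")
    return value


def set_percent(p):
    # Source-level type: integer in 0..100. Wasm-level type: i32.
    p = check_range(p, 0, 100)
    return p
```

Here `set_percent(50)` returns 50, while `set_percent(101)` raises `Trap` before the function body proper would run.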

@titzer

titzer commented Dec 3, 2015

We considered several alternatives for indirect calls, including adding a
new set of static function types that could be used as local variables,
parameters, and return types. The main reason we've opted for an indirect
call table with a dynamic check (so far) is to avoid a big bump in the
complexity of type checking. Another problem is that function pointers need
to be "handlified" to an integer value in order to be stored in the linear
memory. That operation is essentially equivalent to the indirection of the
table.

I think we will want to have fully statically typed function references in
the future, so that indirect calls can actually be statically checked for
signature matching, and no dynamic check is necessary.

One big advantage of the current scheme is that the indirect call table can
be used to implement vtables in the WASM engine's trusted space, and thus
cannot be damaged by a program bug, since it is not stored in the linear
memory. Handlifying statically typed function pointers to integers and
storing vtables in the linear memory doesn't have that property.
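
A toy model of that property, under stated assumptions: a `bytearray` stands in for the linear memory, an immutable tuple outside it stands in for the engine's table, and `OBJ`, `call_method`, and `Trap` are hypothetical names. The object stores only an integer handle (a table index) in memory.

```python
import struct


class Trap(Exception):
    """Models a WebAssembly trap (hypothetical name for this sketch)."""


memory = bytearray(256)   # untrusted: the program can scribble anywhere in here

# Trusted: lives outside linear memory, so stray writes cannot reach it.
table = (("speak", lambda: "woof"),
         ("speak", lambda: "meow"))

OBJ = 16   # hypothetical address of an object whose first word is a table index
struct.pack_into("<I", memory, OBJ, 0)


def call_method(obj_addr, expected_sig):
    # Load the integer handle from memory, then dispatch through the table.
    (index,) = struct.unpack_from("<I", memory, obj_addr)
    if index >= len(table):
        raise Trap("table index out of bounds")
    sig, func = table[index]
    if sig != expected_sig:
        raise Trap("signature mismatch")
    return func()


before = call_method(OBJ, "speak")

# A program bug overwrites the handle stored in linear memory.
struct.pack_into("<I", memory, OBJ, 0xDEADBEEF)
try:
    call_method(OBJ, "speak")
    outcome = "called"
except Trap:
    outcome = "trapped"
```

Note the limit of the guarantee in this model: a corrupted handle can still select a different valid entry with the same signature, but it cannot alter the table itself or jump to an arbitrary address.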

On Wed, Dec 2, 2015 at 8:22 PM, Frank McCabe wrote:

Currently, the spec of a function type signature does not permit the
arguments to be properly typed. A type signature which states integer or
blob does not support verification of code. Given that programs are going
to be executed by innocent users I would have thought that code
verification should be high on the list of priorities for this group.


@kg
Contributor

kg commented Dec 3, 2015

@sunfishcode

Concerning 1., What WebAssembly calls modules may differ from what "modules" mean in other contexts. A WebAssembly module roughly corresponds with a fully linked executable file, including any static libraries linked into it.

Also, the scope of a call_indirect is the "instance", which is created by instantiating a module, and on top of that, dynamic linking is expected to be defined as importing additional modules into an existing instance. Consequently, functions anywhere in the main executable or in dynamic libraries would be callable from call_indirect. Many programs will only utilize a single instance.

FWIW, most of this is violated by after-main dynamic linking and by JIT. A model where libraries cannot be demand-loaded at runtime and JITs cannot create new function signatures is not a model that will successfully host most JITs or applications that rely on plugin infrastructure.

The "fully linked executable file" description only holds until we get around to implementing dynamic linking, so we shouldn't design around it. It's a temporary simplification.

@sunfishcode
Member

@kg Are there plans to make "after-main" dynamic linking work differently than regular dynamic linking in a way that affects what call_indirect can call?

@sunfishcode
Member

Closing, as the original questions in this issue appear to be answered. If there are further questions or concerns, please file a new issue. Thanks!
