-
Notifications
You must be signed in to change notification settings - Fork 695
Indirect call signature checks shouldn't be nominal #452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Dynamic linking can be made to work with the current design by adding a label to signatures in the modules' respective signature tables and checking label equivalence across modules. That kind of punts the problem to user space, admittedly. There was a significant use case envisioned that labels are more fine-grained than structure signature checks, so, e.g. a C++ compiler can also enforce a secondary security property such as a call to a vtable method has to not only be the same signature but from the same method family. The implication of not having a signature table is that engines have to canonicalize signatures and internally generate the table for fast dynamic checking. |
I'm not saying there shouldn't be a signature table, just that it's not necessary to define the semantics of signature checking in terms of table indices. It's necessary right now to distinguish between identical signatures that are in the signature table multiple times, but I don't think that's desirable. With or without explicit tables, linking two modules needs to canonicalize signatures. I can see why it's useful to have a label to distinguish function types that are otherwise identical, but that implies there is some canonicalization that needs to make that distinction. |
@titzer Can we take it that this 'significant use case' is no longer being requested to be accommodated via specific runtime support? If not then can someone please elaborate on the requirements? I think it would be much cleaner to make these explicit arguments, rather than implicit via the signature tables, so that they are not conflated with function indexes. So the function signature might have as one argument a runtime_managed_index that could not be forged, could not be accessed by the wasm code at runtime, just generated as an argument as a call site, or tested in the callee by reference to this argument in a dedicated test operation. Not sure how you would want them managed, but perhaps someone understanding the requirements could work it out. |
I think it's really attractive to have an explicit signature table at the binary encoding level and it's actually more work to prevent duplicates while at the same time ruling out finer-grained-than-signature-match CFI, so I think it makes sense to represent the signature table explicitly in the AST/semantics. However, the common case of dynamic linking will want structural unification. An alternative to labeling (which requires the main module and dylib to agree on a labeling scheme which thus becomes part of the ABI) is to allow the elements of the signature table of dylibs to be marked as "canonicalized" which means they are unified with matches in the main module's sig table. This should preserve the expressive power while keeping the common case simple. It also requires no cooperation between main and dylib modules. |
I agree.
I don't think that's true if you have to support dynamic linking, which implies finding identical signatures somewhere in the loader.
I don't see how you would implement such a CFI system without affecting the ABI.
What if the main module sig table contains multiple signatures that match it; which one is it unified with? What about dynamic libraries calling functions defined in another dynamic library rather than the main module? |
Ah, that's a good point: if one wanted to support dynamic linking and the fancy CFI, you would need labels to link signatures point-to-point as @titzer suggested above; the unification can't just pick one. But if labels is the only way to achieve the 99% use case of just unifying same-type signatures, then that's a pretty big wart added to the ABI (and perhaps binary size) to support this fancy-CFI corner case. Now we could just keep the door open and start with:
and leave it as a future feature to support labeling (which would support fancy-CFI + dynamic linking). The question is whether we'd later come to regret allowing duplicates, so maybe a more conservative route would be to disallow dups initially (but keep the explicit table) and wait to build up a concrete use case for allowing dups. |
One obvious solution would be to treat the types not as nominal definitions but as structural definitions. That is, duplicates are fine, but they define the same type. Implementations could do some normalisation of type references at module instantiation time to still keep actual runtime type tests a simple comparison. This would extend relatively easily to dynamic linking later. Or is that the same you are suggesting, @lukewagner? |
@rossberg-chromium The issue is that if we force normalization, then it locks out this use case of intentionally having duplicates for the purposes of finer-grained CFI (concrete example: you have a separate signature index for |
Yeah, I'm not sure it's a low-level type system's role to enforce such nominal typing guarantees. And even if so, making function signatures nominal seems the wrong approach -- in your example, |
@rossberg-chromium I agree it's not the low-level type system's role to enforce; this was more like a raw tool to provide to the compiler to use as it chooses. So any disagreement with changing the MVP to disallow duplicates in the signature table? One could also imagine that we take out the sig table entirely and embed the signatures in all uses, but this seems strictly worse from a size/decoding pov. |
Disallowing duplicates sounds fine. Might be a bit of a nuisance for both producers and consumers, but nothing dramatic, I'd think. |
This is meant to implement the changes discussed in WebAssembly/design#452. This eliminates the need to declare an explicit type reference for indirectly-called functions, so I also tried to eliminate the redundancy and potential mismatch between a function's type reference and its directly declared parameters and result, so I made any function declaration implicitly introduce a type matching its directly declared parameters and result. But this approach makes an explicitly indexed type table much less useful, since you no longer have complete control over its contents. To resolve that, I removed the declarations, and references to, type table entries. The type table is created implicitly from the set of unique types used by func and call_indirect. I also made the syntax slightly more uniform by defining a func_type production in the form (func_type ...), and using it in both call_indirect and import. I did not change func to use it, but I will submit another PR if people like this change.
This is meant to implement the changes discussed in WebAssembly/design#452. This eliminates the need to declare an explicit type reference for indirectly-called functions, so I also tried to eliminate the redundancy and potential mismatch between a function's type reference and its directly declared parameters and result, so I made any function declaration implicitly introduce a type matching its directly declared parameters and result. But this approach makes an explicitly indexed type table much less useful, since you no longer have complete control over its contents. To resolve that, I removed the declarations, and references to, type table entries. The type table is created implicitly from the set of unique types used by func and call_indirect. I also made the syntax slightly more uniform by defining a func_type production in the form (func_type ...), and using it in both call_indirect and import. I did not change func to use it, but I will submit another PR if people like this change.
This is meant to implement the changes discussed in WebAssembly/design#452. This eliminates the need to declare an explicit type reference for indirectly-called functions, so I also tried to eliminate the redundancy and potential mismatch between a function's type reference and its directly declared parameters and result, so I made any function declaration implicitly introduce a type matching its directly declared parameters and result. But this approach makes an explicitly indexed type table much less useful, since you no longer have complete control over its contents. To resolve that, I removed the declarations, and references to, type table entries. The type table is created implicitly from the set of unique types used by func and call_indirect. I also made the syntax slightly more uniform by defining a func_type production in the form (func_type ...), and using it in both call_indirect and import. I did not change func to use it, but I will submit another PR if people like this change.
This is meant to implement the changes discussed in WebAssembly/design#452. This eliminates the need to declare an explicit type reference for indirectly-called functions, so I also tried to eliminate the redundancy and potential mismatch between a function's type reference and its directly declared parameters and result, so I made any function declaration implicitly introduce a type matching its directly declared parameters and result. But this approach makes an explicitly indexed type table much less useful, since you no longer have complete control over its contents. To resolve that, I removed the declarations, and references to, type table entries. The type table is created implicitly from the set of unique types used by func and call_indirect. I also made the syntax slightly more uniform by defining a func_type production in the form (func_type ...), and using it in both call_indirect and import. I did not change func to use it, but I will submit another PR if people like this change.
My PR meant to address this has gotten a little stale, so I wanted to revisit this thread and try to figure out how to move forward on the issue. While implementing the PR, I ended up eliminating the explicit signature table from the text format. The problem was that not only explicit An alternate solution would be to require explicit signature table entries to be declared before any functions, and to require functions to reference one of those explicit signature table entries. That would correspond to the binary format closely, but would be pretty obtuse for a text format. |
This is meant to implement the changes discussed in WebAssembly/design#452. This eliminates the need to declare an explicit type reference for indirectly-called functions, so I also tried to eliminate the redundancy and potential mismatch between a function's type reference and its directly declared parameters and result, so I made any function declaration implicitly introduce a type matching its directly declared parameters and result. But this approach makes an explicitly indexed type table much less useful, since you no longer have complete control over its contents. To resolve that, I removed the declarations, and references to, type table entries. The type table is created implicitly from the set of unique types used by func and call_indirect. I also made the syntax slightly more uniform by defining a func_type production in the form (func_type ...), and using it in both call_indirect and import. I did not change func to use it, but I will submit another PR if people like this change.
With #682, indirect call signature checks are structural, rather than nominal, due to the problem of comparing type indices across modules. Consequently, this issue is now fixed! |
While debugging a signature mismatch exception thrown on indirect calls between different modules, see emscripten-core/emscripten#7082 (comment), it appears that major browsers are still doing nominal signature checks on indirect calls, rather than structural checks. I'm aware this repository focuses on specifications, not implementations, but before opening issues in the respective browsers I wanted to confirm that this is indeed an implementation issue (cc. @lukewagner). Basically, the implementation used in the V8 WASM Interpreter (made debugging easier for me), which is also followed by the WASM JIT's in Chromium/Firefox performs the following check:
The issue was triggered during an indirect call from module B to module A. Here, Could you please confirm me whether this behavior goes against the specifications? |
Signatures are definitely matched structurally, even cross-module, and there are a number of tests for this. Especially if you're seeing the same behavior on both Firefox and Chrome, I'd look more closely at the source module to check that the structural signatures are indeed the same on both sides. I can't confidently speak to the V8 interpreter snippet you're looking at, but if it's anything like FF's impl, "sig_id" is going to be a post-structural-unification value, not the original type section index. |
Here's an example showing cross-module function instance(bytes) {
return new WebAssembly.Instance(new WebAssembly.Module(bytes));
}
/*
(type (;0;) (func))
(type (;1;) (func (param i32 i32) (result i32)))
(func (;0;) (type 1) (param i32 i32) (result i32)
local.get 0
local.get 1
i32.add)
(export "add" (func 0))
*/
const adder = instance(new Uint8Array([
0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, 0x01, 0x0a, 0x02,
0x60, 0x00, 0x00, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, 0x03, 0x02,
0x01, 0x01, 0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b
]));
/*
(type (;0;) (func (param i32 i32) (result i32)))
(func (;0;) (type 0) (param i32 i32) (result i32)
local.get 0
local.get 1
i32.const 0
call_indirect (type 0))
(table (;0;) 1 funcref)
(export "table" (table 0))
(export "call_indirect" (func 0))
*/
const caller = instance(new Uint8Array([
0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, 0x01, 0x07, 0x01, 0x60,
0x02, 0x7f, 0x7f, 0x01, 0x7f, 0x03, 0x02, 0x01, 0x00, 0x04, 0x04, 0x01,
0x70, 0x00, 0x01, 0x07, 0x19, 0x02, 0x05, 0x74, 0x61, 0x62, 0x6c, 0x65,
0x01, 0x00, 0x0d, 0x63, 0x61, 0x6c, 0x6c, 0x5f, 0x69, 0x6e, 0x64, 0x69,
0x72, 0x65, 0x63, 0x74, 0x00, 0x00, 0x0a, 0x0d, 0x01, 0x0b, 0x00, 0x20,
0x00, 0x20, 0x01, 0x41, 0x00, 0x11, 0x00, 0x00, 0x0b
]));
const add = adder.exports.add;
// Direct call.
console.log(`call: 3 + 4 => ${add(3, 4)}`);
// Indirect call.
caller.exports.table.set(0, add);
console.log(`call_indirect: 3 + 4 => ${caller.exports.call_indirect(3, 4)}`); |
@lukewagner @binji Thanks for your feedback. Indeed, the underlying issue was caused by Emscripten: the generated code had a mismatch between the reported function pointer and the index of said function in the table (apparently due to Sorry for the inconvenience and thanks for providing that minimal example, it helped tracking the issue! |
The design and spec interpreter use an explicit signature table to check
call_indirect
signatures. This is effectively a nominal type system for function pointers: you can have two signature table entries with the same parameters and return type, and they are distinct types as far ascall_indirect
is concerned.This rules out calling a pointer to a function in a different module, since you can't refer to the target module's signature for the function, and in general doesn't work with dynamic linking.
I think that
call_indirect
signature checks should be defined structurally: that if acall_indirect
expects the same parameter types and return type as the function identified by its dynamic index, the signature check should succeed.The signature table makes sense from an encoding and implementation perspective, but I don't think it's necessary to define WebAssembly semantics. So I think the design shouldn't mention a signature table, and should be explicit that the signature check is structural.
The text was updated successfully, but these errors were encountered: