-
Notifications
You must be signed in to change notification settings - Fork 786
Invalid unreachable loop in text format #1541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hmm, what if just remove return result for such kind of code? Like: (module
(func $0
(block $label$1
(loop $label$2
(unreachable)
)
)
)
) Or this sintetic example? Because this doesn't depend on (module
(func $0
(drop
(block $label$1 (result i32)
(loop $label$2
(nop)
)
)
)
)
)
|
It's true that removing the result would make this example work, but what would the new rules be for when to do so - how would that fit in the current typing rules? I'm not sure offhand how that could be done, and this is tricky stuff with many corner cases I'm afraid... In general, if there isn't a simple fix for something like this, it may just be something we document as a text format difference between binaryen and wat - there are already a few in the list. While we need binary format support to be perfect - that's the point of binaryen :) - the text format is meant to represent our internal IR, which has been diverging from wasm. We may want to just stop claiming to support |
For reference, the following is the saved binary from var mod = new binaryen.Module();
var funcType = mod.addFunctionType("v", binaryen.none, []);
var func = mod.addFunction("0", funcType, [],
mod.drop(
mod.block("label$1", [
mod.loop("label$2",
mod.unreachable()
)
], binaryen.i32)
)
);
mod.addExport("0", "0"); When reading it back, it becomes the original (apparently invalid) text format: (module
(type $0 (func))
(export "0" (func $0))
(func $0 (; 0 ;) (type $0)
(drop
(block $label$1 (result i32)
(loop $label$2
(unreachable)
)
)
)
)
) So...
is this just the text printer missing something the binary writer adds implicitly? Asking because when looking at the comments above, it appears to be more complicated than that? |
I believe the binary writer adds the extra unreachable after the loop. The reason is that Binaryen IR has an unreachable type for loops, but wasm binaries do not. To fix that, after it emits a loop that is unreachable it also emits an unreachable opcode, which ensures we are in an unreachable zone in the binary. And when reading wasm we do the opposite, in effect. It's not clear how to fix this for the text format, though:
|
What if we'd just do if (curr->type != none && curr->body->type == unreachable) {
o << maybeNewLine;
!minify && doIndent(o, indent);
printOpening(o, "unreachable");
o << ')';
} right below this line, similar to what the binary writer does?
(module
(type $0 (func))
(func $0 (; 0 ;) (type $0)
(drop
(block $label$1 (result i32)
(loop $label$2
(unreachable)
)
(unreachable)
)
)
)
) |
We could do that, but we'd also need to make changes in the text input code to handle that, and the bigger issue is that by making such changes the text format would no longer represent Binaryen IR as it currently does. And having a good way to print the internal IR is necessary for debugging and development. So we'd need to support two text formats, I'm not saying that's a bad idea, but it would take some work to do, and I'm not sure it's worth it. How important is it that binaryen can emit the wasm text format correctly? |
For AssemblyScript it's quite convenient to have a text format printer provided by Binaryen (while there isn't really a need to read text format), so it can be its sole dependency. I've tried to avoid adding WABT for example because it adds another megabyte or so of JS just for printing text format. |
I see, good point @dcodeIO So, if emitting wasm text is important but reading it is not, then that's not too hard to do, we can make a variant of the Before going down that route, is there anyone else that does care about reading proper |
Currently the JS build of wabt includes a lot of extra cruft -- it wouldn't be too hard to separate some of it out to slim down the library. |
@binji it will be great if |
@MaxGraey that's a bit outside the scope of |
Yah, may be better to create another separate tool for this routine. The main idea to determine which is codegen quality better for specific wasm operations or optimizations and how this changes during evolution some VM to get informative feedback and estimation about some binaryen performance optimizations. PS Sorry for offtop |
@binji "it wouldn't be too hard to separate some of it out to slim down the library." Yeah, we could make a wabt build with just the binary loading and text printing code included, perhaps? Might be as easy as only specifying those functions in the list of exported functions. |
@kripken Hm, not so sure now that I take a quick look. It seems at least 350k or so is the re2c generated lexing function! Haven't really dug into the rest, but it looks like the callback-based parsing might be bloating the resulting library too. |
E.g.
The loop is unreachable, but we need to emit an unreachable after the loop too, for the block, wabt says
Not sure why this is just a bug in the text format - the binary for it is fine.
The text was updated successfully, but these errors were encountered: