Skip to content

exnref-related error in writing binary #3114

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kripken opened this issue Sep 10, 2020 · 9 comments · Fixed by #3129
Closed

exnref-related error in writing binary #3114

kripken opened this issue Sep 10, 2020 · 9 comments · Fixed by #3129
Assignees

Comments

@kripken
Copy link
Member

kripken commented Sep 10, 2020

emcc-0-base.wasm.gz

The attached wasm file is the wasm-ld output for test_exceptions_3 with wasm exceptions. The wasm-ld output is valid after https://bugs.llvm.org/show_bug.cgi?id=47413 but the test cannot be enabled yet as running

wasm-opt emcc-0-base.wasm -o out.wasm

results in an invalid wasm, wabt says

$ wasm2wat   --enable-all out.wasm
b.wasm:0000f7e: error: type mismatch in local.set, expected [exnref] but got []
0000000: error: type mismatch in try catch, expected [] but got [exnref]
b.wasm:00039f7: error: type mismatch in function, expected [] but got [i32, i32, i32, i32]
b.wasm:0004501: error: type mismatch in function, expected [] but got [i32]
b.wasm:00048b8: error: type mismatch in function, expected [] but got [i32]
[..]

(doing another roundtrip results in a valid wasm, and is why this wasn't noticed before - we do one fewer roundtrip now on many tests since at -O0 we don't run wasm-opt at all if we don't need to).

cc @aheejin

@tlively
Copy link
Member

tlively commented Sep 10, 2020

The relevant code snippet (from wasm-objdump) is

 000f79: 07                         |     catch
 000f7a: 02 40                      |       block
 000f7c: 21 6c                      |         local.set 108
 000f7e: 20 04                      |         local.get 4

So it looks like the problem is that the block is not annotated with an input type. Even though there is an exnref on the stack, the exnref is not available inside the block without a block parameter annotation.

The robust fix would be to implement block parameters throughout Binaryen, which I hope to do sometime after Poppy IR is done, but perhaps there is a more short-term fix as well.

@kripken
Copy link
Member Author

kripken commented Sep 10, 2020

Thanks for looking into it @tlively !

I don't think a short-term fix is needed as it's an experimental feature. We can just keep that one test disabled for now.

@aheejin
Copy link
Member

aheejin commented Sep 12, 2020

Hmm, I wonder how this has not been discovered so far; this is rather a serious bug..

As @tlively said, having block input type can solve this. But this makes EH dependent on multivalue, which I'm not sure if I want. (I elaborated a little more on this later in this comment)

And I was aware that Binaryen does not support block input parameters anyway, so this is a bug ;( I've been trying hard to avoid generating a wrapping block inside of catch for this very reason. Certainly the LLVM does not generate this kind of pattern.

A little background:
To simplify implementation, try-catch has followed a similar structure to if-else, which means try cannot have a label itself in Binaryen internal implementation. Only block and loop can have labels now; if labels are simulated by placing inner blocks in its true and false bodies respectively. try's implementation does the same thing for its try and catch body. So each of try body and catch body has a block. But when printing blocks in output, if a block does not have any targeting branch, the block can be omitted. The relevant logic is here. This is how we've actually managed to not print the block within catch.

But this can break. When the original code looks like

try
  ...
catch
  drop
  br 0
end

Here br 0 inside catch means it wants to jump to right after the end of try. When we read the binary, we add a block inside try and catch. So this will be read in Binaryen IR like this:

(try
  (do
    (block $label0
      ...
    )
  )
  (catch
    (block $label1  ;; cannot be deleted because there's a br targeting this!
      (drop (pop exnref))
      (br $label1)
    )
  )
)

So br 0 now accidentally targets the inner block $label1 within catch, making block $label1 not deletable when printing binary.

This is also what happens for if-else. But for if-else, this code is actually correct, so no problem:
Original code:

i32.const 0
if
  br 0
else
  br 0
end

Corresponding Binaryen IR:

(if
  (i32.const 0)
  (then
    (block $label0
      (br $label0)
    )
  (else
    (block $label1
      (br $label1)
    )
  )
)

By the way, I'm wondering, even though this code is semantically correct for if-else, this increases code size. Is there an optimization that reduces the number of instructions here? @kripken

Oh and by the way, even if we change the EH spec soon, this problem is not gonna go away; currently catch pushes exnref, but in the new version catch will push extracted values, so anyway it will push something.


So there can be multiple ways to solve this.

  1. Make try take a label. I'm not very sure how much work this needs, but I've seen many places that handles label names of blocks and loops. We're gonna need to add try handling to each of those places. ;(
  2. Support block input parameter. @tlively, what's the timeline for this?
    But actually I don't like this option, because this makes EH feature dependent on multivalue feature. Even though the current proposal says the EH proposal depends on the multivalue proposal, this is because br_on_exn can extract multiple values, and C++ does not necessarily need this, so in practice we can still use EH on many tools and VMs without depending on multivalue. But this option makes even C++ EH programs dependent on multivalue feature.
  3. Do some hacky things for try-catch somewhere in binary reader, so that if there's a block within catch that has a br targeting it, we can convert something like
(try
  ...
  (catch
    (block $label1
      (drop (pop exnref))
      (br $label1)
    )
  )
)

to this: (this is actually how we handle a similar situation in the s expression parser for ifs and trys...)

(block $label1
  (try
    ...
    (catch
      (drop (pop exnref))
      (br $label1)
    )
  )
)

Anyway, sorry for the long comment. This is not urgent; The EH spec is in churn anyway, so this does not have to be resolved in short order. But I guess we have to resolve it one way or the other at some point.

@tlively
Copy link
Member

tlively commented Sep 12, 2020

I wonder if we could do something hacky in the binary writer to omit the block in the catch even if it is a branch target. In the emitted binary, all the branches that target the block would then target the try instead. The trick would be to adjust the emitted targets of all branches in the catch that try to branch to a label outside the try-catch to account for the missing block in the binary.

I don't see supporting block parameters in the IR as a high priority since the code size improvements they unlock are so modest. I would like to get around to them eventually as a matter of completeness, though, and if there are more situations like this where we wish we had them, that would make them more important. So right now I wouldn't expect them soon.

@aheejin
Copy link
Member

aheejin commented Sep 13, 2020

In the emitted binary, all the branches that target the block would then target the try instead. The trick would be to adjust the emitted targets of all branches in the catch that try to branch to a label outside the try-catch to account for the missing block in the binary.

try cannot be a branch target in the current Binaryen, as well as if. This is the reason I suggested something like this:

(block $label1
  (try
    ...
    (catch
      (drop (pop exnref))
      (br $label1)
    )
  )
)

Yeah but this adds two unnecessary instructions. I think if has a similar problem (not for binary but for wast. Also if maintains blocks within true/false body if they are branch targets), so I wonder if there's any optimization that fixes this.

Or we can do even a hackier thing in binary writer so that we can make the branch target try even if it cannot be a target within Binaryen IR... (was this what you meant @tlively?) But this will be harder and hackier.

But anyway, try is much rarer than if, so the optimization is not an important issue.

@tlively
Copy link
Member

tlively commented Sep 13, 2020

Or we can do even a hackier thing in binary writer so that we can make the branch target try even if it cannot be a target within Binaryen IR... (was this what you meant @tlively?) But this will be harder and hackier.

Yeah, that's what I was thinking :) Definitely a hack, but would avoid needing to depend on block params and multivalue. The code transformation you propose would work, but without the hack in the BinaryenIRWriter, we'd be losing the property that any valid Binaryen IR can be written directly to a valid WebAssembly (since it would have to go through that transformation first).

@aheejin
Copy link
Member

aheejin commented Sep 13, 2020

The code transformation you propose would work, but without the hack in the BinaryenIRWriter, we'd be losing the property that any valid Binaryen IR can be written directly to a valid WebAssembly (since it would have to go through that transformation first).

Not sure if I understand. What I suggested is a hack in the binary reader side, not the writer side. We can make the reader create that kind of structure, and after that every valid Binaryen IR can be written to a valid WebAssembly. We do a somewhat similar thing for if and try in s expression parser here.

@tlively
Copy link
Member

tlively commented Sep 13, 2020

Aha, that makes sense. That still means it would be possible to programmatically create valid Binaryen IR that is written to an invalid binary, but I don't think that's a big problem given how niche this edge case is.

@aheejin aheejin self-assigned this Sep 13, 2020
@kripken
Copy link
Member Author

kripken commented Sep 13, 2020

I think it would be fine to emit an outer block,

(block $label1
  (try

With that I think the parser could make sure to not emit a block inside the catch that has a break to it. And optimizations could remove it when possible, MergeBlocks tries to move blocks "outside" as much as possible (so they can be merged), and RemovedUnusedBrs has a bunch of patterns for removing unneeded branches and blocks that they appear to target. That should handle ifs properly today. We may want to add some patterns to those passes for try-catch if necessary (but optimizing it seems not urgent).

aheejin added a commit to aheejin/binaryen that referenced this issue Sep 14, 2020
aheejin added a commit that referenced this issue Sep 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants