-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[WebAssembly] Add unreachable before catch destinations #123915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When `try_table`'s catch clause's destination has a return type, as in the case of catch with a concrete tag, catch_ref, and catch_all_ref. For example: ```wasm block exnref try_table (catch_all_ref 0) ... end_try_table end_block ``` ... use exnref ... This code is not valid because the block's body type is not exnref. So we add an unreachable after the 'end_try_table' to make the code valid here: ```wasm block exnref try_table (catch_all_ref 0) ... end_try_table unreachable ;; Newly added end_block ``` Because 'unreachable' is a terminator we also need to split the BB. --- We need to handle the same thing for unwind mismatch handling. In the code below, we create a "trampoline BB" that will be the destination for the nested `try_table`~`end_try_table` added to fix a unwind mismatch: ```wasm try_table (catch ... ) block exnref ... try_table (catch_all_ref N) some code end_try_table ... end_block ;; Trampoline BB throw_ref end_try_table ``` While the `block` added for the trampoline BB has the return type `exnref`, its body, which contains the nested `try_table` and other code, wouldn't have the `exnref` return type. Most times it didn't become a problem because the block's body ended with something like `br` or `return`, but that may not always be the case, especially when there is a loop. So we add an `unreachable` to make the code valid here too: ```wasm try_table (catch ... ) block exnref ... try_table (catch_all_ref N) some code end_try_table ... unreachable ;; Newly added end_block ;; Trampoline BB throw_ref end_try_table ``` In this case we just append the `unreachable` at the end of the layout predecessor BB. (This was tricky to do in the first (non-mismatch) case because there `end_try_table` and `end_block` were added in the beginning of an EH pad in `placeTryTableMarker` and moving `end_try_table` and the new `unreachable` to the previous BB caused other problems.) --- This adds many `unreaachable`s to the output, but this adds `unreachable` to only a few places to see if this is working. The FileCheck lines in `exception.ll` and `cfg-stackify-eh.ll` are already heavily redacted to only leave important control-flow instructions, so I don't think it's worth adding `unreachable`s everywhere.
@llvm/pr-subscribers-backend-webassembly Author: Heejin Ahn (aheejin) ChangesWhen block exnref
try_table (catch_all_ref 0)
...
end_try_table
end_block ... use exnref ... This code is not valid because the block's body type is not exnref. So we add an unreachable after the 'end_try_table' to make the code valid here: block exnref
try_table (catch_all_ref 0)
...
end_try_table
unreachable ;; Newly added
end_block Because 'unreachable' is a terminator we also need to split the BB. We need to handle the same thing for unwind mismatch handling. In the code below, we create a "trampoline BB" that will be the destination for the nested try_table (catch ... )
block exnref
...
try_table (catch_all_ref N)
some code
end_try_table
...
end_block ;; Trampoline BB
throw_ref
end_try_table While the try_table (catch ... )
block exnref
...
try_table (catch_all_ref N)
some code
end_try_table
...
unreachable ;; Newly added
end_block ;; Trampoline BB
throw_ref
end_try_table In this case we just append the This adds many Full diff: https://github.com/llvm/llvm-project/pull/123915.diff 5 Files Affected:
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
index 6cae0e766dbc02..da4b8a453632bc 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp
@@ -1297,6 +1297,7 @@ void WebAssemblyCFGStackify::addNestedTryDelegate(
// some code
// end_try_table
// ...
+// unreachable
// end_block ;; Trampoline BB
// throw_ref
// end_try_table
@@ -1358,6 +1359,13 @@ WebAssemblyCFGStackify::getTrampolineBlock(MachineBasicBlock *UnwindDest) {
BuildMI(TrampolineBB, EndDebugLoc, TII.get(WebAssembly::THROW_REF))
.addReg(ExnReg);
+ // The trampoline BB's return type is exnref because it is a target of
+ // catch_all_ref. But the body type of the blodk we just created is not. We
+ // add an 'unreachable' right before the 'end_block' to make the code valid.
+ MachineBasicBlock *TrampolineLayoutPred = TrampolineBB->getPrevNode();
+ BuildMI(TrampolineLayoutPred, TrampolineLayoutPred->findBranchDebugLoc(),
+ TII.get(WebAssembly::UNREACHABLE));
+
registerScope(Block, EndBlock);
UnwindDestToTrampoline[UnwindDest] = TrampolineBB;
return TrampolineBB;
@@ -1465,7 +1473,7 @@ void WebAssemblyCFGStackify::addNestedTryTable(MachineInstr *RangeBegin,
// - After:
// pre_bb: (new)
// range_end
- // end_try_table: (new)
+ // end_try_table_bb: (new)
// end_try_table
// post_bb: (previous 'ehpad')
// catch
@@ -1523,9 +1531,9 @@ void WebAssemblyCFGStackify::addNestedTryTable(MachineInstr *RangeBegin,
// end_loop
// end_try_table
//
-// So if the unwind dest BB has a end_loop before an end_try_table, we split the
-// BB with the end_loop as a separate BB before the end_try_table BB, so that
-// after we fix the unwind mismatch, the code will be like:
+// So if an end_try_table BB has an end_loop before the end_try_table, we split
+// the BB with the end_loop as a separate BB before the end_try_table BB, so
+// that after we fix the unwind mismatch, the code will be like:
// bb0:
// try_table
// block exnref
@@ -1538,10 +1546,10 @@ void WebAssemblyCFGStackify::addNestedTryTable(MachineInstr *RangeBegin,
// end_block
// end_try_table_bb:
// end_try_table
-static void splitEndLoopBB(MachineBasicBlock *UnwindDest) {
- auto &MF = *UnwindDest->getParent();
+static void splitEndLoopBB(MachineBasicBlock *EndTryTableBB) {
+ auto &MF = *EndTryTableBB->getParent();
MachineInstr *EndTryTable = nullptr, *EndLoop = nullptr;
- for (auto &MI : reverse(*UnwindDest)) {
+ for (auto &MI : reverse(*EndTryTableBB)) {
if (MI.getOpcode() == WebAssembly::END_TRY_TABLE) {
EndTryTable = &MI;
continue;
@@ -1555,11 +1563,11 @@ static void splitEndLoopBB(MachineBasicBlock *UnwindDest) {
return;
auto *EndLoopBB = MF.CreateMachineBasicBlock();
- MF.insert(UnwindDest->getIterator(), EndLoopBB);
+ MF.insert(EndTryTableBB->getIterator(), EndLoopBB);
auto SplitPos = std::next(EndLoop->getIterator());
- EndLoopBB->splice(EndLoopBB->end(), UnwindDest, UnwindDest->begin(),
+ EndLoopBB->splice(EndLoopBB->end(), EndTryTableBB, EndTryTableBB->begin(),
SplitPos);
- EndLoopBB->addSuccessor(UnwindDest);
+ EndLoopBB->addSuccessor(EndTryTableBB);
}
bool WebAssemblyCFGStackify::fixCallUnwindMismatches(MachineFunction &MF) {
@@ -1943,8 +1951,16 @@ bool WebAssemblyCFGStackify::fixCallUnwindMismatches(MachineFunction &MF) {
// When end_loop is before end_try_table within the same BB in unwind
// destinations, we should split the end_loop into another BB.
if (!WebAssembly::WasmUseLegacyEH)
- for (auto &[UnwindDest, _] : UnwindDestToTryRanges)
- splitEndLoopBB(UnwindDest);
+ for (auto &[UnwindDest, _] : UnwindDestToTryRanges) {
+ auto It = EHPadToTry.find(UnwindDest);
+ // If UnwindDest is the faker caller block, it will not be in EHPadToTry
+ // map
+ if (It != EHPadToTry.end()) {
+ auto *TryTable = It->second;
+ auto *EndTryTable = BeginToEnd[TryTable];
+ splitEndLoopBB(EndTryTable->getParent());
+ }
+ }
// Now we fix the mismatches by wrapping calls with inner try-delegates.
for (auto &P : UnwindDestToTryRanges) {
@@ -2179,8 +2195,15 @@ bool WebAssemblyCFGStackify::fixCatchUnwindMismatches(MachineFunction &MF) {
// When end_loop is before end_try_table within the same BB in unwind
// destinations, we should split the end_loop into another BB.
- for (auto &[_, UnwindDest] : EHPadToUnwindDest)
- splitEndLoopBB(UnwindDest);
+ for (auto &[_, UnwindDest] : EHPadToUnwindDest) {
+ auto It = EHPadToTry.find(UnwindDest);
+ // If UnwindDest is the faker caller block, it will not be in EHPadToTry map
+ if (It != EHPadToTry.end()) {
+ auto *TryTable = It->second;
+ auto *EndTryTable = BeginToEnd[TryTable];
+ splitEndLoopBB(EndTryTable->getParent());
+ }
+ }
NumCatchUnwindMismatches += EHPadToUnwindDest.size();
SmallPtrSet<MachineBasicBlock *, 4> NewEndTryBBs;
@@ -2372,6 +2395,48 @@ static void appendEndToFunction(MachineFunction &MF,
TII.get(WebAssembly::END_FUNCTION));
}
+// We added block~end_block and try_table~end_try_table markers in
+// placeTryTableMarker. But When catch clause's destination has a return type,
+// as in the case of catch with a concrete tag, catch_ref, and catch_all_ref.
+// For example:
+// block exnref
+// try_table (catch_all_ref 0)
+// ...
+// end_try_table
+// end_block
+// ... use exnref ...
+//
+// This code is not valid because the block's body type is not exnref. So we add
+// an unreachable after the 'end_try_table' to make the code valid here:
+// block exnref
+// try_table (catch_all_ref 0)
+// ...
+// end_try_table
+// unreachable (new)
+// end_block
+//
+// Because 'unreachable' is a terminator we also need to split the BB.
+static void addUnreachableAfterTryTables(MachineFunction &MF,
+ const WebAssemblyInstrInfo &TII) {
+ std::vector<MachineInstr *> EndTryTables;
+ for (auto &MBB : MF)
+ for (auto &MI : MBB)
+ if (MI.getOpcode() == WebAssembly::END_TRY_TABLE)
+ EndTryTables.push_back(&MI);
+
+ for (auto *EndTryTable : EndTryTables) {
+ auto *MBB = EndTryTable->getParent();
+ auto *NewEndTryTableBB = MF.CreateMachineBasicBlock();
+ MF.insert(MBB->getIterator(), NewEndTryTableBB);
+ auto SplitPos = std::next(EndTryTable->getIterator());
+ NewEndTryTableBB->splice(NewEndTryTableBB->end(), MBB, MBB->begin(),
+ SplitPos);
+ NewEndTryTableBB->addSuccessor(MBB);
+ BuildMI(NewEndTryTableBB, EndTryTable->getDebugLoc(),
+ TII.get(WebAssembly::UNREACHABLE));
+ }
+}
+
/// Insert BLOCK/LOOP/TRY/TRY_TABLE markers at appropriate places.
void WebAssemblyCFGStackify::placeMarkers(MachineFunction &MF) {
// We allocate one more than the number of blocks in the function to
@@ -2398,13 +2463,17 @@ void WebAssemblyCFGStackify::placeMarkers(MachineFunction &MF) {
}
}
- // Fix mismatches in unwind destinations induced by linearizing the code.
if (MCAI->getExceptionHandlingType() == ExceptionHandling::Wasm &&
MF.getFunction().hasPersonalityFn()) {
- bool MismatchFixed = fixCallUnwindMismatches(MF);
- MismatchFixed |= fixCatchUnwindMismatches(MF);
- if (MismatchFixed)
- recalculateScopeTops(MF);
+ const auto &TII = *MF.getSubtarget<WebAssemblySubtarget>().getInstrInfo();
+ // Add an 'unreachable' after 'end_try_table's.
+ addUnreachableAfterTryTables(MF, TII);
+ // Fix mismatches in unwind destinations induced by linearizing the code.
+ fixCallUnwindMismatches(MF);
+ fixCatchUnwindMismatches(MF);
+ // addUnreachableAfterTryTables and fixUnwindMismatches create new BBs, so
+ // we need to recalculate ScopeTops.
+ recalculateScopeTops(MF);
}
}
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
index b68dd8809bb920..ed6aec1ab33e3f 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrControl.td
@@ -144,7 +144,7 @@ defm THROW_REF : I<(outs), (ins EXNREF:$exn), (outs), (ins), [],
"throw_ref \t$exn", "throw_ref", 0x0a>;
} // isTerminator = 1, hasCtrlDep = 1, isBarrier = 1
-// Region within which an exception is caught: try / end_try
+// Region within which an exception is caught: try_table / end_try_table
let Uses = [VALUE_STACK], Defs = [VALUE_STACK] in {
defm TRY_TABLE : I<(outs), (ins Signature:$sig, variable_ops),
(outs), (ins Signature:$sig, catch_list:$cal), [],
diff --git a/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll b/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll
index 683b03d16d57bd..98de9a267b95a5 100644
--- a/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll
+++ b/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh.ll
@@ -857,6 +857,7 @@ invoke.cont: ; preds = %entry
; NOSORT: loop
; NOSORT: call foo
; NOSORT: end_loop
+; NOSORT: unreachable
; NOSORT: end_block # label[[L3]]:
; NOSORT: throw_ref
; NOSORT: end_try_table
diff --git a/llvm/test/CodeGen/WebAssembly/exception-legacy.mir b/llvm/test/CodeGen/WebAssembly/exception-legacy.mir
index d6f734c64acd69..9273ceeadd0e73 100644
--- a/llvm/test/CodeGen/WebAssembly/exception-legacy.mir
+++ b/llvm/test/CodeGen/WebAssembly/exception-legacy.mir
@@ -78,7 +78,8 @@ body: |
EH_LABEL <mcsymbol .Ltmp2>
CATCHRET %bb.2, %bb.1, implicit-def dead $arguments
- ; CHECK: bb.2
+ ; This BB should remain (it will be renumbered to bb.1)
+ ; CHECK: bb.1
bb.2:
; predecessors: %bb.0, %bb.1
RETURN implicit-def dead $arguments
diff --git a/llvm/test/CodeGen/WebAssembly/exception.ll b/llvm/test/CodeGen/WebAssembly/exception.ll
index d6f3ffc8c33cb1..304664b622e800 100644
--- a/llvm/test/CodeGen/WebAssembly/exception.ll
+++ b/llvm/test/CodeGen/WebAssembly/exception.ll
@@ -38,6 +38,7 @@ define void @throw(ptr %p) {
; CHECK: call foo
; CHECK: br 2
; CHECK: end_try_table
+; CHECK: unreachable
; CHECK: end_block
; CHECK: local.set 2
; CHECK: local.get 0
|
This problem was not discovered when I was testing the new LLVM implementation with Emscripten test suite last year because binaryen masked this problem. This is invalid but Binaryen accepts it: block i32
try_table (catch_all_ref 0)
...
end_try_table
end_block And when emiting out binaries after processing, it adds an And in Emscripten, It's kind of funny that I spent nontrivial amount of time overhauling the assembly type checker and didn't even run it on the CodeGen test output, because if I had run it on the test output it would've been caught right away... |
for (auto &[UnwindDest, _] : UnwindDestToTryRanges) { | ||
auto It = EHPadToTry.find(UnwindDest); | ||
// If UnwindDest is the faker caller block, it will not be in EHPadToTry | ||
// map | ||
if (It != EHPadToTry.end()) { | ||
auto *TryTable = It->second; | ||
auto *EndTryTable = BeginToEnd[TryTable]; | ||
splitEndLoopBB(EndTryTable->getParent()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changed because after splitting the BB in addUnreachableAfterTryTables
to add an unreachable
, end_try_table
is not in the EHPad anymore.
- Before:
bb:
... try body ...
ehpad:
end_try_table
end_block
... catch handler ...
- After:
bb:
... try body ...
end_try_table_bb:
end_try_table
unreachable
ehpad:
end_block
... catch handler ...
Line 2198~2206 change in fixCatchUnwindMismatches
is the same thing. splitEndLoopBB
's argument name changed from UnwindDest
to EndTryTableBB
for the same reason.
; This BB should remain (it will be renumbered to bb.1) | ||
; CHECK: bb.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This renumbering happened because now we run recalculateScopeTops
every time when EH is used. We used to run it only when unwind mismatches existed. Now we run it every time because adding unreachable
after end_try_table
requires splitting BBs and CFG changes.
This adds a new `WASM` mode in `ExceptionLibrary`, which uses the new standard Wasm (exnref) EH, adds a library variant for it, and make the tests use the new LLVM implementation instead of the Binaryen translator. The CI won't pass until llvm/llvm-project#123915 lands.
I think we should consider this a bug in Binaryen. When it parses Wasm input, it should reject invalid wasm unless there is some option enabled that relaxes the validation or accepts some input that isn't really wasm (e.g. if using the relooper). |
Co-authored-by: Derek Schuff <[email protected]>
This adds a new `WASM` mode in `ExceptionLibrary`, which uses the new standard Wasm (exnref) EH, adds a library variant for it, and make the tests use the new LLVM implementation instead of the Binaryen translator. The CI won't pass until llvm/llvm-project#123915 lands. This requires adding `-wasmexcept` and `-wasmsjlj` variants to `MINIMAL_TASKS` in `embuilder.py` because they are necessary to run coreX tests in CircleCI, because dylink coreX tests in CircleCI rellies on `./embuilder build MINIMAL_PIC --pic`, which is the sum of `MINIMAL_TASKS` and `MINIMAL_PIC_TASKS`. Because a new variant is added to `ExceptionLibrary`, this increases `SYSTEM` library size from 366M to 400M, roughly by 11%. We discussed about the possibility of not shipping the exnref libraries and let them be built on demand, but given that `SYSTEM` include all variants, I'm not sure how we are going to do that: https://github.com/emscripten-core/emscripten/blob/73ebb91029948e62b3a4cea9ccc8db2dd87162b5/embuilder.py#L174-L177
When
try_table
's catch clause's destination has a return type, as in the case of catch with a concrete tag, catch_ref, and catch_all_ref. For example:This code is not valid because the block's body type is not exnref. So we add an unreachable after the 'end_try_table' to make the code valid here:
Because 'unreachable' is a terminator we also need to split the BB.
We need to handle the same thing for unwind mismatch handling. In the code below, we create a "trampoline BB" that will be the destination for the nested
try_table
~end_try_table
added to fix a unwind mismatch:While the
block
added for the trampoline BB has the return typeexnref
, its body, which contains the nestedtry_table
and other code, wouldn't have theexnref
return type. Most times it didn't become a problem because the block's body ended with something likebr
orreturn
, but that may not always be the case, especially when there is a loop. So we add anunreachable
to make the code valid here too:In this case we just append the
unreachable
at the end of the layout predecessor BB. (This was tricky to do in the first (non-mismatch) case because thereend_try_table
andend_block
were added in the beginning of an EH pad inplaceTryTableMarker
and movingend_try_table
and the newunreachable
to the previous BB caused other problems.)This adds many
unreaachable
s to the output, but this addsunreachable
to only a few places to see if this is working. The FileCheck lines inexception.ll
andcfg-stackify-eh.ll
are already heavily redacted to only leave important control-flow instructions, so I don't think it's worth addingunreachable
s everywhere.