Initial draft of the layer 1. #33
@@ -0,0 +1,390 @@
# Layer 1 exception handling

Layer 1 of exception handling is the MVP (minimal viable proposal) for
implementing exceptions in WebAssembly. As such, it doesn't include higher-level
concepts and structures. These concepts and structures are introduced in later
layers, and either:

1. Improve readability by combining concepts in layer 1 into higher-level
   constructs, thereby reducing code size.

2. Allow performance improvements in the VM.

3. Introduce additional new functionality not available in layer 1.

> Review comment: I don't quite understand how to interpret the concept of "layer" here. Will other layers be future extensions? If so, we at least have to make sure that the features proposed here are future-proof wrt to such extensions.

> Reply: I'll follow up with a PR to clarify this. Sort of like we did in the host bindings discussion, we realized we can separate out some of the aspects of exception handling into what's absolutely necessary to support C++, and things that are useful in that other languages might use them or they might allow VMs to do a more efficient implementation. Part of the goal in separating these out (and documenting them, I'll follow up with a document listing the extensions today), is to make sure our MVP is future-proof wrt such extensions. In the interest of keeping the spec small and growing conservatively, my default would be to get the exception MVP into the spec and then follow up with the extensions as needed. I'm open to arguments for including the extensions as well though.

> Reply: Fully agreed. It was mainly the term "layer" and some of the explanation that made it sound like other things would be higher-level abstractions being added on top, which would be a quite different scenario.

> Review comment: We have a list here of what the proposal does not include, but it'd be good to include the requirements for the MVP. To me, these include:
>
> The performance goal could be made more precise. I think it's roughly "code that does not use exceptions does not run any slower once the exception proposal has been adopted."

> Reply: I opened #34 to track this.

## Overview

Exception handling allows code to break control flow when an exception is
thrown. The exception can be any exception known by the WebAssembly module, or
it may be an unknown exception that was thrown by a called imported function.

> Review comment: typo: exeception

One of the problems with exception handling is that WebAssembly and the
host VM probably have different notions of what exceptions are, but each must be
aware of the other's.

> Review comment: nit: I'd probably say "embedder" rather than "host VM" here and elsewhere.

It is difficult to define exceptions in WebAssembly because (in general)
it doesn't have knowledge of the host VM. Further, adding such knowledge to
WebAssembly would limit the ability of other host VMs to support WebAssembly
exceptions.

> Review comment: For the first "host VM" here, it feels like you're talking about the JavaScript VM. Otherwise, it seems like adding exceptions to other host VMs would be no more limited than the host VM.

One issue is that both sides need to know if an exception was thrown by the
other, because cleanup may need to be performed.

Another problem is that WebAssembly doesn't have direct access to the host VM's
memory. As a result, WebAssembly defers the handling of exceptions to the host
VM.

To access exceptions, WebAssembly provides instructions to check if the
exception is one that WebAssembly understands. If so, the exception's data is
extracted and copied onto the stack, allowing succeeding instructions to
process the data.

> Review comment: It would probably be helpful to make a distinction between "WebAssembly, the spec" and "WebAssembly, the program that is running embedded in a host VM." For the former, I'd suggest just WebAssembly, and for the latter I'd suggest "user code" or "user WebAssembly code."

Lastly, exception lifetimes must be maintained by the host VM, so that it can
collect and reuse the memory used by exceptions. This implies that the host must
know where exceptions are stored, so that it can determine when an exception can
be garbage collected.

> Review comment: If we had more strict structure around throw/rethrow, in theory at least exceptions could have static life-times. Requiring them to be GC/refcounted seems necessary specifically to accommodate hosts that want to manage their own exceptions in a more dynamic fashion.

> Reply: Remember, not all exception lifetimes are controlled by WebAssembly code. Exceptions thrown by the host must follow the lifetime expectation of the host. Hence, using static lifetimes limits this use case.

> Reply: The biggest problem here is that we are trying to generate Wasm code from LLVM. LLVM exception code has been lowered to the point that it no longer has the usual exception structure (for example, landing pads may be combined, there's cleanup code to handle, etc.). With more structured Wasm instructions, we can probably implement C++'s semantics, but the tricky part is to be able to rethrow a foreign (i.e. JavaScript) exception once C++ is done with its handling code. I and others have advocated a couple of approaches that give us statically known exception lifetimes, but unfortunately these just haven't been workable for generating code from LLVM inputs. I'd love to be proven wrong, but this seems like a surprisingly hard problem.

> Reply: This is a hard problem that eventually made us introduce the first-class exception object. tl;dr: CFG does not have enough original C++ scope information.
>
> While it may not be impossible we preserve that information from clang and carry that all the way to the backend, we kind of reached a conclusion that it is non-trivial and has the possibility of hindering optimization. #30 and #31 discussed this problem.

This also implies that the host VM must provide a garbage collector for
exceptions. For host VMs that have garbage collection (such as JavaScript),
this is not a problem.

However, not all host VMs may have a garbage collector. For this reason,
WebAssembly exceptions are designed to allow the use of reference counters to
perform the garbage collection in the host VM.

To do this, WebAssembly exceptions are immutable once created, to avoid cyclic
data structures that can't be garbage collected. It also means that exceptions
can't be stored into linear memory. The rationale for this is twofold:

* For security. Loads
  and stores do not guarantee that the data read was of the same type as
  stored. This would allow spoofing of exception references, which could let a
  WebAssembly module access host VM data it should not see.

* The host VM does not know the layout of data in linear memory, so it can't
  find places where exception references are stored.

> Review comment: The reason that exceptions can't be stored in linear memory is that they need to be abstract, whereas if you could store them in linear memory you could trivially load them under a different type to get access to the raw bits and underlying representation. Although, I realize if I'd just read a little further I'd see that you addressed this 😄

> Review comment: I think the most important reason is missing from this list, namely that they may be pointers into the host's memory, and it would be fundamentally unsafe to expose these, misinterpretations of data layout aside. Host references have to be opaque types.

Hence, while an exception reference is a new first class type, this proposal
disallows their usage in linear memory.

A WebAssembly exception is created by the `except` instruction. Once an
exception is created, you can throw it with the `throw` instruction. Thrown
exceptions are handled as follows:

1. They can be caught by a catch block in an enclosing try block of a function
   body. The caught exception is pushed onto the stack.

1. Throws not caught within a function body continue up the call stack, popping
   call frames, until an enclosing try block is found.

1. If the call stack is exhausted without any enclosing try blocks, the host VM
   defines how to handle the uncaught exception.

> Review comment: We can bikeshed the names later, but we should consider something like

> Reply: Tracking in #35

> Review comment: This separation between creating an exception and throwing it is new. I think it is problematic on several levels, and not a future-proof design. It both gets in the way of certain optimisations / implementation strategies and is incompatible with possible future extensions like resumption (unless we make the type system more complicated).

> Reply: We went back and forth on this and couldn't see a clear reason to pick one over the other. I find your resumption argument compelling though.

> Review comment: Do we need to mention something here about interleaving host and guest frames?

### Exceptions

An `exception` is an internal construct in WebAssembly that is maintained by the
host. WebAssembly exceptions (as opposed to host exceptions) are defined by a
new `exception section` of a WebAssembly module. The exception section is a list
of exception types, from which exceptions can be created.

Each exception type has a `type signature`. The type signature defines the list
of values associated with the exception.

> Review comment: Playing advocate of the devil here: since in the general case a C++ exception type cannot be represented as a self-contained wasm signature (e.g. if it contains a string) I'd expect the front-end to often or always opt for a signature that is just a single linear memory index. Similarly, a host exception will likely often be a single anyref, once we get that. This in turn does not explain to me why we need the generality of a type signature, other than, "why not, may be nice to have".

> Reply: It's more of the latter. This topic was brought up in many earlier CG meetings and the general consensus was that exceptions should not just be C++ specific, but more general so that other languages could be translated down to WebAssembly.

> Reply: The most important rationale (which should probably be mentioned) is that Wasm is an open, heterogeneous environment. If you catch an exception you have no way of knowing who generated it and how the payload is to be interpreted, unless you can identify it via an unforgeable tag. There may be multiple language mappings running interleaved in the same computation, or even multiple instances of the same language "runtime" that use separate memories, so just having an uninterpreted pointer would be unreliable.

Within the module, exception types are identified by an index into the
[exception index space](#exception-index-space). This index is referred to as
the `exception tag`. The `tagged exception type` is the corresponding
exception type referred to by the exception tag.

Exception types can be imported and exported by adding the appropriate entries
to the import and export sections of the module. All imported/exported exception
types must be named to reconcile exception tags between modules.

> Review comment: This is the same as it works for imported and exported functions, right?

> Reply: yes.

> Review comment: What if my signature is simple, e.g. a single i32, does it mean that if I try to catch exceptions of this type, I will also catch exceptions from other modules that are unrelated? Do exceptions need some form of identifier to make them unique?

> Reply: Each exception type has an opaque tag associated with it, so if my module declared Technically, I think these can alias, if they are both imported exceptions. The host may supply the same underlying exception to both imports, the same way that one could import the same function under two different names.

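The point raised in the thread above, that a tag rather than a signature identifies an exception type, can be illustrated with a small model. This is only a sketch: `ExceptionType`, `same_tag`, and the module names are hypothetical and are not part of the proposal. It assumes a tag behaves like an opaque identity, so two types with the same `i32` signature stay distinct, while one type supplied to two imports aliases.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True, eq=False)  # eq=False: compare by identity, like an opaque tag
class ExceptionType:
    """Hypothetical stand-in for one entry of a module's exception section."""
    name: str
    signature: Tuple[str, ...]


def same_tag(a: ExceptionType, b: ExceptionType) -> bool:
    """Two exception types match only if they are the same definition."""
    return a is b


# Two unrelated modules each define an exception carrying a single i32.
module_a_error = ExceptionType("a.error", ("i32",))
module_b_error = ExceptionType("b.error", ("i32",))

# A host may supply the same underlying type to two different imports.
import_one = module_a_error
import_two = module_a_error

print(same_tag(module_a_error, module_b_error))  # False
print(same_tag(import_one, import_two))          # True
```

Under this reading, a handler for `module_a_error` would not match an exception created from `module_b_error`, even though both carry one `i32`.
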
Exception tags are used by:

1. The `except` instruction which creates a WebAssembly instance of the
   corresponding tagged exception type, and pushes a reference to it onto the
   stack.

2. The `if_except` instruction that queries an exception to see if it is an
   instance of the corresponding tagged exception type, and if true it pushes
   the corresponding values of the exception onto the stack.

> Review comment: I like the name

> Reply: Tracking in #35

### The exception reference data type

> Review comment: refernce -> reference

> Review comment: typo: refernce

Data types are extended to have a new `except_ref` type. The representation of
an `except_ref` is left to the host VM, but its size must be fixed. The actual
number of bytes it takes to represent any `except_ref` is left to the host VM.

> Review comment: If I understand correctly, this proposal suggests using this type for both exception values and the data that represents a caught exception -- in some circles that is called an exception package. An exn package contains an exn value, but also additional information, like a stack trace, and potentially more with future extensions. Hence, the two must not be conflated. If you do, you hardwire at least two assumptions into the design: (1) exn values are complex objects that can have arbitrary data attached to them, and (2) exn values are mutable on some level. Even in a JavaScript embedding these assumptions do not hold in general. For example, IIRC, stack traces are only attached to Error objects in JS; if you throw e.g. a primitive value then you don't get a stack trace. We hence need to distinguish two types, say In the same vein, throw and rethrow instructions need to be distinguished, see below.

> Reply: In the current doc,

> Reply: Yes,

> Reply: I'd been envisioning this as that from Wasm's perspective, there is only That said, I could see value in separating these things out, in case we wanted to use tagged values as first class data separate from exception handling (which someone will probably try to do anyway).

> Reply: @rossberg - oops, we raced. I think we're in agreement in spirit, just using different terms. In my thinking, you do

> Reply: @eholk, a plain exn value makes no sense as an argument for e.g. rethrow, so you'd still want the distinction. They are really different things.

> Reply: I'm getting confused. What is then the first-class wasm type?

> Reply: aheejin, if we did separate exception construction and throwing like in the current state of the proposal then both would be first-class. How either is represented is likely different for different embedders.

> Reply: @rossberg, if we combine

> Reply: @eholk, yes, there would only be the latter.

> Review comment: I think whether the size is fixed can be left up to the host VM. One could imagine a representation where they have a length field followed by that number of bytes. In practice, these are always going to be word-sized pointers though.

> Reply: I added this constraint because local variables/global variables may be allocated in a linear memory in the embedder, and hence would require that.

> Reply: Because an except_ref can hold any type of exception, it should have a common fixed size.

> Reply: I think most reasonable implementations will have

### Try and catch blocks

A _try block_ defines a list of instructions that may need to process exceptions
and/or clean up state when an exception is thrown. Like other higher-level
constructs, a try block begins with a `try` instruction, and ends with an `end`
instruction. That is, a try block is a sequence of instructions having the
following form:

```
try block_type
instruction*
catch
instruction*
end
```

A try block ends with a `catch block` that is defined by the list of
instructions after the `catch` instruction.

> Review comment: This design separates catching exceptions from switching on (and unpacking) them. That is okay as long as the

> Reply: Actually, I expected layer 2 to provide a clean static model like that of the original proposal. In that context, no garbage collection is needed because the structure defines the lifetime of the exceptions, and hence it can clean up when each catch is exited.

> Reply: @rossberg I'm not sure if I understand what you mean by the efficiency and the maximum cost of a catch-all. I think the reason we separated those two is because C++ is going to only use @KarlSchimpf So, you mean, we can use

> Reply: @aheejin, yes, but C++ is an extremely degenerate case in that regard. No other language works that way. Usually you want to be able to jump to the responsible handler directly, because that is cheaper than executing various intermediate handlers only to discover that they are not relevant for the given exception.

> Reply: What I was claim wasn't on the Adding these catch clauses to the try block I assumed would be after MVP.

Try blocks, like control-flow blocks, have a _block type_. The block type of a
try block defines the values yielded by the evaluation of the try block when
either no exception is thrown, or the exception is successfully caught by the
catch block.

In the initial implementation, try blocks may only yield 0 or 1 values.

> Review comment: It sounds like we may get multi-value blocks pretty soon, so why don't we just do whatever the rest of Wasm does?

> Reply: Added #36 to discuss this relationship.

### Exception creation

An `except` instruction has a single immediate argument, an exception tag. The
corresponding tagged exception type is used to define the data fields of the
created exception. The values for the data fields must be on top of the operand
stack, and must correspond to the exception's type signature. These values are
popped off the stack and an instance of the exception is then created. A
reference to the created exception is then pushed onto the stack.

> Review comment: Separating out exception value creation from throwing is likely to make throw less efficient, because the compiler does not statically know what it is throwing. In the previous design, the compiler statically knew where to read off the handler to jump to (each exn could have a thread-local handler stack attached to it); with the separation it probably requires a hash table lookup. The separation will also cause problems with future extensions. In particular, when you consider resumption, the type of

> Reply: I assumed (like return-call) that a throw that also creates could be added as new operator after MVP.

> Reply: Since the combined operation is the simpler and more extensible choice (e.g. avoids the need for an exn_ref type) it seems more natural to go the other way round. Is there an immediate use case for the separated instructions?

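The stack effect of `except` described in this section can be sketched as a tiny interpreter step. The names `ExceptType`, `ExceptRef`, and `exec_except` are hypothetical, and Python types stand in for WebAssembly value types; this is an illustration of the pop-values, create, push-reference sequence, not the proposal's definition.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExceptType:
    """Hypothetical tagged exception type: a name plus its type signature."""
    name: str
    signature: Tuple[type, ...]


@dataclass(frozen=True)  # frozen: exceptions are immutable once created
class ExceptRef:
    tag: ExceptType
    values: Tuple[object, ...]


def exec_except(stack: List[object], tag: ExceptType) -> None:
    """Pop the signature's values, create the exception, push a reference."""
    count = len(tag.signature)
    if len(stack) < count:
        raise ValueError("operand stack underflow")
    start = len(stack) - count
    values = tuple(stack[start:])
    for value, expected in zip(values, tag.signature):
        if not isinstance(value, expected):
            raise TypeError(f"expected {expected.__name__}, got {value!r}")
    del stack[start:]                      # pop the data fields
    stack.append(ExceptRef(tag, values))   # push the exception reference


overflow = ExceptType("overflow", (int, float))
operands: List[object] = [7, 1, 2.5]
exec_except(operands, overflow)
print(len(operands), operands[0], operands[1].values)  # 2 7 (1, 2.5)
```
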
### Throws

The `throw` instruction throws the exception on top of the stack. The exception
is popped off the top of the stack before throwing.

> Review comment: This design seems to eliminate

> Reply: I assumed that the stack trace was added when created. Hence, throw and rethrow are the same.

> Reply: Yes, but as I argued in another comment, that bakes in assumptions about the exception mechanism that do not generally hold. It also is plain incompatible with extensions like resumption, where (1) the continuation would have to be captured at the throw point, not the creation point, and (2) rethrow would not capture a continuation, just pass it on.

When an exception is thrown, the host VM searches for the nearest enclosing try
block body that execution is in. That try block is called the _catching_ try
block.

If the throw appears within the body of a try block, it is the catching try
block.

If a throw occurs within a function body, and it doesn't appear inside the body
of a try block, the throw continues up the call stack until it is in the body of
an enclosing try block, or the call stack is exhausted. If the call stack is
exhausted, the host VM defines how to handle uncaught exceptions. Otherwise, the
found enclosing try block is the catching try block.

A throw inside the body of a catch block is never caught by the corresponding
try block of the catch block, since instructions in the body of the catch block
are not in the body of the try block.

Once a catching try block is found for the throw, the operand stack is popped back
to the size the operand stack had when the try block was entered, and then
the caught exception is pushed onto the stack.

If control is transferred to the body of a catch block, and the last instruction
in the body is executed, control then exits the try block.

If the selected catch block does not throw an exception, it must yield the
value(s) expected by the corresponding catching try block. This includes popping
the caught exception.

Note that a caught exception can be rethrown using the `throw` instruction.

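The search for a catching try block can be sketched as follows. This is a simplified model under stated assumptions: `Frame`, `TryBlock`, and this `throw` function are hypothetical, each frame records its active try blocks innermost last, and a block whose catch body is already running is skipped because a throw there is outside the try body.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TryBlock:
    label: str
    entry_height: int        # operand stack height when the try block was entered
    in_catch: bool = False   # True once control is inside the catch body


@dataclass
class Frame:
    name: str
    operands: List[object] = field(default_factory=list)
    try_blocks: List[TryBlock] = field(default_factory=list)  # innermost last


def throw(call_stack: List[Frame], exn: object) -> Optional[str]:
    """Return the label of the catching try block, or None if uncaught."""
    while call_stack:
        frame = call_stack[-1]
        while frame.try_blocks:
            block = frame.try_blocks[-1]
            if block.in_catch:
                # A throw inside a catch body is not caught by its own try block.
                frame.try_blocks.pop()
                continue
            del frame.operands[block.entry_height:]  # restore height at try entry
            frame.operands.append(exn)               # push the caught exception
            block.in_catch = True
            return block.label
        call_stack.pop()  # no try block here: pop the call frame and keep going
    return None           # call stack exhausted: the host decides what happens


main = Frame("main", operands=[10, 20], try_blocks=[TryBlock("outer", entry_height=1)])
callee = Frame("callee", operands=[1, 2, 3])
stack = [main, callee]

caught_by = throw(stack, "boom")
print(caught_by, main.operands)   # outer [10, 'boom']
rethrown = throw(stack, "boom")   # rethrow from inside the catch body
print(rethrown, len(stack))       # None 0
```

The second call shows the rethrow case: the only try block is already in its catch body, so the exception leaves `main` uncaught.
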
### Exception data extraction | ||
|
||
The `if_except block` defines a conditional query of the exception on top of the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This adds another type of control-flow block specifically for this purpose, which has an implementation cost. What alternatives have been considered, e.g. an instruction that loads the type or Nth element from an exception object, such that a the existing if block can be used to inspect it? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that there is an implementation cost to the current proposal. On the other hand, there In the current vacuum of realistic examples, it is hard to say which is better, so one was chosen, based on previous discussions in the original proposal (in Exceptions.md). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the main advantage of the approach we've presented here is the guarantees provided by the type system. Let's say we had a
The problem is that at Instead, the combined match and unpack that we have with |
||
stack. The exception is not popped when queried. The if_except block has two
subblocks, the `then` and `else` subblocks, like those of an `if` block. The then
block is the sequence of instructions following the `if_except` instruction. The
else block is optional and, if present, begins with the `else` instruction. The
scope of the if_except block extends from the `if_except` instruction to the
corresponding `end` instruction.
|
||
That is, an if_except block takes one of two forms:
|
||
```
if_except block_type except_index
  Instruction*
end

if_except block_type except_index
  Instruction*
else
  Instruction*
end
```
|
||
The conditional query of an exception succeeds when the exception on the top of | ||
the stack is an instance of the corresponding tagged exception type (defined by | ||
`except_index`). | ||
|
||
If the query succeeds, the data values (associated with the type signature of | ||
the exception class) are extracted and pushed onto the stack, and control | ||
transfers to the instructions in the then block. | ||
|
||
If the query fails, control either enters the else block or, if there is no
else block, transfers to the end of the if_except block.
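
The query semantics above can be modelled with a small sketch. This is only an illustration of one reading of the text, not part of the proposal: `ExceptionRef` and `if_except` are hypothetical Python names, and the tag comparison is simplified to comparing exception indices directly.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class ExceptionRef:
    """Hypothetical stand-in for an except_ref value."""
    except_index: int          # index the exception was created with
    values: Tuple[Any, ...]    # data fields matching the type signature


def if_except(stack: List[Any], except_index: int) -> str:
    """Query the exception on top of the stack; return the subblock to run."""
    exn = stack[-1]            # peek only: the exception is not popped
    if isinstance(exn, ExceptionRef) and exn.except_index == except_index:
        stack.extend(exn.values)   # push the extracted data values
        return "then"
    return "else"              # or fall through to `end` if there is no else


# Assumed example: an exception with index 1 carrying two data values.
stack = [ExceptionRef(except_index=1, values=(42, 7))]
taken = if_except(stack, 1)
```

In this model a failed query leaves the stack untouched, which matches the statement that the exception is not popped when queried.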
|
||
### Debugging | ||
|
||
Earlier discussion implied that when an exception is thrown, the runtime pops
the operand stack across function calls until a corresponding, enclosing try
block is found. An implementation need not actually do this. Instead, it may
first search up the call stack to see whether there is an enclosing try. If
none is found, it could terminate the thread at the point of the throw. This
would allow better debugging, since the call stack at the throw point is still
there to query.
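
A minimal sketch of the search-then-unwind strategy described above, under the assumption that each call-stack frame records whether it sits inside a try block. `Frame`, `find_handler`, and `throw` are hypothetical names used for illustration only; a real host would work on its native stack.

```python
from typing import List, Optional


class Frame:
    """Hypothetical call-stack frame; `in_try` marks an enclosing try block."""

    def __init__(self, name: str, in_try: bool = False):
        self.name = name
        self.in_try = in_try


def find_handler(call_stack: List[Frame]) -> Optional[int]:
    """Search from the innermost frame outward without unwinding anything."""
    for depth in range(len(call_stack) - 1, -1, -1):
        if call_stack[depth].in_try:
            return depth
    return None


def throw(call_stack: List[Frame]) -> str:
    handler = find_handler(call_stack)
    if handler is None:
        # No enclosing try: stop at the throw point and leave the stack intact.
        return "terminated at throw; frames: " + " > ".join(f.name for f in call_stack)
    # A handler exists: only now discard the frames above it.
    del call_stack[handler + 1:]
    return "caught in " + call_stack[handler].name
```

The point of the design is that the uncaught case never mutates the call stack, so a debugger can still inspect every frame.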

Alternatively, the embedder can capture a stack trace at the point the exception object is created, which I think is roughly what JavaScript will do. One question is whether stack traces should be captured at the point of creating the exception, or at the point of throwing the exception. I think this is irrelevant to the core spec, since stack traces are provided purely by the embedder and we provide no way to access them in WebAssembly. It's worth discussing in the context of the JS embedding though.

Discuss in #37

I don't disagree with this. I only wanted to state that it is up to the host on how to best handle this, and, depending on whether it is compiled for debugging or release, do slightly different implementations.

Sounds good.

If we are going to do the two-phase unwinding, this may have to be incorporated into wasm spec, because the toolchain (more precisely, the library like libcxxabi) has to support this behavior. For example, for x86 EH, in the first phase, libcxxabi does not actually unwind the stack but just walks up the stack to see if there is a matching handler.
If a matching handler is found, in the second phase it actually starts unwinding the stack until it reaches the handler found. If no matching handler is found, libcxxabi does not unwind the stack but just crashes at the point of throwing, leaving the whole stack trace intact. I'm not suggesting we do this now, and for Layer 1 I think we don't need to do this anyway. I'm just suggesting if we do this later, this can't be just left to the discretion of the host VM but explicitly needs to be specified in the spec.
||
|
||
## Changes to the text format

This section describes changes to the
[instruction syntax document](https://github.com/WebAssembly/spec/blob/master/document/core/instructions.rst). | ||
|
||
### New instructions | ||
|
||
The following rules are added to *instructions*: | ||
|
||
```
try resulttype instructions* catch instructions* end |
except except_index |
throw |
if_except resulttype except_index then instructions* end |
if_except resulttype except_index then instructions* else instructions* end
```
|
||
Like the `block`, `loop`, and `if` instructions, the `try` and `if_except`
instructions are *structured* control flow instructions, and can be
labeled. This allows branch instructions to exit `try` and `if_except` blocks.
|
||
The `except_index` of the `except` and `if_except` instructions defines the
exception type to create/extract from. See [exception index
space](#exception-index-space) for further clarification of exception tags.
|
||
## Changes to the Modules document

This section describes changes to the
[Modules document](https://github.com/WebAssembly/design/blob/master/Modules.md). | ||
|
||
### Exception index space | ||
|
||
The _exception index space_ indexes all imported and internally-defined | ||
exceptions, assigning monotonically-increasing indices based on the order | ||
defined in the import and exception sections. Thus, the index space starts at | ||
zero with imported exceptions followed by internally-defined exceptions in | ||
the [exception section](#exception-section). | ||
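
As an illustration of the ordering rule above, the following sketch assigns indices to imported exceptions first and module-defined exceptions second. The function name and the exception names are hypothetical; real modules identify exceptions by position, not by string.

```python
from typing import Dict, List


def build_exception_index_space(imported: List[str], defined: List[str]) -> Dict[str, int]:
    """Imports take the low indices; module-defined exceptions follow."""
    return {name: index for index, name in enumerate(imported + defined)}


# Assumed example: one imported exception and two defined in the module.
space = build_exception_index_space(
    imported=["env.io_error"],
    defined=["cpp_exception", "out_of_bounds"],
)
```

With these inputs, `env.io_error` gets index 0 and the two module-defined exceptions get indices 1 and 2, in the order they are listed.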
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No, but they're similar. The exception tag is an opaque identifier for the type of the exception. The exception index points to an entry in the exception table, which has its own tag.
By wasm compiler, do you mean something like the C++ to Wasm compiler (as opposed to the Wasm to Native Code VM)? If so, then the linker seems like a reasonable place to reconcile all the exception indices. Ultimately, the notion of "the C++ exception type" does not really exist in the Wasm spec, but is a convention used by the tools. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Then an |
||
|
||
## Changes to the binary model | ||
binery model -> binary encoding
||
|
||
This section describes changes in | ||
the | ||
[binary encoding design document](https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md). | ||
|
||
|
||
### Data Types | ||
|
||
#### except_ref | ||
|
||
An `except_ref` is a reference to an instance of an exception. Its size is
fixed, but unknown to WebAssembly (the host defines the size in bytes).
|
||
### Language Types | ||
|
||
| Opcode | Type constructor | | ||
|--------|------------------| | ||
| -0x41 | `except_ref` | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd make this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, anyfunc will become a value type as well eventually. I think it makes sense to put all the reference types close together. |
||
|
||
#### value_type | ||
|
||
A `varint7` indicating a `value type` is extended to include `except_ref` as
encoded above.
|
||
#### Other Types | ||
|
||
##### except_type | ||
|
||
An exception is described by its exception type signature, which corresponds to | ||
the data fields of the exception. | ||
|
||
| Field | Type | Description | | ||
|-------|------|-------------| | ||
| `count` | `varuint32` | The number of types in the signature | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To allow for future extension (esp resumption), we should change this in two ways:
(I apologise that I steered us away from 1 before, which is more similar to what @KarlSchimpf has suggested long ago. I didn't consider resumption at the time.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with both comments. |
||
| `type` | `value_type*` | The type of each element in the signature | | ||
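
A minimal sketch of how an `except_type` could be serialized under the layout above: a `varuint32` count followed by one `value_type` byte per field. The helper names are hypothetical. The numeric type codes are those of the existing binary encoding document; `except_ref` is left out because its code is still being discussed in this review.

```python
from typing import List


def encode_varuint32(value: int) -> bytes:
    """Unsigned LEB128 encoding of a 32-bit value."""
    if not 0 <= value < 2**32:
        raise ValueError("varuint32 out of range")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Single-byte value_type codes (varint7 -0x01 .. -0x04).
VALUE_TYPES = {"i32": 0x7F, "i64": 0x7E, "f32": 0x7D, "f64": 0x7C}


def encode_except_type(types: List[str]) -> bytes:
    """Encode `count` followed by the `type` entries of the signature."""
    return encode_varuint32(len(types)) + bytes(VALUE_TYPES[t] for t in types)


# Assumed example: an exception carrying an i32 and an i64.
encoded = encode_except_type(["i32", "i64"])
```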
|
||
|
||
##### external_kind | ||
|
||
A single-byte unsigned integer indicating the kind of definition being imported | ||
or defined: | ||
|
||
* `0` indicating a `Function` [import](Modules.md#imports) or [definition](Modules.md#function-and-code-sections)
* `1` indicating a `Table` [import](Modules.md#imports) or [definition](Modules.md#table-section)
* `2` indicating a `Memory` [import](Modules.md#imports) or [definition](Modules.md#linear-memory-section)
* `3` indicating a `Global` [import](Modules.md#imports) or [definition](Modules.md#global-section)
* `4` indicating an `Exception` [import](#import-section) or [definition](#exception-section)
|
||
### Module structure | ||
|
||
#### High-level structure | ||
|
||
A new `exception` section is introduced and is named `exception`. If included, | ||
it must appear between the `Export` and `Start` sections of the module. | ||

Why after export? Since exceptions themselves can be exported it would be natural that it appears before the export section.

Good point. They should be moved before exports, and maybe before imports.

Why before imports? ;) That would screw up the order of the index space.

My bad. I didn't realize that a section id number was defined for an exception section.
||
|
||
|
||
##### Exception section | ||
|
||
The `exception` section is the named section 'exception'. The exception section | ||
declares exception types using exception type signatures. | ||
|
||
| Field | Type | Description |
|-------|------|-------------|
| `count` | `varuint32` | The number of exceptions to follow |
| `sig` | `except_type*` | The type signature of the data fields for each exception |
|
||
|
||
##### Import section | ||
|
||
The import section is extended to include exception types by extending an | ||
`import_entry` as follows: | ||
|
||
If the `kind` is `Exception`: | ||
|
||
| Field | Type | Description | | ||
|-------|------|-------------| | ||
| `sig` | `except_type` | the type signature of the exception | | ||
|
||
##### Export section | ||
|
||
The export section is extended to include exception types by extending an | ||
`export_entry` as follows: | ||
|
||
If the `kind` is `Exception`, then the `index` is into the corresponding | ||
exception in the [exception index space](#exception-index-space). | ||
|
||
|
||
##### Name section | ||
|
||
The set of known values for `name_type` of a name section is extended as | ||
follows: | ||
|
||
| Name Type | Code | Description | | ||
| --------- | ---- | ----------- | | ||
| [Function](#function-names) | `1` | Assigns names to functions | | ||
| [Local](#local-names) | `2` | Assigns names to locals in functions | | ||
| [Exception](#exception-names) | `3` | Assigns names to exception types | | ||
|
||
###### Exception names | ||
|
||
The exception names subsection is a `name_map` which assigns names to a subset
of the _exception_ indices. It is used for both imported and module-defined
exceptions.

I can be mistaken, but can there be any debug info attached to not an actual exception object but an exception tag?

You could attach a name to an exception tag, which could count as very basic debug info, but in general I'd say no. Runtime info like a stack trace can only be attached to an exception object.

Then what can the capability of attaching a name to an exception index be used for?

It'd be similar to attaching a name to a function. Without names, Wasm call stacks basically look like "wasm_function_0, wasm_function_1, wasm_function_1, ...." In the same way, names on exceptions would let devtools say something like "Unhandled exception: foo" rather than "Unhandled Exception: my_module.wasm:0" or worse, "Unhandled exception: <wasm exception 0x4ea2df>".

What you describe is we attach debug info to exception objects. But if we assign names to exception indices, that doesn't mean individual exception objects can have separate names, no?

Correct, this would just be a name for the exception type, not the individual object.
||
|
||
### Control flow operators | ||
|
||
The control flow operators are extended to define try blocks, catch blocks, | ||
throws, and rethrows as follows: | ||
|
||
| Name | Opcode | Immediates | Description | | ||
| ---- | ---- | ---- | ---- | | ||
| `try` | `0x06` | sig : `block_type` | begins a block which can handle thrown exceptions | | ||
| `catch` | `0x07` | | begins the catch block of the try block | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So this is supposed to be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed with @aheejin. OTOH, we may not even need to distinguish the two. Instead, catch has a list of exception ids that defines what it catches. In the degenerate case that it is empty it is a catch-all. In the MVP, we could then require it to be empty (although I don't necessarily think we should). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So to make sure I understand the idea, you mean have something like:
I kind of like that idea (although we'd probably want to require disjoint exception indices for each catch clause, and aliasing might be a problem). Or did you mean have a single catch clause overall? The reason for allowing only catch all for MVP is that our main consumer, C++, doesn't really have any use for non-catch_all. I think it makes sense to not spec and implement a feature until we have users for it. Are there other languages that are interested in being among the first exception users? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I meant still just a single exception clause, but annotated with an (optional) list of exception ids. That should give us the best of both worlds: simple instruction structure and efficient implementation. Why does C++ not have use for specific catch? I would assume that any catch that is not a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we'd still want to be able to associate different code with different catch clauses. For example, say my language didn't follow the one exception type per language convention, but instead had Wasm exceptions for things like The issue with C++ and specific catch was discussed some in #31. @aheejin can probably explain the issues better but I think it comes down to a combination of LLVM's exception handling being very unstructured, and needing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that a combined catch/match makes certain optimisations easier, but I'm not sure how big a deal it is to do a data flow analysis instead. OTOH, the separation allows for more flexible control flow, which I thought was the motivation? 
Subtyping on C++ exns seems like an orthogonal issue -- they'd still all have the same signle exn tag, don't they? But I can see how the destructor business makes this mostly a moot point in C++. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The goal is to get both. Keep the dynamic exception matching for cases where you need to more flexible control flow, but allow the more efficient static version when it's sufficient. I'm not sure how valuable this is though, and the complexity is significantly higher to support both in the spec. As you point out, a dataflow analysis would work, and I expect this would be a local analysis making it cheap to do (you wouldn't have to examine the whole program before you start compiling). I see what you mean about C++ exceptions. If there are no destructors, then there's no reason to handle JavaScript exceptions at all unless there's a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the idea of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can't think of any instructions that currently do this, but there's nothing fundamentally impossible about it. Rather than encoding it as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See |
||
| `throw` | `0x08` | | Throws the exception on top of the stack |
| `except` | `0x09` | tag : `varuint32` | Creates an exception defined by the exception tag and pushes a reference to it on the stack |
| `if_except` | `0x0a` | sig : `block_type`, tag : `varuint32` | Begins exception data extraction if the exception on the stack was created using the corresponding exception tag |
|
||
The *sig* fields of `block`, `if`, `try` and `if_except` operators are block | ||
signatures which describe their use of the operand stack. |
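
To make the table concrete, here is a hedged sketch that emits the bytes for a few of these operators, using the opcodes from this draft. The helper names are hypothetical, and `0x40` (the empty block type) and `0x0b` (`end`) come from the existing binary encoding, not from this section.

```python
def encode_varuint32(value: int) -> bytes:
    """Unsigned LEB128 encoding of a 32-bit value."""
    if not 0 <= value < 2**32:
        raise ValueError("varuint32 out of range")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Opcodes as listed in the draft table above.
OP_THROW, OP_EXCEPT, OP_IF_EXCEPT = 0x08, 0x09, 0x0A
OP_END = 0x0B             # existing `end` opcode
EMPTY_BLOCK_TYPE = 0x40   # existing encoding of the empty block signature


def encode_except(tag: int) -> bytes:
    """`except tag`: opcode followed by the varuint32 tag immediate."""
    return bytes([OP_EXCEPT]) + encode_varuint32(tag)


def encode_if_except(block_type: int, tag: int) -> bytes:
    """`if_except sig tag`: opcode, block signature byte, then the tag."""
    return bytes([OP_IF_EXCEPT, block_type]) + encode_varuint32(tag)


# Assumed example: create exception 2 and throw it.
create_and_throw = encode_except(2) + bytes([OP_THROW])
# Assumed example: an if_except on exception 2 with an empty then block.
empty_query = encode_if_except(EMPTY_BLOCK_TYPE, 2) + bytes([OP_END])
```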
Is this meant to supersede the existing exceptions proposal?
The relation between the two docs is not clear to me either. Can this be clarified somehow?