Make WebAssembly more like assembly #299
This requires non-nullary macros to optimize. We haven't proven that we can even do those effectively without a performance hit, nor have we proven that they will eliminate all the size overhead in cases like this. If statements are common enough that I'd be concerned about how much performance we'd be leaving on the table if we replace the if-statement AST node with this. I find this kinda gross but I'd be OK with it as an alternative alongside the if-statement node. I would prefer that we not merge the block and loop concepts. I think it would be better to keep the concept of a |
For the perspective from which the current control constructs may also be considered ugly, here's an illustration of how the four simple and orthogonal control transfers proposed here, {conditional,unconditional} branch {forward,backward}, map onto the current opcode set:
|
Is it fair to say that the benefits here are
and the downsides are
? |
I'm not sure how this avoids the relooper algorithm. E.g. even something
|
@kripken - benefits also include:
@titzer - It requires knowing where the loops are, and it requires that loops be single-entry. It does not require dominator trees. The algorithm roughly goes like this: Order the blocks in RPO. Identify the loops, and assign the backedges conditional and unconditional branches as appropriate. All remaining CFG edges are now forward branches. The scope for each branch can be placed at the beginning of the innermost loop which contains the branch destination. If multiple scopes begin at the same place, sort them in reverse order of their destinations in the RPO. |
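The first steps of the algorithm described above (order the blocks in RPO, then separate backedges from forward branches) can be sketched as follows. This is an illustrative sketch only, not code from the proposal: the `cfg` dictionary layout and the function names are hypothetical, and it assumes a reducible CFG with single-entry loops, in which case any edge that points to an equal or earlier RPO position is a loop backedge. Scope placement is not covered.

```python
def reverse_postorder(cfg, entry):
    """Return the blocks reachable from `entry` in reverse postorder (RPO)."""
    seen, postorder = set(), []

    def visit(block):
        seen.add(block)
        for succ in cfg[block]:
            if succ not in seen:
                visit(succ)
        postorder.append(block)

    visit(entry)
    return postorder[::-1]


def classify_edges(cfg, entry):
    """Split CFG edges into forward and backward branches by RPO position."""
    order = reverse_postorder(cfg, entry)
    index = {block: i for i, block in enumerate(order)}
    forward, backward = [], []
    for src in order:
        for dst in cfg[src]:
            # Assumption: the CFG is reducible, so an edge to an equal or
            # earlier RPO position is a loop backedge.
            if index[dst] <= index[src]:
                backward.append((src, dst))
            else:
                forward.append((src, dst))
    return order, forward, backward


# Hypothetical CFG: entry -> header; header -> body | exit; body -> header.
cfg = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}
order, forward, backward = classify_edges(cfg, "entry")
```

Here the only backward branch is `body -> header`; every other edge becomes a forward branch.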
@kg This non-nullary macro compression problem is potentially easier than the general case; we could create predefined macros for things like diamonds and triangles, rather than requiring a tool to automatically discover the macros on its own. |
I don't understand the debuggability improvements. Are you saying it's more obvious where the next step in the debugger will go, if there is a goto-like thing instead of an if? Surely an if is better? In your examples, you put everything on one line. I agree that such code is hard to debug, but when code is properly pretty-printed, ifs, loops, continues and breaks are surely the simplest thing for people to debug with? It's what they write in the source code, after all... |
I'm inclined to ignore binary encoding size considerations at this stage for this particular issue and focus on which ops have the ideal semantics for compilation to/from wasm, debugging, etc; at least in polyfill-prototype-1, bytes spent on control flow represent only about ~12% of total serialized bytes and I'm bullish on non-nullary macros anyhow. |
For what it's worth, this maps more closely with our IR format so it makes me happy. |
I think this is an interesting compromise between
Overall, I agree that structured control flow has downsides. The main one is that it requires more work on the compiler emitting wasm. I do see that as a worthwhile sacrifice, though - ensuring a smaller download, better debuggability, etc. are all more important than making things a little easier for compilers, especially since we are even going all the way and writing a full compiler ourselves that people can use (or take code from). And while plain |
@kripken The first-order reasons to do structured control flow have to do with engine impl (e.g.), binary size and the polyfill. The lower-level control flow primitives proposed here preserve all those benefits. As I've argued in other bugs, I don't think we should allow the text format to influence core semantics; providing a nice pleasant source-code-reading experience in devtools should be the job of improved source maps. If you're stepping through raw wasm, you're going to be having an assembly experience anyways (loads from pointers + offsets) so I don't think the presence of IIUC, this proposal is just a slight tweak to what we currently have based on the realization that |
I agree with most of your first paragraph, but I think we just diverge on the issue of code size. To me, that this increases code size in a way that must be offset elsewhere is a downside. I guess to you, that is acceptable, as you said earlier. I just don't see making compiler's lives slightly easier in return for an increase in code size as worthwhile, given the "easier" here is so small - it doesn't even enable us to avoid the relooper (which, if that were possible, is a larger benefit, and |
Actually, talking about this earlier with @sunfishcode, it seemed like code size could go either way: if the new relooper strategy was generating a ton of That all being said, I'm still not 100% sure these opcodes are really so much better from a semantic perspective; definitely interested to hear more on this line of discussion. |
Yeah, part of my objection to this proposal is that the upside isn't particularly clear to me and they feel like an awkward midpoint between fully structured control flow (if, loops, etc) and gotos. As a compiler author I'm not sure how I would benefit from this compared to either alternative. If we're concerned with optimizing the relooper, shouldn't we just design one or more opcodes for that? Then we don't sacrifice the space efficiency/decode complexity of the (vastly more common) traditional ifs/loops and we don't compromise reloopered code either. Separate opcodes also make it easier for a runtime to identify 'complex' functions which I think is valuable in terms of simpler validation (plus, interpreters/naive compilers/etc will have an easier time). Also, to be clear, when I say 'performance hit' for non-nullary macros, I mean both file size and decode efficiency. We've been very concerned about decode performance in the past and something like this could increase decode/validation cost to a not-insignificant degree, since loops and branches are so common. So that's part of why I'm wary. |
I like the simplicity of this proposal - and I especially like it more than the previous text, because the previous text does not say that you can break or continue from a nested loop to an outer loop. But it sounds like we can't get a consensus without more prototyping & data. Possible compromise: predefine non-nullary macros |
@kg:
I sketched out an algorithm for converting a CFG to structured form above, and it's simpler than the Relooper algorithm. I didn't cover how to reduce multiple-entry loops or deal with switches, but I believe it's straight-forward (ignoring the need for indirect branching, which neither approach handles well).
On space efficiency, there is an assumption of a non-nullary macro layer, because it's useful for other things as well. If this is not true, the outlook would be different. The proposal here reduces decode complexity because it makes the control constructs that must be handled much simpler. I expect interpreters and naive compilers to be among the beneficiaries of this proposal. I'm not clear on the meaning of identifying 'complex' functions.
It's true that I don't have data here, and won't be able to until a lot more infrastructure is in place. However, another plausible perspective is that it's a historical accident that WebAssembly started with if-else and do-while as the baseline in the first place, and that perhaps WebAssembly should instead start with the simpler operators in this proposal, and add if-else and do-while later based on demonstrated need. I'm open to discussion.
There's no difference in expressiveness. We could translate from either form to the other losslessly. What specifically are you interested in? @qwertie: |
I'm not sure why we're spending opcodes on not-equal and all combinations
|
@sunfishcode I wasn't talking about expressiveness, just the subjective: is this a more natural set of primitives to generate or consume (which of course is what makes this a hard issue to come to any definitive conclusion on). |
@titzer It's a fair point, and I don't deny there's subjectivity involved. Not-equal operators improve readability while adding very little burden to producers or consumers. If-else and do-while improve readability in many cases but also harm it in some, and are a moderate additional burden to CFG-based producers and simple consumers. I think the subjective part of the question is how we want the language to feel. If WebAssembly is a machine language, rules like what |
I've been going back and forth on this very question in my own VM for uscript. I've implemented at least a dozen VM interpreters in the last few weeks and for the life of me can't decide between structured programming, goto, or something in between. Personally I would be sad if web assembly was hard to write by hand as a human. In my experiments, I have yet to see any real performance benefit from choosing a machine style over a higher-level expression-tree style with structured logic. In fact, sometimes the higher-level constructs are much faster, but that's likely because I'm writing a pure interpreter. As I understand it, web-assembly will often use AOT or JIT compilation in production which changes the costs dramatically. |
Hand-writing wasm applications is definitely not high on our list of concerns. As much as we might like to make those use cases nice, it's not on the list of priorities, so we can't optimize for it at the expense of anything else. Wasm modules will be produced by compilers. Naturally, we don't want to assume a single particular compiler. In some cases a wasm module will be produced from another wasm module via a tool, like one that generates instrumentation, adds validation code, obfuscates it, or strips unused information. At some point user-mode JITs will use a subset of the wasm module format and IR to generate code at runtime, as well, but that's not a direct concern right now. |
The more I think about this proposal, the more it seems like just adding
|
@titzer Indeed; This PR isn't about altering what's possible, or about seeking a minimal basis. It's mainly about how people think about and use wasm. |
607a1c8
to
90d7f7f
Compare
An observation made recently is that there are important use cases where it's desirable to process WebAssembly code in a streaming fashion, and insert code into the stream. For example, several of the JIT library use cases involve this. Branches with either absolute or relative offsets would make this awkward because they require patching if code moves around. The hybrid proposal here preserves the ability to have branches merely reference a nesting level, which avoids this problem. One can easily insert code without the need to update any branches. |
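To illustrate the point about nesting levels, here is a small sketch. The tuple-based instruction stream and the names `block`, `br`, `end`, and `branch_targets` are hypothetical and chosen only for this example; they are not the proposal's encoding. A branch stores how many enclosing blocks it exits, so inserting code between the branch and its destination leaves the branch instruction untouched.

```python
def branch_targets(stream):
    """Map each branch position to the position of the `end` it exits to."""
    open_blocks = []  # one list of pending branch positions per open block
    targets = {}
    for pos, instr in enumerate(stream):
        if instr[0] == "block":
            open_blocks.append([])
        elif instr[0] == "br":
            depth = instr[1]  # 0 means the innermost enclosing block
            open_blocks[-1 - depth].append(pos)
        elif instr[0] == "end":
            for br_pos in open_blocks.pop():
                targets[br_pos] = pos
    return targets


stream = [
    ("block",),
    ("block",),
    ("op", "a"),
    ("br", 1),      # exit the outer block
    ("end",),
    ("op", "b"),
    ("end",),
]

# Insert a hypothetical instrumentation op between the branch and its target.
patched = stream[:5] + [("op", "probe")] + stream[5:]

before = branch_targets(stream)
after = branch_targets(patched)
```

The destination moves from position 6 to position 7, yet `("br", 1)` is identical in both streams. With an absolute or relative offset, the branch itself would have needed patching.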
@JSStats I expect loop+switch will be rare in practice with the proposal in this PR. In particular, I don't foresee it becoming common to use as a general way to translate a CFG to wasm. If you show me code you think will need it, I can show you what we might do for it. If you are advocating instead for a CFG representation that would be used commonly, I consider that a separate purpose. |
I'm in favor of moving in this direction in general. In particular I always Taking this a step further (and probably outside the scope of this PR), I
|
I think we should avoid biased assumptions about the predominance of https://github.com/jashkenas/coffeescript/wiki/list-of-languages-that-compile-to-js For many of the compilers in this list (especially those further down the But to get there, we shouldn't make it harder than necessary to write Turning Wasm into a CFG format has minor benefits for CFG-based compilers. But in return, you would be making the life of a whole range of other, more /Andreas
|
There are advantages to structuring the control flow with "if" and "block" As for this concrete proposal, we've already seen that the conditional I have no problem with LLVM generating "if (cond) break" instead of doing So why do we need to remove expressive power right now?
|
Lowering if and if-else to the constructs in this proposal is not difficult, even for trivial single-pass compilers. There's not even any backpatching required. The proposal here follows a stack discipline; blocks can be discarded when the stack is popped. Join points are easily identifiable. AST-based SSA construction works with this proposal in the same way that it already works in JS engines with labeled break and continue. The proposal also clearly identifies loops (nice for optimizers and humans). Inlining still is syntactic substitution. People interested in general CFGs instead are encouraged to submit a separate proposal so we can discuss it separately. The proposal here does not remove expressive power. It also does not add any. It does: simplify the model of control transfers, and the relationship between a producer's control flow, wasm, and the control flow of an engine. From my experience with Emscripten and asm.js, the more times control flow is translated (source -> LLVM CFG -> asm.js AST -> OdinMonkey CFG -> machine code), the harder it is to follow correspondences through the system when working system-wide. Humans are important. Ideally, we should enable humans writing high-level language code to debug their code using high-level-language debuggers. Stepping down to wasm should ideally be for low-level concerns, where e.g. single-step debugging is a common activity, and it's best if every instruction does one thing. In the proposal here, the special Since
The list grows if we consider break vs continue, x vs !x, loop vs forever vs do_while, switch cases not having fallthrough or requiring default to go at the end, or combinations thereof. In contrast, the proposal here has:
The list grows if we consider block vs loop. I find arithmetic operators easier to reason about than control transfers because they have a more localized effect (the only non-local behavior is trapping). Consequently, I find greater appeal in reducing the control opcode set than the arithmetic opcode set. In this proposal, the control operators achieve a level of locality which is not far from that of the arithmetic operators. |
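The claim that if-else can be lowered in a single pass without backpatching can be sketched as below. Everything here is an assumption made for illustration: the tuple AST, the opcode names `block`, `br`, `br_if`, and `end`, and the tiny `run` interpreter are hypothetical and are not the proposal's syntax. Branches name a nesting depth, so the lowering never needs to know a target address.

```python
def lower(stmts, out):
    """Lower ("if", cond, then_body, else_body) nodes to blocks and branches."""
    for stmt in stmts:
        if isinstance(stmt, tuple) and stmt[0] == "if":
            _, cond, then_body, else_body = stmt
            out.append(("block",))                   # outer: join point after the if-else
            out.append(("block",))                   # inner: ends where the else arm starts
            out.append(("br_if", 0, ("not", cond)))  # condition false: skip the then arm
            lower(then_body, out)
            out.append(("br", 1))                    # then arm done: skip the else arm
            out.append(("end",))
            lower(else_body, out)
            out.append(("end",))
        else:
            out.append(("op", stmt))
    return out


def evaluate(cond, env):
    """Evaluate a condition: a variable name or ("not", cond)."""
    if isinstance(cond, tuple):
        return not evaluate(cond[1], env)
    return env[cond]


def run(code, env):
    """Execute lowered code and return the names of the ops that ran."""
    trace, pc = [], 0
    while pc < len(code):
        instr = code[pc]
        taken = instr[0] == "br" or (
            instr[0] == "br_if" and evaluate(instr[2], env))
        if taken:
            # Scan forward to the `end` that closes the block `depth` levels out.
            depth, nested = instr[1], 0
            while True:
                pc += 1
                if code[pc][0] == "block":
                    nested += 1
                elif code[pc][0] == "end":
                    if nested:
                        nested -= 1
                    elif depth:
                        depth -= 1
                    else:
                        break
        elif instr[0] == "op":
            trace.append(instr[1])
        pc += 1
    return trace


program = [("if", "x", ["a"], ["b"]), "c"]
lowered = lower(program, [])
```

Running `lowered` with `x` true executes `a` then `c`; with `x` false it executes `b` then `c`, matching the source-level if-else.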
Then why shouldn't I emit my |
Because looking at it system-wide (the context for this quote), your code will get lowered to a CFG-like form eventually anyway. |
Unless it's being decoded by a runtime that doesn't use a CFG, or the polyfill is converting it to javascript if statements, or it's being run by an interpreter, or being manipulated by developer tools... I totally get the premise here that LLVM happens to use a CFG, and OdinMonkey uses a CFG. That doesn't surprise me. Making it easy for LLVM and emscripten to encode a CFG also makes sense. I would be happy with some mechanisms being added to enable expressing CFGs, as a replacement for things like loop+switch. But I can't get on board with removing primitives like if statements. FWIW, most code JSIL encounters is best expressed as if statements and loop blocks. Occasionally it does need something CFG-like (code that used goto, mostly) and in that case I have to use loop-switch. So for that subset of code I'd love some sort of CFG mechanism, but if I had to get rid of if statements and loop blocks that would be a pretty steep price to pay. I wouldn't be surprised if the same was true for some other code generators. Does the GWT team want CFG? What about the Unity team? How about people authoring compilers for native languages like Rust, Go, etc? Does everyone want CFG and CFG only? I don't get the impression that we actually know this for a fact. |
I interpret it to be within the scope of the high-level goal to "Define a [...] format to serve as a compilation target". But it's nice for other low-level purposes too.
What I mean is, the combination will want to be recognized as if it were a distinct construct to avoid thinking about branch-past-a-branch.
Consider it as temporal locality; "do one thing and complete" vs "do one thing, stick around, then do something else later".
My proposal happens to be great for simple compilers that want to translate straight from WebAssembly to native assembly.
My proposal can be translated to javascript using the
People interested in interpreting wasm directly like my proposal.
Debuggers are simpler with my proposal. Injecting code into a wasm stream remains simple. Inlining remains syntactic substitution. Are there specific things you have in mind?
I agree, and the current design shares this limitation. I'm interested in ways we can address it, but it's a separate topic.
Yes, I actually do know that Unity, Rust, and Go would be ok with this. I also know that numerous other native compilers would be ok with this. I don't know about GWT. However, I'll recall my assertion above that lowering high-level constructs to the constructs in this proposal is not difficult (people do this quite a lot), even for trivial single-pass compilers, so I don't expect GWT would have difficulty. |
I feel like we're talking in circles here. I must have failed to communicate my concern here: My concern is not 'we can't implement this'; we're all programmers here, we know it's possible to implement this. My concern is that it makes things gross and awkward for the average user. Being able to translate it to javascript with One of the advantages of having an AST with nodes like Debugging asm.js applications is enough of a nightmare to begin with. Please, please, please, we should not make wasm even worse. |
Making wasm close to the source is definitely not a stated goal (high-level or otherwise) of wasm. The path to supporting source-level reasoning about wasm code is already in Tooling.md: debug info (improved source maps). The goal of wasm is to be a good compiler target and that is the context in which we should consider this proposal. In particular, one common argument we make is that there should be a relatively obvious mapping from wasm ops to machine code; I think |
I agree with @lukewagner that
This lets a producer provide additional information to the consumer by using the more constrained control flow operations, or to generate a CFG node when necessary or convenient. |
Dan Gohmann:
I agree that the issue isn't losing expressiveness. But it is losing Luke Wagner:
Being close to machine code is not a value in itself, though. When CPU |
You're right; I think I should have instead stated that the goal is predictability of good codegen. As already argued, |
When Sun designed Java, they were also freed from the constraints of direct hardware execution. They chose a CFG. Java's CFG causes some headaches (for VMs), but when Microsoft designed CIL they had the benefit of learning from Java. They also chose a CFG. Both Java bytecode and CIL are higher-level than WebAssembly, and yet it wasn't important to their creators to preserve source-level control constructs in their bytecodes. The typical human programmer does not see Java or CIL bytecode disassembled on their screen during normal development. Java and C# debuggers don't show users disassembled bytecode (by default); they show the source code. Obviously users sometimes do encounter bytecode, such as when porting or interfacing different languages. However, while people on the internet are not shy about complaining about aspects of Java bytecode or CIL that make porting new languages to those systems difficult, the lack of source-level-like control structures in the bytecode is not something that people mention. I am struggling to reconcile these circumstances with this conversation. |
I hope we can reach consensus on this point, because I think there is an I don't see a problem with accepting some "sugar" forms of control like But I think it's going backwards to remove the structured "if" and "loop"
|
Ok, so then could the resolution of this issue be: add |
It's noteworthy that under this resolution, the new opcodes would just be syntactic sugar, and don't require an extension to the core spec.
|
To dispose of the other changes in this PR:
We can continue discussion of 1 further in #322, and 5 in #310. 2 seems to have pretty strong opposition. Does anybody object to 3? What about 4 and 6? I think 4 is good. We should either do 6 or add |
While Java bytecode was designed to be interpreted, this doesn't appear to be the case for CIL (here, here). CIL even makes direct interpretation difficult for example by making simple instructions like
I'm very interested in supporting people working from high-level ASTs, and I expect they'll want many high-level features that WebAssembly isn't likely to support directly, which is why I'm promoting the idea of a JIT library. Among other higher-level features, such a library could also easily provide even higher-level control constructs like |
Agree more with @rossberg-chromium. For the JIT Library that
Yes please, that's what makes WebAssembly so interesting 👍 |
Closing and following up in #403. |
This replaces high-level control structures with low-level primitive operations. break is generalized to a "branch forward" and continue becomes a generalized "branch backward". Along with switch, this is sufficient to represent all control transfers, making specialized constructs redundant. This implements the idea at the end of #261.
This proposal also adds select operators, which serve the role of "conditional move" instructions in common ISAs.
For example, code like this:
into this (invented syntax):
This design is very close to goto, however unlike goto it inherently preserves the well-structured property, so backends can rely on it without verification.
This design would make it natural for CFG-based compilers to target WebAssembly without employing the Relooper algorithm. They'd still need to ensure that loops are single-entry, but given that, it is trivial to translate arbitrary control flow into the constructs proposed here.
This design neither depends on nor conflicts with statements-as-expressions. It works either way.
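As a rough illustration of what "conditional move" means for the select operators mentioned above, the sketch below models select as an ordinary function. This is an assumption about the intended semantics, not text from the proposal: both operands are evaluated before the condition picks one, so no control transfer is involved. The helper `side` is hypothetical and only records evaluation order.

```python
def select(cond, if_true, if_false):
    """Pick one of two already-evaluated operands; no branch is taken."""
    return if_true if cond else if_false


calls = []


def side(name, value):
    """Hypothetical helper: record that an operand was evaluated."""
    calls.append(name)
    return value


# Both operands are evaluated before select chooses between them,
# unlike an if-else, where only one arm would run.
result = select(True, side("a", 1), side("b", 2))
```

After this runs, `result` holds the first operand and `calls` shows that both operands were evaluated.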
Big-picture questions:
compression ought to be able to erase the difference. Is this a problem?
(but not essential) properties for both humans and compilers. Is it worth giving up the dedicated opcode to achieve a simpler opcode set? This is partially compensated for by adding select, though on the other hand select is technically redundant too.
Small-picture questions:
block and loop be merged?
break and break_if be merged? (and continue and continue_if)?