Expressionless encoding #669
Comments
Can this be summarized as moving from an expression AST to a register machine + prediction for register values? I can sort of see how prediction could help with the added entropy of a register-based bytecode (predict the next use, and if you are right, don't write the number; and running the same deterministic prediction on client and server guarantees the same stream of registers). However, that adds complexity to the spec. And I'm not sure how good prediction could be - I'd be surprised if it's right even half the time. The expression AST has a guarantee, on the other hand. Overall this seems like the kind of thing we should have considered a long time ago - not that we can't consider it now, but if we do, it means substantially delaying shipping. So this would require especially strong justification, I think.
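The prediction scheme sketched above (run the same deterministic predictor on both sides, and omit the register number on a hit) can be illustrated in a few lines. This is a toy, not the proposal's actual algorithm; the last-written heuristic and all names here are assumptions for illustration only:

```python
# Toy sketch of "predict the register; omit it on a hit". Both encoder
# and decoder run the same deterministic predictor, so when the
# prediction is right no register number needs to be transmitted.

HIT = "hit"  # stand-in for a dedicated "prediction hit" opcode/flag

def predict(last_written):
    # Guess that the next register read is the one most recently
    # written, as with single-use expression temporaries.
    return last_written

def encode(reads, writes):
    """reads/writes: per-instruction register traces, in order."""
    out, last = [], None
    for r, w in zip(reads, writes):
        out.append(HIT if r == predict(last) else r)
        last = w
    return out

def decode(stream, writes):
    reads, last = [], None
    for tok, w in zip(stream, writes):
        reads.append(predict(last) if tok == HIT else tok)
        last = w
    return reads

reads = [0, 1, 1, 2]    # register read by each instruction
writes = [1, 1, 2, 3]   # register written by each instruction
encoded = encode(reads, writes)          # [0, "hit", "hit", "hit"]
assert decode(encoded, writes) == reads  # round-trips exactly
```

Because both sides deterministically agree on the prediction, a hit costs only the specialized opcode itself, which is the source of the claimed size savings.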
I for one do not want an expressionless encoding. If there are some substantial technical advantages to an expressionless encoding, I wouldn't blame the CG for adopting it, but the ease of comprehension of Wasm is something I have come to really appreciate - not just as a format for pretty printing (as mentioned, a transform could produce an expression tree for that purpose) but as something I look forward to working with at a low level, because it is simple to comprehend in its "native" form. C has sometimes been called a "high level assembly language", but Wasm fits that description much better. It's just really nice.
@qwertie The 'expression' support that you get in wasm might not be much chop for 'ease of comprehension' anyway, and it is very limited in scope, which is why I have given up on it. To present it for 'ease of comprehension' I am having to first decompose it into a more primitive form and then rebuild expressions, so the current expressions are just burden and baggage, and they carry a good deal of implementation cost too.

I expect some producers might have an extra burden; might that be what you are concerned about? An algorithm for expanding an expression into statements that operate on registers is almost as simple as walking the AST, and is very natural with the post-order encoding. Could a relatively simple library that transforms from the expression format to the expressionless format address this? Doing so seems consistent with the goals of wasm, that is, moving complexity out of the consumer and into the producer.

I want a good 'high level assembly language' for wasm too, but the wasm code section encoding is not it and never will be. I have tried and tried; I see the barriers; it's not going to happen. Thankfully the ignored sections are part of wasm: we can build this language there, explore many different versions, and continue development largely independent of the wasm code section. It could include 'let' variables, multiple value support, discarding of unconsumed values, etc.
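The expansion algorithm described above ("almost as simple as walking the AST") can be sketched as a post-order walk that emits one register statement per operator. The tuple shapes, register naming, and function names are illustrative, not from any real wasm tool:

```python
# Sketch of expanding an expression AST into register statements via a
# post-order walk. Leaves are local variable names; interior nodes are
# (operator, operand...) tuples.
from itertools import count

def flatten(expr):
    out, fresh = [], count()
    def walk(node):
        if isinstance(node, str):   # a local variable reference
            return node
        op, *args = node            # e.g. ("i32.add", lhs, rhs)
        regs = [walk(a) for a in args]
        dst = f"$t{next(fresh)}"    # fresh register for the result
        out.append((op, dst, *regs))
        return dst
    walk(expr)
    return out

stmts = flatten(("i32.add", "$v1", ("i32.mul", "$v2", "$v3")))
# stmts == [("i32.mul", "$t0", "$v2", "$v3"),
#           ("i32.add", "$t1", "$v1", "$t0")]
```

The emitted order matches the post-order encoding, which is why the transform is cheap for a producer or an external library to perform.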
@kripken Yes, 'register machine + prediction for register values' is a good summary. Can you find an example where the predictor would not have a 100% hit rate for a wasm AST expression expanded to register instructions? I see a hit rate of around 50% for AngryBots for all variable accesses, not just those for expression temps, and around 25% of the instruction instances have a hit for all operands. These operators are just bundles of `get_local`/operator/`set_local`.

If I get time I'll implement a decoder in a web browser to give a practical example. Is the v8 wasm decoder the benchmark? Is the Firefox SM decoder competitive and useful for a demo?
No, I'm talking about the cognitive ease of comprehending an expression tree versus a flat instruction list.
What barriers? Barriers to what? (note that these are two separate questions.)
@qwertie I agree that expressions are better for the reader, but the wasm encoding is a binary code: the opcodes are not expressions, and with the post-order encoding they could be viewed as stack machine operations. There is even discussion of adding an operator along those lines.

I am not even sure the expressions help the producer. The llvm backend has to work hard to fit the expression pattern. With the register-based opcodes the encoding efficiency should be higher where a producer does not want to go to such lengths to shoehorn its output into restricted expression patterns.

The 'barriers' are getting the browser vendors on board and in agreement. My experience is extreme resistance to added complexity, and that there are a lot of differences of opinion. The expressionless encoding proposal removes complexity, and it fits a plan to develop higher level decoders externally to the wasm code encoding, where agreement is not necessary and different projects can explore different presentation languages.
Yes, you can get all the expression registers right, but each instruction's use of a register needs to either use a predicted register or a user-specified one. That's extra information that is not present in the AST approach. Or, if you find a way to encode things without that extra information, you'd be isomorphic to an AST anyhow. Overall, we did experiments early on (mentioned in the FAQ, etc.) on ASTs vs other approaches like a register machine, and ASTs were smaller.
I don't know, but my point was that a major change now would require redoing a lot of engineering that's already been done, which would significantly delay the ship date, whenever it is. I just don't think it's realistic to expect that to happen here, given that it's something that we've already carefully evaluated before. Unless we really missed something big back then, which is certainly possible, and if so we should certainly be willing to delay things - but I just think that is very unlikely.
@kripken So we agree that if the registers are allocated appropriately, then for code that fits the current AST expression pattern the prediction would be perfect. The proposal does allow an immediate reference to a local variable that might be a prediction miss, but that is a feature that does not need to be used.

The prior results comparing a register machine probably did not have the predictor, but even without the predictor the uncompressed sizes are similar! Without putting in some development work to optimize the different approaches, the results of comparisons might not be useful; for example, some opcodes accepting immediate constants were very helpful for the register machine.

The register machine seems to have a lot of advantages, and even if it were not as efficient at encoding it might be worth considering. Even if wasm were to keep the expression support, it can be argued that it needs the register opcode variations anyway to efficiently encode patterns that do not fit the restricted expression patterns: for example, the proposed multi-value support, which writes the results of call operators to local variables, and passing around multiple values, and code with multiple uses of definitions. But if wasm has these register opcodes then there seems to be a good argument that it does not even need the expression support, that it could be simplified. So wasm could ship with the AST, then add the new encodings, then deprecate the expression support. It does not appear a big engineering task if people really wanted to go down this path, and I have had a look at v8 and SM now.

I've demonstrated AngryBots converted into the register-machine format, though still using explicit operand indexes. If I can demonstrate a more compact encoding, a simpler and faster decoder, and better or simpler unoptimized code execution, then it would seem to make a good case that 'we really missed something big back then'. I can only try.
Worst case, if this proposal is not adopted: tools need to unpack/pack AST expressions as part of the encoding compression (they would want to do something similar anyway to exploit prediction); there is perhaps at most 30% worse encoding efficiency and slower decoding for code that does not fit the expression pattern well or does not want to bother; the runtime decoders are more complex; and the spec is more complex.
In case it helps with the ongoing discussion, I see a potentially significant problem with this expressionless encoding: the number of local variables generally increases, and this increases the storage and processing requirements of the SSA decoder. For example, the v8 SSA decoder maintains a fixed vector of the local variables for SSA analysis, and at each branch it splits this, creating a copy, so that when the branches merge the differences can be noted to assign Phis. It is easy to see that as the number of local variables increases, the copying and merging work is going to increase linearly.

From this perspective, the AST encoding can be seen to move the expression intermediate values out of the SSA environment and thus reduce the amount of work for the SSA decoding. With the post-order encoding, the AST can be seen as a stack machine, but the restrictions ensure that the values on the stack can never be split or merged in the control flow. Values on this stack at the entry to a block are not usable in the block (unless they were written to a local?) and are still there on exit, and they are discarded if there is a break out of the block. All values pushed within a block are popped on any exit from the block. These restrictions on this stack machine seem sufficient to ensure that the values on the stack need not be part of the SSA environment; would people agree? The exit values of a block, the value for a break or the fall-through, are part of the SSA environment from what I understand, and in v8 these seem to be handled as part of the environment too.

Shall explore the cost of the extra local variables; it might be mitigated by having effective stack pointers which limit the number of active local variables that need to be considered as part of the SSA environment. But it also suggests another look at the AST. I have not got my head around it all, but could we relax some of the AST stack restrictions while still ensuring that the stack values were not part of the SSA environment? For example, could it be possible to read any of the stack values rather than just popping a value? Any chance that an SSA expert could suggest the absolute minimum restrictions on a stack machine that would still ensure that the values on the stack were not part of the SSA environment? If some restrictions can be relaxed then could these be considered?
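The copy-and-merge cost described above can be illustrated with a toy environment model. This is a deliberate simplification, not v8's actual data structures; the point is only that both operations scale with the number of locals:

```python
# Toy model of an SSA locals environment: one slot per local, copied at
# every control-flow split and compared slot by slot at every merge, so
# the work grows linearly with the number of locals.

def split(env):
    return list(env)            # O(locals) copy at each branch

def merge(env_a, env_b):
    # Slots that differ across the merged paths need a Phi.
    return [a if a == b else ("phi", a, b) for a, b in zip(env_a, env_b)]

env = ["x0", "y0"]              # SSA values of locals 0 and 1
then_env, else_env = split(env), split(env)
then_env[0] = "x1"              # the 'then' arm writes local 0
merged = merge(then_env, else_env)
# merged == [("phi", "x1", "x0"), "y0"]
```

Expression temporaries kept on the restricted operand stack never enter this vector, which is the saving the AST encoding is credited with here.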
It appears best to continue making the case for multiple value expressions and effective block-scoped single-assignment variables. One of the objections to block-scoped local variables was that they would complicate the SSA local variable environment, and that seemed a show-stopper, but it seems entirely possible to eliminate local variables while still having effective single-assignment block-scoped variables (which is what expression intermediate values are). Getting past that blocker opens a range of alternative encodings and different trade-offs that might yield a simpler and faster SSA decoder and still an efficient encoding.
This issue is for discussion of the design and the pros and cons of an expressionless encoding. I'll update this opening comment with the current proposal and rationale.
The proposal:
- Operators write their results directly to local variables, so every operator is a statement (as `nop` already is). For example: `(set_local $r1 (i32.add (get_local $v1) (get_local $v2)))` becomes `(i32.add $r1 $v1 $v2)`.
- Opcode variants accept immediate constants, e.g. `(i32.add/lc $r1 $v1 10)`.
- `select` and `set_local` become type specific. Thus all local variable indexes are type specific and use a type-specific index space, which helps compression and simplifies a prediction algorithm by avoiding special consideration for these operators.
- The `if` operator might no longer be as useful and might be removed, or kept for efficiently encoding diamond control flow patterns, but it might be subsumed by an alternative, more efficient encoding of `block`. One suggestion is to replace `if` by `unless` ("Replace the `if` operator by an `unless` operator", #676).
- The `return` operator: the proposed `end` operator might want to accept arguments, with the `return` operator reading a local variable for each returned value.
- Prediction for `i32.add`, `i64.add`, `f32.add` and `f64.add` based on the last variable written.

Rationale:

- An expression converted to expressionless statements can have a perfect prediction hit rate: the local written by `set_local` would be predicted to be the next value read, so that pattern is optimized well, and changing the order of the arguments to the store operators would fit a simpler predictor.
- Each operator is in effect a bundle of `get_local`/operator/`set_local`; bundling the local variable indexes into the operator allows their type to be known when decoding, which supports using type-specific variable index spaces and avoids redundant types on immediate constants when they are also bundled.
- `block1` can push it a little further. Operators could be added to produce any number and order of values, but the options to consume the values are still very limited to single use: destructuring, pass-through parametric operators, result values, arguments to a call. A type system is needed to validate these operators. It seems like a lot of language development work and a lot of potential bike-shedding, and the product might become rather specific, and baggage from other perspectives and presentation forms.
- Constants could be kept in local variables, with a `get_local` for each constant use. If it helped runtime compilers then these might be declarable at the start of the function and read-only, which might allow a compiler to put them in a constant pool; that is, constant pool references sharing the index space with local variables.
- The encoding has been demonstrated without `if`, with just blocks and breaks, and without a predictor (which may not change the uncompressed size if just using a zero index for a hit). Using prediction and specialized opcodes for predicted operands reduces the file size to 80% of the original size.
- An opcode, `0xbb`, enables this alternative mode for operators, per function. The uncompressed file size is 113% of the original. This does not yet implement opcodes with immediate constants, which would improve the encoding efficiency, nor the prediction model.