-
Notifications
You must be signed in to change notification settings - Fork 695
Factor block_cascade out of tableswitch #480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Regular labels are only in scope inside the expression they name. Case labels are very different beasts, they are only known to the respective tableswitch statement enclosing them. Your suggestion would require reconciling these two notions somehow, and allow referencing expression labels from the outside. It is not clear what the rules for this should be without losing well-structuredness of control flow. Even if we found such rules, I'm pretty sure they'd be more complicated and costly to check than the current design, which guarantees well-structuredness by construction. |
That difference between a tableswitch's cases and a cascade of blocks is what makes it harder to transform the latter to the former. The goal here is that the block_cascade has the same semantics as a cascade of nested blocks in terms of label visibility, so it maintains the structured control flow property. The difference is that it doesn't reflect that nesting in the syntax structure. |
@rossberg-chromium The rule for this to preserve well-structuredness of control flow is: these labels can only be referenced from within the block_cascade above their definition. This can be proven by examining the effective rules that would be imposed by the equivalent stack of nested blocks. I can confirm that CFG-based producers would find block_cascade convenient to use. On the consumer side, I imagine we could design the binary enoding such that the opening of a block_cascade declares how many labels it has, conceptually pushing that many entries on the control flow stack, and each label definition pops its entry. That way, references to labels can use the same "depth in the control flow stack" concept that we already have, so this seems like it should be easy to implement. My main reservation with this feature is that while it's less convenient for producers to run an algorithm to do block placement, it can sometimes produce code with lower maximum stack depths (eg. |
I think this is complementary to block_cascade if you produce blocks the same as you would otherwise, and then do an optimization pass that collapses the cascading block pattern to a single node. |
I'd like to strongly encourage people to think carefully about the impact that significant tree restructurings like this will have on executable size. |
@sunfishcode, right, makes sense. But it seems you could break this up into two independent proposals:
|
@rossberg-chromium When I was proposing a pure indexed break earlier, one of the complications that was brought up is that tableswitches can sometimes have thousands or more targets, and stacking up blocks to that level of nesting might create stack overflow trouble for recursive parsers. Defining a block_cascade at the same time would alleviate this. (A related concern is the asm.js polyfill; existing JS parsers don't expect super-deep nesting, however I wonder if in the polyfill timeframe, we can make a convention that if you want to use the polyfill, you only using block_cascade with tableswitch, and then always putting the tableswitch at the very top of it. That might make it reasonable for the polyfill to detect the pattern and emit a switch. I'm confident I can make LLVM emit code like this, I'll talk with others about polyfill side.) Concerning code size, tableswitch is very infrequent relative to other operators, so it doesn't contribute significantly to code size no matter what we do here. Finally, @AndrewScheidecker or anyone else, what do you think of the name |
I like the name But regarding code size, if |
|
Agreed. It might also be faster to parse if the |
|
I suggest if we introduce a general CFG node, we call it I've now submitted #490 to make an initial attempt at wording for |
I think we're going to significantly simply |
This idea isn't fully baked, but I thought I'd kick it out here and see what people think.
There are two components to
tableswitch
: a branch to a target taken from a table+index, and a set of cases that introduce special targets that may be referenced by the table. As I understand it, the purpose for the second component is to avoid highly nested cascades of AST nodes that would otherwise be necessary to represent the a C-styleswitch
with many cases.I think the second component could be split into a separate AST node:
block_cascade
. It would look kind of like this:Which could be desugared to this:
One benefit to this is that it would be much easier to produce
block_cascade
by pattern matching than atableswitch
with cases: it can ignore the contents of the A, B, C, D syntactic variables above. However, some of that benefit could be captured by making the existing tableswitch cases desugar into the block cascade so it doesn't need to worry about whether those blocks-to-be-case-ified are used as generate branch targets.The text was updated successfully, but these errors were encountered: