Factor block_cascade out of tableswitch #480

AndrewScheidecker · 2015-11-25T13:23:38Z

This idea isn't fully baked, but I thought I'd kick it out here and see what people think.

There are two components to tableswitch: a branch to a target taken from a table+index, and a set of cases that introduce special targets that may be referenced by the table. As I understand it, the purpose for the second component is to avoid highly nested cascades of AST nodes that would otherwise be necessary to represent the a C-style switch with many cases.

I think the second component could be split into a separate AST node: block_cascade. It would look kind of like this:

(block_cascade
    (tableswitch Index... (table #case0 #case1 #case2) #default)
    #case0
    A...
    #case1
    B...
    #case2
    C...
    #default
    D...
)

Which could be desugared to this:

(block #default
    (block #case2
        (block #case1
            (block #case0
                (tableswitch Index... (table #case0 #case1 #case2) #default)
            )
            A...
        )
        B...
    )
    C...
)
D...

One benefit to this is that it would be much easier to produce block_cascade by pattern matching than a tableswitch with cases: it can ignore the contents of the A, B, C, D syntactic variables above. However, some of that benefit could be captured by making the existing tableswitch cases desugar into the block cascade so it doesn't need to worry about whether those blocks-to-be-case-ified are used as generate branch targets.

The text was updated successfully, but these errors were encountered:

rossberg · 2015-11-25T13:47:02Z

Regular labels are only in scope inside the expression they name. Case labels are very different beasts, they are only known to the respective tableswitch statement enclosing them. Your suggestion would require reconciling these two notions somehow, and allow referencing expression labels from the outside. It is not clear what the rules for this should be without losing well-structuredness of control flow. Even if we found such rules, I'm pretty sure they'd be more complicated and costly to check than the current design, which guarantees well-structuredness by construction.

AndrewScheidecker · 2015-11-25T13:53:12Z

That difference between a tableswitch's cases and a cascade of blocks is what makes it harder to transform the latter to the former.

The goal here is that the block_cascade has the same semantics as a cascade of nested blocks in terms of label visibility, so it maintains the structured control flow property. The difference is that it doesn't reflect that nesting in the syntax structure.

sunfishcode · 2015-11-25T15:41:16Z

@rossberg-chromium The rule for this to preserve well-structuredness of control flow is: these labels can only be referenced from within the block_cascade above their definition. This can be proven by examining the effective rules that would be imposed by the equivalent stack of nested blocks.

I can confirm that CFG-based producers would find block_cascade convenient to use.

On the consumer side, I imagine we could design the binary enoding such that the opening of a block_cascade declares how many labels it has, conceptually pushing that many entries on the control flow stack, and each label definition pops its entry. That way, references to labels can use the same "depth in the control flow stack" concept that we already have, so this seems like it should be easy to implement.

My main reservation with this feature is that while it's less convenient for producers to run an algorithm to do block placement, it can sometimes produce code with lower maximum stack depths (eg. { { } { } } has a lower max depth than { { { } } }), but I don't yet know how much weight to place on this.

AndrewScheidecker · 2015-11-25T18:07:49Z

My main reservation with this feature is that while it's less convenient for producers to run an algorithm to do block placement, it can sometimes produce code with lower maximum stack depths

I think this is complementary to block_cascade if you produce blocks the same as you would otherwise, and then do an optimization pass that collapses the cascading block pattern to a single node.

kg · 2015-11-25T22:44:15Z

I'd like to strongly encourage people to think carefully about the impact that significant tree restructurings like this will have on executable size.

rossberg · 2015-11-26T13:55:52Z

@sunfishcode, right, makes sense.

But it seems you could break this up into two independent proposals:

Simplify tableswitch to become just an indexed break. I kind of like that from a semantic perspective, as it is a simplification and eliminates the special status of case labels.
Introduce a shorthand for nested blocks. The benefit of that is less clear to me. AFAICT, the difference shouldn't matter much for either producers or consumers (@sunfishcode, wouldn't you need to manage some form of stack of similar size either way?). Is the impact on binary code size significant?

sunfishcode · 2015-11-28T22:31:26Z

@rossberg-chromium When I was proposing a pure indexed break earlier, one of the complications that was brought up is that tableswitches can sometimes have thousands or more targets, and stacking up blocks to that level of nesting might create stack overflow trouble for recursive parsers. Defining a block_cascade at the same time would alleviate this.

(A related concern is the asm.js polyfill; existing JS parsers don't expect super-deep nesting, however I wonder if in the polyfill timeframe, we can make a convention that if you want to use the polyfill, you only using block_cascade with tableswitch, and then always putting the tableswitch at the very top of it. That might make it reasonable for the polyfill to detect the pattern and emit a switch. I'm confident I can make LLVM emit code like this, I'll talk with others about polyfill side.)

Concerning code size, tableswitch is very infrequent relative to other operators, so it doesn't contribute significantly to code size no matter what we do here.

Finally, @AndrewScheidecker or anyone else, what do you think of the name multiblock for this construct?

kripken · 2015-11-28T22:49:35Z

I like the name multiblock.

But regarding code size, if block_cascade/multiblock is factored out of tableswitch, then it might be used more often than tableswitch, and the impact on code size is unclear. Or if the intention is that the factored-out construct never be used outside of a tableswitch, then obviously code size is not impacted exactly as you said (since tableswitch is rare), but then I'm not sure I see the benefit of factoring it out.

sunfishcode · 2015-11-28T23:56:27Z

block_cascade/multiblock could easily be smaller than an equivalent stack of nested blocks. It obviously depends on encoding strategy, but block_cascade/multiblock might use single-byte markers to mark where the labels go, making it be just 1 byte per label, while the per-label size of nested blocks might often be greater than 1, because they need an opcode plus a count of how many statements there are, or a marker for where the block ends.

kripken · 2015-11-29T00:11:45Z

Agreed. It might also be faster to parse if the multiblock is large (less branching).

AndrewScheidecker · 2015-11-29T23:55:44Z

multiblock is ok, but could be ambiguous if there is also a more general CFG node (which I imagine being very similar but with less constrained label scoping).

sunfishcode · 2015-12-03T03:53:40Z

I suggest if we introduce a general CFG node, we call it cfg or graph or similar, to emphasize the arbitrary graph nature. To me, multiblock suggests multiple blocks grouped together, which is essentially what it is.

I've now submitted #490 to make an initial attempt at wording for multiblock.

titzer · 2016-02-25T01:06:59Z

I think we're going to significantly simply tableswitch to br_table, which obsoletes this issue.

sunfishcode added the control flow label Nov 28, 2015

sunfishcode added this to the MVP milestone Dec 2, 2015

sunfishcode mentioned this issue Dec 3, 2015

Factor a multiblock node out of tableswitch. #490

Closed

titzer closed this as completed Feb 25, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Factor block_cascade out of tableswitch #480

Factor block_cascade out of tableswitch #480

AndrewScheidecker commented Nov 25, 2015

rossberg commented Nov 25, 2015

AndrewScheidecker commented Nov 25, 2015

sunfishcode commented Nov 25, 2015

AndrewScheidecker commented Nov 25, 2015

kg commented Nov 25, 2015

rossberg commented Nov 26, 2015

sunfishcode commented Nov 28, 2015

kripken commented Nov 28, 2015

sunfishcode commented Nov 28, 2015

kripken commented Nov 29, 2015

AndrewScheidecker commented Nov 29, 2015

sunfishcode commented Dec 3, 2015

titzer commented Feb 25, 2016

Factor block_cascade out of tableswitch #480

Factor block_cascade out of tableswitch #480

Comments

AndrewScheidecker commented Nov 25, 2015

rossberg commented Nov 25, 2015

AndrewScheidecker commented Nov 25, 2015

sunfishcode commented Nov 25, 2015

AndrewScheidecker commented Nov 25, 2015

kg commented Nov 25, 2015

rossberg commented Nov 26, 2015

sunfishcode commented Nov 28, 2015

kripken commented Nov 28, 2015

sunfishcode commented Nov 28, 2015

kripken commented Nov 29, 2015

AndrewScheidecker commented Nov 29, 2015

sunfishcode commented Dec 3, 2015

titzer commented Feb 25, 2016