Skip to content

Proposal: Disallow unlabeled blocks as block expressions #9059

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
SpexGuy opened this issue Jun 10, 2021 · 15 comments
Open

Proposal: Disallow unlabeled blocks as block expressions #9059

SpexGuy opened this issue Jun 10, 2021 · 15 comments
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@SpexGuy
Copy link
Contributor

SpexGuy commented Jun 10, 2021

An example similar to this recently came up on the discord:

fn getPoint(direction: u1) Point {
  const T = @TypeOf({});
  var p: Point = .{ .x = 0, .y = 0 };
  const tt = [_]T {{p.x +%= 1;}, {p.y +%= 1;}};
  tt[direction];
  return p;
}

The author expected it to mean this:
T is the type of a block expression.
tt is an array of block expressions.
when tt[index] is encountered, the expression at that index executes.
So getPoint(0) returns (1, 0) and getPoint(1) returns (0, 1)

Surprisingly, this code compiles, but has a very different interpretation.
T is void.
tt is nothing, the expressions execute immediately.
when tt[index] is encountered, it's a void value, so discarding it is not a compile error.
So getPoint always returns (1, 1) for any input.

I think in general, block expressions which don't return a value are misleading. Some more examples of code like this:

const x = {
  y += 1;
};
someFunction({ x += 4; });

In my opinion, we should require blocks which are expressions to be labeled, and only allow statement blocks to be unlabeled. This would clear up the confusion about the above code, by making it a compile error.

There is one use case for an unlabeled block expression, which is the use of {} as the value of type void. There are equivalent expressions, like @as(void, undefined) and void{}, but this one is particularly succinct. I think void{} is sufficient, but we could choose to continue to allow block expressions with no statements inside.

@SpexGuy SpexGuy added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. labels Jun 10, 2021
@Vexu Vexu added this to the 0.9.0 milestone Jun 10, 2021
@ghost
Copy link

ghost commented Jun 10, 2021

Void-valued expression blocks have a legitimate use-case in #6965. And also in switch statements:

switch (action) {
    .Inc => { x += 1; y += 1; },
    .Dec => { x -= 1; y -= 1; },
    else => {},
}

@SpexGuy
Copy link
Contributor Author

SpexGuy commented Jun 10, 2021

Those two specific cases should still be allowed, in the same way that block statements are still allowed, but we should disallow unlabeled blocks in switch expressions.

Actually there's another use case, which is blocks that have the type noreturn. For example:

const y = some_optional orelse {
   logError();
   return null;
};

I don't think we can reasonably require a label on this sort of block.
So I guess we also have to except these. The rules then are

  • block expressions must be labeled unless they are noreturn.
  • switch statements contain statements (without semicolon), not expressions.

@ghost
Copy link

ghost commented Jun 10, 2021

I don't like the idea of introducing a formal distinction between switch statements and switch expressions. For example, would the inner switch in the following code be an expression-switch or a statement-switch?

switch (flag) {
    .Foo => switch (action) {
        .Inc => { x += 1; y += 1; },
        .Dec => { x -= 1; y -= 1; },
        else => {},
    },
    else => {},
}

On one hand, the whole thing is a statement. On the other, the inner switch occurs in an expression context. The current semantics governing this is easy to explain and reason about, while the proposed change would require various special cases and non-local analysis. I'd say this is not worth it given the somewhat exotic nature of the motivating example.

@SpexGuy
Copy link
Contributor Author

SpexGuy commented Jun 10, 2021

would the inner switch in the following code be an expression-switch or a statement-switch?

This question is easily answered from the above rules:

switch statements contain statements (without semicolon), not expressions.

Since the outer switch is a statement, the inner switch is also a statement.

@ghost
Copy link

ghost commented Jun 10, 2021

Since the outer switch is a statement, the inner switch is also a statement.

Makes sense. But that would require introducing separate definitions for SwitchExpr/SwitchStmt, IfExpr/IfStmt, WhileExpr/WhileStmt, etc. into the grammar. If we are still trying to keep it small, I'd say this is a step in the wrong direction.

@SpexGuy
Copy link
Contributor Author

SpexGuy commented Jun 10, 2021

We already have this split for If/For/While. Statements allow assignment operations as the payload, expressions do not. Switch is the only block construct that currently doesn't have this split.

@ghost
Copy link

ghost commented Jun 10, 2021

We already have this split for If/For/While. [..] Switch is the only block construct that currently doesn't have this split.

Ok, that's less of a problem than I thought, then.

What about the noreturn case? That looks like it would create a parse-time dependency on type inference.

@SpexGuy
Copy link
Contributor Author

SpexGuy commented Jun 10, 2021

The noreturn case would mean that this error can only happen after typechecking. It's almost possible to do the check in astgen, except for functions which return noreturn.

@ghost
Copy link

ghost commented Jun 10, 2021

So, implementation-wise, unlabeled expression blocks would continue to be accepted by the parser, but then rejected in a separate pass after typechecking confirms that they are indeed void-valued?

@ghost
Copy link

ghost commented Jun 10, 2021

In that case, would it make sense to disallow void expressions entirely? Are there actual use cases for void variables and arrays?

@Hadron67
Copy link
Contributor

Hadron67 commented Jun 11, 2021

Are there actual use cases for void variables and arrays?

const StringHashSet = std.StringHashMap(void);

Also, there are more places where void blocks should be allowed: suspend, defer, errdefer, comptime, and
WhileContinueExpr, like:

while (...) : ({ a += 1; b += 1; }){ ... }

@SpexGuy
Copy link
Contributor Author

SpexGuy commented Jun 11, 2021

suspend, defer, errdefer, comptime, while continue

These are all statements (places where you could put an assignment), except comptime which can appear in both contexts and would need this split to propagate expression-ness down the AST.

@Hadron67
Copy link
Contributor

These are all statements (places where you could put an assignment)

Ok then, so their arguments are BlockExprStatement and what we disallow is BlockExpr so they are still allowed? Does this mean we need to check for void block expressions in expressions instead of checking all BlockExpr in the AST in a context-free manner?

ps: WhileContinueExpr seems like an expression instead of statement:

WhileContinueExpr <- COLON LPAREN AssignExpr RPAREN

@ghost
Copy link

ghost commented Jun 11, 2021

WhileContinueExpr seems like an expression instead of statement:

I think it's just named this way in the grammar. It is semantically a statement, since it allows assignment and does not return a value.

Does this mean we need to check for void block expressions in expressions instead of checking all BlockExpr in the AST in a context-free manner?

Apparently so. The procedure would be to walk down the typed AST and take note of subtrees that are rooted in the RHS of an assignment or in a function argument. These would be treated as proper Expressions rather than ExpressionStatements. Void-valued unlabeled blocks would be allowed in ExpressionStatement subtrees, but would result in an error in Expression subtrees, unless they are noreturn.

Addendum: One place where this could still create problems is metaprogramming. E.g. @TypeOf({}) would no longer be allowed, unless special-cased.

@Validark
Copy link
Contributor

Validark commented Aug 3, 2023

My $0.02, I personally don't see any real footgun here relating to blocks. While this is a funny example of someone not knowing that {} is a void literal, I don't think anyone could reasonably expect an array to hold expression statements. Arrays hold data, and that is prerequisite knowledge for programming in Zig or pretty much any language I have ever dealt with. I think that code was probably made by someone who was learning their first language, which is totally fine! It takes time to learn and develop an intuition about programming. However, I don't think the language should force everyone to bend over backwards and label every block just because some people have not yet learned that arrays are not switch statements.

@andrewrk andrewrk removed this from the 0.14.0 milestone Feb 9, 2025
@andrewrk andrewrk added this to the 0.15.0 milestone Feb 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

5 participants