-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Sema: rewrite semantic analysis of function calls #22414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
456f5be
to
9acff76
Compare
5b130ec
to
383b8ff
Compare
383b8ff
to
d5c1789
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- please provide a perf data point
- please note the before/after compiler recursion stack limit (e.g. how much comptime recursive fibonacci can you do before crashing the compiler)
Performance Data PointsAnalyze
|
This rewrite improves some error messages, hugely simplifies the logic, and fixes several bugs. One of these bugs is technically a new rule which Andrew and I agreed on: if a parameter has a comptime-only type but is not declared `comptime`, then the corresponding call argument should not be *evaluated* at comptime; only resolved. Implementing this required changing how function types work a little, which in turn required allowing a new kind of function coercion for some generic use cases: function coercions are now allowed to implicitly *remove* `comptime` annotations from parameters with comptime-only types. This is okay because removing the annotation affects only the call site. Resolves: ziglang#22262
d5c1789
to
e9bd2d4
Compare
Before the prior commit, the maximum comptime recursion depth on my system was 4062. After the prior commit, it decreased to 2854. This commit increases the compiler's stack size enough so that the recursion depth limit is no less than it was before the `Sema.analyzeCall` rewrite, preventing this from being a breaking change. Specifically, this stack size increases my observed maximum comptime recursion depth to 4105.
c6f4bbc
to
6a837e6
Compare
It seems that this has broken ZLS build
zig: 0.14.0-dev.2634+b36ea592b |
ZLS is unaffiliated with the Zig project -- please open an issue on the ZLS repo instead. |
The original motivation here was to fix regressions caused by ziglang#22414. However, while working on this, I ended up discussing a language simplification with Andrew, which changes things a little from how they worked before ziglang#22414. The main user-facing change here is that any reference to a prior function parameter, even if potentially comptime-known at the usage site or even not analyzed, now makes a function generic. This applies even if the parameter being referenced is not a `comptime` parameter, since it could still be populated when performing an inline call. This is a breaking language change. The detection of this is done in AstGen; when evaluating a parameter type or return type, we track whether it referenced any prior parameter, and if so, we mark this type as being "generic" in ZIR. This will cause Sema to not evaluate it until the time of instantiation or inline call. A lovely consequence of this from an implementation perspective is that it eliminates the need for most of the "generic poison" system. In particular, `error.GenericPoison` is now completely unnecessary, because we identify generic expressions earlier in the pipeline; this simplifies the compiler and avoids redundant work. This also entirely eliminates the concept of the "generic poison value". The only remnant of this system is the "generic poison type" (`Type.generic_poison` and `InternPool.Index.generic_poison_type`). This type is used in two places: * During semantic analysis, to represent an unknown result type. * When storing generic function types, to represent a generic parameter/return type. It's possible that these use cases should instead use `.none`, but I leave that investigation to a future adventurer. One last thing. Prior to ziglang#22414, inline calls were a little inefficient, because they re-evaluated even non-generic parameter types whenever they were called. Changing this behavior is what ultimately led to ziglang#22538. Well, because the new logic will mark a type expression as generic if there is any change its resolved type could differ in an inline call, this redundant work is unnecessary! So, this is another way in which the new design reduces redundant work and complexity. Resolves: ziglang#22494 Resolves: ziglang#22532 Resolves: ziglang#22538
The original motivation here was to fix regressions caused by #22414. However, while working on this, I ended up discussing a language simplification with Andrew, which changes things a little from how they worked before #22414. The main user-facing change here is that any reference to a prior function parameter, even if potentially comptime-known at the usage site or even not analyzed, now makes a function generic. This applies even if the parameter being referenced is not a `comptime` parameter, since it could still be populated when performing an inline call. This is a breaking language change. The detection of this is done in AstGen; when evaluating a parameter type or return type, we track whether it referenced any prior parameter, and if so, we mark this type as being "generic" in ZIR. This will cause Sema to not evaluate it until the time of instantiation or inline call. A lovely consequence of this from an implementation perspective is that it eliminates the need for most of the "generic poison" system. In particular, `error.GenericPoison` is now completely unnecessary, because we identify generic expressions earlier in the pipeline; this simplifies the compiler and avoids redundant work. This also entirely eliminates the concept of the "generic poison value". The only remnant of this system is the "generic poison type" (`Type.generic_poison` and `InternPool.Index.generic_poison_type`). This type is used in two places: * During semantic analysis, to represent an unknown result type. * When storing generic function types, to represent a generic parameter/return type. It's possible that these use cases should instead use `.none`, but I leave that investigation to a future adventurer. One last thing. Prior to #22414, inline calls were a little inefficient, because they re-evaluated even non-generic parameter types whenever they were called. Changing this behavior is what ultimately led to #22538. Well, because the new logic will mark a type expression as generic if there is any change its resolved type could differ in an inline call, this redundant work is unnecessary! So, this is another way in which the new design reduces redundant work and complexity. Resolves: #22494 Resolves: #22532 Resolves: #22538
The original motivation here was to fix regressions caused by ziglang#22414. However, while working on this, I ended up discussing a language simplification with Andrew, which changes things a little from how they worked before ziglang#22414. The main user-facing change here is that any reference to a prior function parameter, even if potentially comptime-known at the usage site or even not analyzed, now makes a function generic. This applies even if the parameter being referenced is not a `comptime` parameter, since it could still be populated when performing an inline call. This is a breaking language change. The detection of this is done in AstGen; when evaluating a parameter type or return type, we track whether it referenced any prior parameter, and if so, we mark this type as being "generic" in ZIR. This will cause Sema to not evaluate it until the time of instantiation or inline call. A lovely consequence of this from an implementation perspective is that it eliminates the need for most of the "generic poison" system. In particular, `error.GenericPoison` is now completely unnecessary, because we identify generic expressions earlier in the pipeline; this simplifies the compiler and avoids redundant work. This also entirely eliminates the concept of the "generic poison value". The only remnant of this system is the "generic poison type" (`Type.generic_poison` and `InternPool.Index.generic_poison_type`). This type is used in two places: * During semantic analysis, to represent an unknown result type. * When storing generic function types, to represent a generic parameter/return type. It's possible that these use cases should instead use `.none`, but I leave that investigation to a future adventurer. One last thing. Prior to ziglang#22414, inline calls were a little inefficient, because they re-evaluated even non-generic parameter types whenever they were called. Changing this behavior is what ultimately led to ziglang#22538. Well, because the new logic will mark a type expression as generic if there is any change its resolved type could differ in an inline call, this redundant work is unnecessary! So, this is another way in which the new design reduces redundant work and complexity. Resolves: ziglang#22494 Resolves: ziglang#22532 Resolves: ziglang#22538
This rewrite improves some error messages, hugely simplifies the logic, and fixes several bugs. One of these bugs is technically a new rule which Andrew and I agreed on: if a parameter has a comptime-only type but is not declared
comptime
, then the corresponding call argument should not be evaluated at comptime; only resolved. Implementing this required changing how function types work a little, which in turn required allowing a new kind of function coercion for some generic use cases: function coercions are now allowed to implicitly removecomptime
annotations from parameters with comptime-only types. This is okay because removing the annotation affects only the call site.Resolves: #22262
This is a breaking change, because it fixes a bug relating to eval branch quota. Previously, the usage of a "child Sema" when analyzing a generic function's call site meant we accidentally "duplicated" the eval branch quota. We no longer do this (I've actually concluded that the concept of a "child Sema" isn't really coherent), so a little more eval branch quota is used up on evaluating generic parameter/return types. That's why the changes to
lib/compiler/aro_translate_c.zig
andsrc/translate_c.zig
were necessary. #22410 is not resolved by this PR; the buggy behavior of status quo is left intact.I was considering also reworking the ZIR for function declarations in this PR, since it's very overcomplicated. However, I decided to leave that for now, since I'm planning to start playing around with a biiiig ZIR rewrite soon.