-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
formalize the panic interface #21520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
implements #17969
I had to bring back some of the old API so that I could compile the new compiler with an old compiler.
now that we have a zig1.wasm update it's not needed
motivated by performance
in the llvm backend.
Introduces `std.builtin.Panic` which is a complete interface for panicking. Provide `std.debug.FormattedPanic` and `std.debug.SimplePanic` and let the user choose, or make their own.
instead of panicking
although they would also pass simply reverted to master branch because I made the deprecated API still work for now (to be removed after 0.14.0 is tagged)
d775165
to
7f4c0e0
Compare
try ip.getOrPutString(gpa, pt.tid, inner_name, .no_embedded_nulls), | ||
) orelse return sema.fail(block, src, "std.builtin missing {s}", .{inner_name}); | ||
try sema.ensureNavResolved(src, nav); | ||
return Value.fromInterned(ip.getNav(nav).status.resolved.val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mlugg should this be calling analyzeNavVal
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but also, see Sema.namespaceLookupVal
which just wraps these two calls.
Don't worry too much about incremental dependencies for std.builtin
stuff at the minute -- incremental updates to std.builtin
(and also, more importantly, updates that change the user-provided panic handler) are very broken as-is, this is on my todo list for incremental.
Follow-up: I changed the approach as outlined above, and now the difference vs master branch looks like this: Debug builds of the compiler binary size (same):
ReleaseSafe builds of the compiler compiling hello world (insignificant difference):
Now there's a formal interface defined for all the panic helpers, and a One thing that's neat about this approach is that you can omit any given panic function and receive a compile error if a call to that function would have been generated. For compiler-rt we pass empty struct (when not in test mode)! |
I haven't figured out this crash yet:
|
try ip.getOrPutString(gpa, pt.tid, inner_name, .no_embedded_nulls), | ||
) orelse return sema.fail(block, src, "std.builtin missing {s}", .{inner_name}); | ||
try sema.ensureNavResolved(src, nav); | ||
return Value.fromInterned(ip.getNav(nav).status.resolved.val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but also, see Sema.namespaceLookupVal
which just wraps these two calls.
Don't worry too much about incremental dependencies for std.builtin
stuff at the minute -- incremental updates to std.builtin
(and also, more importantly, updates that change the user-provided panic handler) are very broken as-is, this is on my todo list for incremental.
#20240 is loosely related to this. I'd quite like to see a decision made on that proposal (I'm far from wedded to my specific ideas there, I just want to decide how the panic handler works across TUs). |
I want to say that this closes #20240 because it makes that problem fully solvable outside the language specification and compiler implementation. It could be quite valuable to have different panic implementations in different compilation units - consider for example fuzz testing where you have the fuzzer compilation unit ( For other use cases where you want the same implementation, any given compilation unit can be marked as the main one that gets the debug info handling code and exports a symbol for other compilations to call into when they panic - just like you would do when mixing Zig and C code. Unless there can be a more compelling problem statement, I think that proposal can be rejected because the use cases it addresses are already perfectly solvable without it. |
better names, return error instead of panicking, better diagnostics, use the standard APIs for resolving values
what does this mean? did I regress the machine code size or something? |
This experimental target has no recent active maintainer. It's the only linker backend complaining about this branch and I can't make sense of the stack trace. This can be fixed asynchronously by anyone who wants to maintain plan9 support. It does not need to block this branch.
/// This namespace is used by the Zig compiler to emit various kinds of safety | ||
/// panics. These can be overridden by making a public `Panic` namespace in the | ||
/// root source file. | ||
pub const Panic: type = if (@hasDecl(root, "Panic")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a namespace instead of a comptime struct like std.Options
to avoid generating unused decls? Shouldn't only accessing the fields at comptime already avoid unused default values being codegened?
Overall I like this change (especially the point about the importance of being able to override compiler-generated functions), and it solves the problem with formatted panics in environments with tiny stack spaces since you can now override the default calls to But why is the panic override is a namespace and not a struct? With it being a namespace, it would seem that it will run into the same problems with discoverability and locality of error messages as If // std.builtin
pub const panic: Panic = if (@hasDecl(root, "panic")) root.panic else std.debug.formatted_panic;
pub const Panic = struct {
call: fn (msg: []const u8, error_return_trace: ?*const StackTrace, first_trace_addr: ?usize) noreturn,
sentinelMismatch: fn (expected: anytype, found: @TypeOf(expected)) noreturn,
unwrapError: fn (error_return_trace: ?*StackTrace, err: anyerror) noreturn,
outOfBounds: fn (index: usize, len: usize) noreturn,
startGreaterThanEnd: fn (start: usize, end: usize) noreturn,
inactiveUnionField: fn (active: anytype, accessed: @TypeOf(active)) noreturn,
messages: Messages = .{},
pub const Messages = struct {
reached_unreachable: []const u8 = "reached unreachable code",
unwrap_null: []const u8 = "attempt to use null value",
// ...
noreturn_returned: []const u8 = "'noreturn' function returned",
};
};
// root
pub const panic: std.builtin.Panic = .{
.call = customCallImpl,
.sentinelMismatch = customSentinelMismatchImpl,
// ...
.inactiveUnionField = customInactiveUnionFieldImpl,
}; The only immediate problem I see with this struct strategy is that But maybe there are other problems with the struct strategy (codegen?) that makes the namespace strategy the better choice in the end. (Nit: If it ends up remaining a namespace, it should be |
The same reason that I ended up ditching Maybe it would be OK since the value would be comptime-known, however it also loses the nice property that undesired kinds of panics can be suppressed by omitting the corresponding function. For example this patch does this in compiler-rt: // Avoid dragging in the runtime safety mechanisms into this .o file, unless
// we're trying to test compiler-rt.
pub const Panic = if (builtin.is_test) std.debug.FormattedPanic else struct {}; This makes it into a compile error if a compiler_rt function would have generated any safety checks. So I think this idea would require making all the function pointers and messages optional. Or maybe it can be made to work with Anyway I think it would be worth a try, if it was definitely better ergonomics to do so. Is it for sure though? I might want to hold out for proper interfaces instead of trying to hack one. That change also sounds like a pain in the butt to make so I would want to do that as a follow-up, i.e. merge this and move on to other priorities for now. |
in case someone wants to pursue the idea of making the panic interface a struct, this will reduce churn.
080fd9d
to
2857ca1
Compare
@mlugg in our voice call I indicated that I wanted to explore the tagged union approach again by trying to make the codegen better for it. However, I think it's still needed to use the helper functions for better safety codegen, particularly for "access of inactive union field". pub fn inactiveUnionField(active: anytype, accessed: @TypeOf(active)) noreturn {
@branchHint(.cold);
std.debug.panicExtra(null, @returnAddress(), "access of union field '{s}' while field '{s}' is active", .{
@tagName(accessed), // triggers an additional safety check
@tagName(active), // triggers an additional safety check
});
} The problem with the PanicCause approach is that the icache cost of those two additional safety checks will be paid at every union field access. Meanwhile the helper function accomplishes moving those two additional safety checks to the body of a cold function. It also means that Also in the call you strongly suggested to take the struct approach as outlined by @castholm above, however after sleeping on it, I've come to the same conclusion that I came to before chatting with you: this is a step in the right direction, that should be merged now, and the idea of using the struct is a next step that will come with its own challenges. I will leave that to someone else to implement. I have barely enough motivation remaining on this branch to finish it as-is. I will push one final commit to reduce churn, reverting the test cases to rely on the deprecated old definitions, with the recommendation that people don't update their code until 0.14.0 is tagged, in case a follow-up improvement happens. |
Hm, okay, I see the issue with the tagged union approach there. I'm still not sure if I'm entirely satisfied with this solution, however, merging as-is for now on the basis that this represents an improvement on status quo seems reasonable to me. Likewise for the struct-vs-namespace issue; I stand by my opinion that an actual struct would be better, but I'm happy to get this in as-is for now. |
was a pattern where the tag of the pub fn panic(
comptime tag: std.meta.Tag(std.builtin.PanicCause),
payload: std.meta.TagPayload(std.builtin.PanicCause, tag),
_: ?*std.builtin.StackTrace,
_: ?usize,
) noreturn {
@branchHint(.cold);
switch (tag) {
.unreach => std.debug.panicExtra(null, @returnAddress(), "unreachable", .{}),
.inactive_union_field => std.debug.panicExtra(null, @returnAddress(), "access of union field '{s}' while field '{s}' is active", .{
@tagName(payload.accessed),
@tagName(payload.active),
}),
else => @compileError("panic not handled"),
}
} |
Can you additionally share the source for |
My specific question (which I think may be Andrew's too) is: what is the payload of |
the same as you outlined in #17969/the original implementation of this pr pub const PanicCause = union(enum) {
...
inactive_union_field: InactiveUnionField,
...
pub const InactiveUnionField = struct {
active: []const u8,
accessed: []const u8,
};
}; so in a call to i did write that snippet without realizing you had made this struct hold the names themselves which explains your earlier comment that i got that part of the code from, i suppose that part may complicate it if thats what your question is meaning |
It is very complex. The many-functions interface was the most competitive with the I thought to mention it in the summary of #19764; that a many-functions interface could achieve the smallest fail block because no aggregate instantiation was necessary, but I also thought that might cause confusion, and that this would undersell the advantages of the single-function interface (which are really significant at scale). In other words, yes it is the best overall, but it is not clear-cut. Now that it seems that maintainers are more open to the idea of a many-functions interface I might attempt a more thorough investigation. |
surely im doing this wrong because this is ass
lib/std/debug.zig:491:38: error: slice of non-array type 'u16' utf16_buffer[len_minus_3][0..3].* = .{ '\r', '\n', 0 }; ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ lib/std/debug.zig:510:70: error: expected type '?*const anyopaque', found '[]u16' _ = bs.exit(uefi.handle, .Aborted, exit_msg.len + 1, exit_data); ^~~~~~~~~ Regressed in ziglang#21520.
lib/std/debug.zig:491:38: error: slice of non-array type 'u16' utf16_buffer[len_minus_3][0..3].* = .{ '\r', '\n', 0 }; ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ lib/std/debug.zig:510:70: error: expected type '?*const anyopaque', found '[]u16' _ = bs.exit(uefi.handle, .Aborted, exit_msg.len + 1, exit_data); ^~~~~~~~~ Regressed in ziglang#21520.
lib/std/debug.zig:491:38: error: slice of non-array type 'u16' utf16_buffer[len_minus_3][0..3].* = .{ '\r', '\n', 0 }; ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ lib/std/debug.zig:510:70: error: expected type '?*const anyopaque', found '[]u16' _ = bs.exit(uefi.handle, .Aborted, exit_msg.len + 1, exit_data); ^~~~~~~~~ Regressed in ziglang#21520.
lib/std/debug.zig:491:38: error: slice of non-array type 'u16' utf16_buffer[len_minus_3][0..3].* = .{ '\r', '\n', 0 }; ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ lib/std/debug.zig:510:70: error: expected type '?*const anyopaque', found '[]u16' _ = bs.exit(uefi.handle, .Aborted, exit_msg.len + 1, exit_data); ^~~~~~~~~ Regressed in ziglang#21520.
The primary goals of this patchset are to eliminate the
-fformatted-panics
and-fno-formatted-panics
command line options, eliminate the mandatory dependency on formatted printing of safety panics generated by the compiler, and eliminate the internal compiler features corresponding to these things.A secondary goal is to make overriding the panic handler be a smoother use case.
Note that panic messages are unchanged; the default panic handler produces identical messages but without a dependency on
std.fmt
.At first, this patchset removed all the panic helpers, making the compiler directly emit calls to
std.builtin.panic
only. Unfortunately this caused a performance regression in ReleaseSafe builds. For example, here is ReleaseSafe builds of the compiler, master branch vs this, compiling hello world, and this is even with the new code cheating a little bit in the llvm backend with respect to integer overflow panics:So I reintroduced the
std.builtin
panic helpers (but without calling into formatted printing), such asstd.builtin.panicUnwrapError
, and gained the perf back. However the new code is still cheating with LLVM backend integer overflow safety checks.ReleaseSafe builds of the compiler, after re-introducing std.builtin helpers:
ReleaseSafe builds of the compiler, after re-instating the LLVM integer overflow safety calls:
That's the final state of the perf in this branch for ReleaseSafe. ReleaseFast is not affected by these changes. Unfortunately the binary size is bigger:
ReleaseSafe builds of the compiler (+3M):
Closes #17969
I am putting this up for discussion but I am not sure I want to merge it.
An alternative would be to do these things on top of the patchset before merging it:
std.builtin.Panic
namespace so that they can all be overridden together, making it still easy to swap out panicking with one symbol. I think it's important for any function that the compiler automatically generates calls to to be user-overridable.-fno-formatted-panics
but is done via overriding std.builtin symbols rather than compiler flags.I am leaning towards doing all of these bullet points before merging, based on the key observation that the
std.builtin
panic helpers are particularly effective at generating efficient code. I did not take that into account when accepting #17969.