Skip to content

Segfault Before Main with Self-hosted Compiler #13190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ominitay opened this issue Oct 16, 2022 · 7 comments
Open

Segfault Before Main with Self-hosted Compiler #13190

ominitay opened this issue Oct 16, 2022 · 7 comments
Labels
bug Observed behavior contradicts documented or intended behavior frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Milestone

Comments

@ominitay
Copy link
Contributor

Zig Version

0.10.0-dev.4418+99c3578f6

Steps to Reproduce

Attempt to build and run the attached Zig program with stage2 in either Debug or ReleaseSafe mode.

Expected Behavior

The program runs and outputs an ascii representation of the mandelbrot set.

Actual Behavior

The program segfaults before reaching main.

@ominitay ominitay added the bug Observed behavior contradicts documented or intended behavior label Oct 16, 2022
@ominitay
Copy link
Contributor Author

This does not occur in the other two release modes, or in any release mode using the stage1 compiler.

@ominitay
Copy link
Contributor Author

ominitay commented Oct 16, 2022

The file is available at this link, since GitHub doesn't allow me to attach Zig files. http://0x0.st/owzu.zig
The source file isn't very readable, and is about 3kloc long, as it was generated by a brainfuck compiler.

@mikdusan
Copy link
Member

mikdusan commented Oct 16, 2022

This issue appears to manifest with stage2 but not with stage1. To be clear I am not sure this is a bug, it could just be that stage2 inlines better all the comptime known stuff.

Basically, stage2 generates a large 24 MiB frame which exceeds most default stack size limits. This compares to stage1 frame less than 64 KiB:

stage2 (25,381,504 bytes)

z1.main:
.Lfunc_begin15:
	.loc	22 3 0
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
	.cfi_def_cfa_register rbp
	mov	eax, 25381504
	call	__zig_probe_stack

stage1 (57,680 bytes)

main:
.Lfunc_begin424:
	.file	51 "/home/mike/project/zig/work/main" "z1.zig"
	.loc	51 3 0
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
	.cfi_def_cfa_register rbp
	mov	eax, 57680
	call	__zig_probe_stack

To work around this, you can force the cell array size to be "runtime" by making it a fn arg. Here's a quick patch:

--- z1.zig	2022-10-16 10:16:24.720551872 -0400
+++ z3.zig	2022-10-16 11:13:29.293411952 -0400
@@ -2,6 +2,10 @@

 pub fn main() void {
     var cells = std.mem.zeroes([30000]u8);
+    doit(&cells);
+}
+
+fn doit(cells: []u8) void {
     var index: u32 = 0;
     const stdout = std.io.getStdOut().writer();

@mikdusan
Copy link
Member

for posterity OP's .zig source can be found here: owzu.zip

@ominitay
Copy link
Contributor Author

ominitay commented Oct 16, 2022

This issue appears to manifest with stage2 but not with stage1. To be clear I am not sure this is a bug, it could just be that stage2 inlines better all the comptime known stuff.

Even if the problem were caused by the compiler excessively inlining, that would still be a bug imo. It's diverging behaviour between release modes, and causing valid Zig code to exhibit UB (stack overflow).

To work around this, you can force the cell array size to be "runtime" by making it a fn arg. Here's a quick patch:

Thanks for that, I should be able to use this workaround to get things working for the time being!

@Vexu
Copy link
Member

Vexu commented Oct 16, 2022

I suspect this to be yet another case of #12215

@matu3ba
Copy link
Contributor

matu3ba commented Oct 17, 2022

At the very least, the compiler should

warn or what should the compiler do?

The problem
Note, that the stack size can be dynamically adjusted, so to get things right an "expected stack size" would need to be given by the user inside build.zig.
For example, threads can set own stack size: https://stackoverflow.com/questions/1678803/how-to-determine-stack-size-of-a-program-in-linux via pthread_attr_setstacksize() and change the Kernel defaults https://linux.die.net/man/2/setrlimit.
Other OSes should have similar functionality.
Handling all cases would require 1. no runtime sandbox to get Kernel access to current used option, 2. no stuff like pthread_attr_setstacksize or somehow computing the limit of what can be used at any possible runtime.

This is still a minimum needed runtime stack size which may be grossly wrong, the maximum runtime stack size can only be over-approximated.

Summary
With that said, I think its not worth adding a check as local optimum for only this case unless we can rule out more classes of broken programs with user input on the program behavior.

Proposal

I think its more viable to provide a way to print out compile-time known minimal needed stack size,
as the used frame size might be just below the threshold and we dont know how much runtime stack size the user wants/needs.

@andrewrk andrewrk added the frontend Tokenization, parsing, AstGen, Sema, and Liveness. label Oct 31, 2022
@andrewrk andrewrk added this to the 0.10.1 milestone Oct 31, 2022
@andrewrk andrewrk modified the milestones: 0.10.1, 0.11.0 Jan 10, 2023
@andrewrk andrewrk modified the milestones: 0.11.0, 0.11.1 Jul 20, 2023
@andrewrk andrewrk modified the milestones: 0.11.1, 0.12.0, 0.13.0 Jan 29, 2024
@andrewrk andrewrk modified the milestones: 0.14.0, 0.15.0 Feb 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Projects
None yet
Development

No branches or pull requests

5 participants