Skip to content

in release-safe mode, in functions that return an error, generate an error return instead of crash #426

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
andrewrk opened this issue Aug 20, 2017 · 8 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@andrewrk
Copy link
Member

andrewrk commented Aug 20, 2017

When you create a release build of software, you don't want it to crash. Some people even argue in favor of not asserting in release builds, because they'd rather have the program in an undefined state than crash.

In release-fast mode, we don't do safety checks. Full optimization + undefined behavior. However you can override this default in a per block basis by using @setDebugSafety(this, true);. Then you'll have some release-safe parts of your code even in release-fast mode.

In release-safe mode, we do optimization but we turn on debug safety checks to prevent undefined behavior, such as integer overflow checking. We don't really know what else we could do when an integer overflow occurs besides crash the program.

...or do we?

In a function whose return value is error or %T, we could have that function return an error code, such as error.UndefinedBehavior. Assuming the programmer correctly uses the cleanup idioms, this error will be handled correctly, like the other possible errors in the function, even though the programmer never explicitly considered that this particular error was possible.

If this sounds like hidden control flow, I want to point out that we're talking about how to handle undefined behavior, and crashing the program is certainly also an example of hidden control flow.

When this situation happens, it's certainly a bug. But because the programmer (presumably) has a path to handle errors in these cases, it can be a bug that does not bring down the entire program or pose a security threat or instability to defined behavior of the program.

In the same way that you can override the default panic behavior by making pub fn panic(msg: []const u8) -> noreturn in the root source file, we can have another optional function such as pub fn warn(err: error) which defaults to printing a stack trace to stderr, but can be used to log these bugs. Obtaining a stack trace from the warn function would yield exactly the location that caused the problem.

Because Zig is so adamant about edge cases - for example insisting that memory allocation can fail - it is very common for functions to return a possible error. This makes the surface area that this covers fairly large.

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Aug 20, 2017
@andrewrk andrewrk added this to the 0.2.0 milestone Aug 20, 2017
@thejoshwolfe
Copy link
Contributor

If I start with a function that does not error, like:

    pub fn area(self: &Rect) -> i32 {
        self.width * self.height
    }

how do i take advantage of this feature to catch the overflow?

    pub fn area(self: &Rect) -> %i32 { // ERROR: function doesn't actually return an error
        self.width * self.height
    }

@andrewrk
Copy link
Member Author

The second function actually is allowed. Consider if you have a list of function pointers with prototype fn (&Rect) -> %i32 and some of them return errors, you should still be able to have a function in the list that has this prototype and does not return an error.

But even considering the first function, sure, it's an example where this proposal wouldn't affect anything and we'd still have to crash inside area. Let's look at a real world example, std/debug.zig which is implementing reading DWARF info and printing a stack trace:

[nix-shell:~/dev/zig/build]$ grep '\bfn\b' ../std/debug.zig  | wc -l
40
[nix-shell:~/dev/zig/build]$ grep '\bfn\b' ../std/debug.zig | grep '\-> %' | wc -l
30

Here 75% of functions have an error return and would be covered by this feature.

@thejoshwolfe
Copy link
Contributor

This makes me think that people will develop "best practices" which is always make functions errorable, and call all functions with %return, even if there's apparently no error case. This turns zig's safety features into paranoia and boilerplate clutter.

This proposal is starting to approach the Java-style pokemon error handling, where an event loop just catches all exceptions from each event handler and continues running regardless of what's going wrong. Is that what we want with this proposal?

Stack-unwinding exception handling is optimized optimistically, but zig's if-driven error handling is not. If the "best" way to write zig code is to make everything errorable, zig would be better off with C++-style exceptions.

@andrewrk
Copy link
Member Author

But we're not asking users to make everything an error union return value. We're just asking ourselves, what's the correct thing for zig to do when debug safety is violated in release mode?

@thejoshwolfe
Copy link
Contributor

what's the correct thing for zig to do when debug safety is violated in release mode?

Call the panic function. If users want to implement "continue and hope for the best" like Java pokemon programming, they can have their panic function longjmp back to a known state or something.

There's no generally correct solution to handling assertion failures in production. I'm not comfortable implementing or suggesting a language-level half solution. Any half solutions that go into production should be implemented and understood by the application maintainers.

I might have a different opinion if I had more experience in this domain; I'm still approaching this project as an idealist. I'd like to see more evidence of how this feature would be valuable.

@raulgrell
Copy link
Contributor

raulgrell commented Aug 22, 2017

I don't know how I feel about this - release-safe is still safe, and safety means we shouldn't continue running the program when it's in an undefined state. Calling a panic function which can be overriden seems like the most prudent idea.

If I switch on the error union and cover every possible error, without an else clause, where would my error.UndefinedBehaviour get handled? Would that be valid?

Also, from the Zen of Zig: Runtime crashes are better than bugs.

@andrewrk
Copy link
Member Author

If people don't like it, I won't do it. It's certainly easier to keep things crashing, as status quo. But let's talk through it some more.

Also, from the Zen of Zig: Runtime crashes are better than bugs.

Damn. Tough to argue with this. (Not being sarcastic, I really love that you're bringing this up.)

In order for this proposal to be a good idea, I need to argue that it is not a bug to handle undefined behavior this way. I think this is related to your next question below:

If I switch on the error union and cover every possible error, without an else clause, where would my error.UndefinedBehaviour get handled? Would that be valid?

In this case, it would be handled the same way that every switch in ReleaseSafe mode which has no else clause is handled. There is a default else => unreachable. So the function would once again bubble up "unreachable code reached", or if the function is not error or %T, it would crash by calling the panic function.

Call the panic function. If users want to implement "continue and hope for the best" like Java pokemon programming, they can have their panic function longjmp back to a known state or something.

There's no generally correct solution to handling assertion failures in production. I'm not comfortable implementing or suggesting a language-level half solution. Any half solutions that go into production should be implemented and understood by the application maintainers.

This proposal is offering something a little bit more robust. The premise is that the user uses zig's idioms perfectly, and functions which have an %T signature use the %defer / defer features, so that any line of code that reads return error.Foo is guaranteed to clean up correctly.

Given these premises, and further given that we're deciding what to do when undefined behavior is invoked, it seems reasonable to me to take advantage of this already existing error handling mechanism for undefined behavior that we are catching.

Consider, if, when you type %T into a function's return value, you are taking on the responsibility that any code you might type into the function could return from the function with error.UndefinedBehavior. To me this does not add any complexity into the understanding of how the application will behave. We already need the calling function to correctly handle an error value; what extra difficulty does it add to have one more possible error code?

As a reminder, we're talking about what happens when there is a bug in the programmer's code that is invoking undefined behavior. The programmer has (accidentally or purposefully) agreed that however we handle the situation is valid according to the semantics defined by them.

And then finally, I'll say what I said earlier, again:

If people don't like it, I won't do it.

@robinei
Copy link

robinei commented Sep 16, 2017

It just seems a little too ad hoc, since a lot of places where undefined behavior can happen will be in functions that do not return an error union. So the same code in different contexts will either crash the program or cause an error to be returned.

And this will force people to have to code in an "exception safe" way which they won't have to otherwise, because undefined behavior can hide where you don't expect it. defer should be a tool to make it easier to write correct code, but not required. It seems bad to throw out C philosophy that all control flow is visible, which has so far been kept in Zig.

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@andrewrk andrewrk added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. rejected and removed enhancement Solving this issue will likely involve adding new logic or components to the codebase. labels Feb 9, 2018
@andrewrk andrewrk closed this as completed Feb 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

4 participants