Skip to content

clang++ should always emit @llvm.trap() when flowing off the end of a non-void function #75797

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Eisenwave opened this issue Dec 18, 2023 · 5 comments · May be fixed by #109732
Open

clang++ should always emit @llvm.trap() when flowing off the end of a non-void function #75797

Eisenwave opened this issue Dec 18, 2023 · 5 comments · May be fixed by #109732
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. confirmed Verified by a second party undefined behaviour

Comments

@Eisenwave
Copy link

Eisenwave commented Dec 18, 2023

Currently, when optimizations are enabled, flowing off the end of a function in C++ mode emits unreachable. This means that a function such as:

int get(int x) {
    if (x == 0) { }
    else std::abort();
}

... is equivalent to simply calling abort(), which is caused by emitting unreachable at the end of the function, and having SimplifyCFG eliminate the x == 0 branch. This behavior does not reflect developer intent, and it's possible to fall through into other code sections as a result of simply treating this as optimizable UB.

On debug builds, @llvm.trap() is emitted at the end of the function, which generates a ud2 pseudo-instruction on x86-64. This should also be done with optimizations enabled.

Even with no warning flags enabled, clang emits

<source>:6:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
    6 | }
      | ^

Flowing off the end of a function is a common developer mistake and basically never intentional. It should not be hidden by the optimizer because it may result in security vulnerabilities such as CVE-2014-9296.

If a developer actually wants the current behavior, they can simply write:

int get(int x) {
    if (x == 0) {
        __builtin_unreachable(); // or, since C++23
        std::unreachable(); // or, alternatively
        [[assume(false)]];
    }
    else std::abort();
}

There is zero motivation for violating programmer intent and emitting security vulnerabilities here. This offers no optimization opportunities that could not be trivially obtained with more explicit code.

@github-actions github-actions bot added the clang Clang issues not falling into any other category label Dec 18, 2023
@EugeneZelenko EugeneZelenko added clang:codegen IR generation bugs: mangling, exceptions, etc. and removed clang Clang issues not falling into any other category labels Dec 18, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 18, 2023

@llvm/issue-subscribers-clang-codegen

Author: Jan Schultke (Eisenwave)

Currently, when optimizations are enabled, flowing off the end of a function in C++ mode emits `unreachable`. This means that a function such as: ```cpp int get(int x) { if (x == 0) { } else std::abort(); } ``` ... is equivalent to simply calling `abort()`, which is caused by emitting `unreachable` at the end of the function, and having `SimplifyCFG` eliminate the `x == 0` branch. This behavior does not reflect developer intent, and it's possible to fall through into other code sections as a result of simply treating this as optimizable UB.

On debug builds, @<!-- -->llvm.trap() is emitted at the end of the function, which generates a ud2 pseudo-instruction on x86-64. This should also be done with optimizations enabled.

Even with no warning flags enabled, clang emits
> none &gt; &lt;source&gt;:6:1: warning: non-void function does not return a value in all control paths [-Wreturn-type] &gt; 6 | } &gt; | ^ &gt;

Flowing off the end of a function is a common developer mistake and basically never intentional. It should not be hidden by the optimizer because it may result in security vulnerabilities such as CVE-2014-9296.

If a developer actually wants the current behavior, they can simply write:

int get(int x) {
    if (x == 0) {
        __builtin_unreachable(); // or, since C++23
        std::unreachable(); // or, alternatively
        [[assume(false)]];
    }
    else std::abort();
}

There is zero motivation for violating programmer intent and emitting security vulnerabilities here. This offers no optimization opportunities that could not be trivially obtained with more explicit code.

@shafik
Copy link
Collaborator

shafik commented Dec 18, 2023

There are several cases in which invoking undefined behavior can get us into this unenviable position: #48943

This was also discussed here: https://discourse.llvm.org/t/can-we-keep-must-progress-optimizations-around-infinite-loop-in-c-while-avoiding-some-surprising-behavior/69205/5

I think the fix could be simple but measurement and testing is required to make sure this does not cause unintended regressions.

@shafik shafik added the confirmed Verified by a second party label Dec 18, 2023
@tbaederr
Copy link
Contributor

tbaederr commented Jan 2, 2024

Flowing off the end of a function is a common developer mistake and basically never intentional. It should not be hidden by the optimizer because it may result in security vulnerabilities such as CVE-2014-9296.

If that is true (and I think it is), the we should (additionally?) make that frontend warning (-Wreturn-type) an error.

@thesamesam
Copy link
Member

GCC defaults to trapping at -O0 and -Og (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104642).

@pcordes
Copy link

pcordes commented Jan 30, 2024

@Eisenwave wrote:

This offers no optimization opportunities that could not be trivially obtained with more explicit code.

Agreed, now that C++23 std::unreachable(); exists. Before that, portable code didn't have a good way to say "this page intentionally left blank". Although in practice such code should already have used #ifdef __GNUC__ / #define UNREACHABLE __buitin_unreachable() which covers many mainstream compilers so that's still a weak argument. But perhaps some codebases just used comments to document which } intentionally didn't have a preceding return.

Sometimes a path of execution isn't actually reachable but the compiler isn't able to prove that. Having the compiler waste code size on a ud2 (and worse, on the branch instructions to distinguish this case from other inputs) is a small cost.

So there is an optimization opportunity, but agreed that it should be opt-in with an explicit "we intentionally don't return" statement before the closing }.


@tbaederr wrote:

If that is true (and I think it is), the we should (additionally?) make that frontend warning (-Wreturn-type) an error.

Making it a runtime trap (x86 ud2) like at -O0 is sufficient to make it not a security vulnerability (other than potential DOS).

We could make it a compile-time error, but that would be inconsistent with how clang and other compilers handle other cases of compile-time-visible UB. Refusing to compile a valid C++ program is a big deal, not a step to be taken lightly. return (INT_MAX+1)/0 + *(int*)nullptr; in a function that's never called is not a problem for the C++ program containing that function, and neither is int never_called(){}.

What could perhaps be useful is a -Werror-security-relevant option that's like -Werror (promote warnings to errors) but only for a subset of warnings that someone decides are more likely to lead to security vulnerabilities than others. e.g. -Wreturn-type and some array index-range stuff, but maybe not operator-precedence parenthesis warnings.

The warning message for -Wreturn-type could perhaps mention std::unreachable if compiling with -std=gnu++23 or c++23, otherwise mention __builtin_unreachable? Probably not, that would clutter it further and most of the time that's not what the programmer wanted. And it's usually a pretty minor optimization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. confirmed Verified by a second party undefined behaviour
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants