clang++ should always emit `@llvm.trap()` when flowing off the end of a non-void function #75797

Eisenwave · 2023-12-18T14:10:26Z

Currently, when optimizations are enabled, flowing off the end of a function in C++ mode emits unreachable. This means that a function such as:

int get(int x) {
    if (x == 0) { }
    else std::abort();
}

... is equivalent to simply calling abort(), which is caused by emitting unreachable at the end of the function, and having SimplifyCFG eliminate the x == 0 branch. This behavior does not reflect developer intent, and it's possible to fall through into other code sections as a result of simply treating this as optimizable UB.

On debug builds, @llvm.trap() is emitted at the end of the function, which generates a ud2 pseudo-instruction on x86-64. This should also be done with optimizations enabled.

Even with no warning flags enabled, clang emits

<source>:6:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
    6 | }
      | ^

Flowing off the end of a function is a common developer mistake and basically never intentional. It should not be hidden by the optimizer because it may result in security vulnerabilities such as CVE-2014-9296.

If a developer actually wants the current behavior, they can simply write:

int get(int x) {
    if (x == 0) {
        __builtin_unreachable(); // or, since C++23
        std::unreachable(); // or, alternatively
        [[assume(false)]];
    }
    else std::abort();
}

There is zero motivation for violating programmer intent and emitting security vulnerabilities here. This offers no optimization opportunities that could not be trivially obtained with more explicit code.

The text was updated successfully, but these errors were encountered:

llvmbot · 2023-12-18T15:24:54Z

@llvm/issue-subscribers-clang-codegen

Author: Jan Schultke (Eisenwave)

Currently, when optimizations are enabled, flowing off the end of a function in C++ mode emits `unreachable`. This means that a function such as: ```cpp int get(int x) { if (x == 0) { } else std::abort(); } ``` ... is equivalent to simply calling `abort()`, which is caused by emitting `unreachable` at the end of the function, and having `SimplifyCFG` eliminate the `x == 0` branch. This behavior does not reflect developer intent, and it's possible to fall through into other code sections as a result of simply treating this as optimizable UB.

On debug builds, @llvm.trap() is emitted at the end of the function, which generates a ud2 pseudo-instruction on x86-64. This should also be done with optimizations enabled.

Even with no warning flags enabled, clang emits
> none > <source>:6:1: warning: non-void function does not return a value in all control paths [-Wreturn-type] > 6 | } > | ^ >

Flowing off the end of a function is a common developer mistake and basically never intentional. It should not be hidden by the optimizer because it may result in security vulnerabilities such as CVE-2014-9296.

If a developer actually wants the current behavior, they can simply write:

int get(int x) {
    if (x == 0) {
        __builtin_unreachable(); // or, since C++23
        std::unreachable(); // or, alternatively
        [[assume(false)]];
    }
    else std::abort();
}

There is zero motivation for violating programmer intent and emitting security vulnerabilities here. This offers no optimization opportunities that could not be trivially obtained with more explicit code.

shafik · 2023-12-18T16:44:27Z

There are several cases in which invoking undefined behavior can get us into this unenviable position: #48943

This was also discussed here: https://discourse.llvm.org/t/can-we-keep-must-progress-optimizations-around-infinite-loop-in-c-while-avoiding-some-surprising-behavior/69205/5

I think the fix could be simple but measurement and testing is required to make sure this does not cause unintended regressions.

tbaederr · 2024-01-02T06:03:20Z

Flowing off the end of a function is a common developer mistake and basically never intentional. It should not be hidden by the optimizer because it may result in security vulnerabilities such as CVE-2014-9296.

If that is true (and I think it is), the we should (additionally?) make that frontend warning (-Wreturn-type) an error.

thesamesam · 2024-01-02T06:10:07Z

GCC defaults to trapping at -O0 and -Og (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104642).

pcordes · 2024-01-30T07:07:59Z

@Eisenwave wrote:

This offers no optimization opportunities that could not be trivially obtained with more explicit code.

Agreed, now that C++23 std::unreachable(); exists. Before that, portable code didn't have a good way to say "this page intentionally left blank". Although in practice such code should already have used #ifdef __GNUC__ / #define UNREACHABLE __buitin_unreachable() which covers many mainstream compilers so that's still a weak argument. But perhaps some codebases just used comments to document which } intentionally didn't have a preceding return.

Sometimes a path of execution isn't actually reachable but the compiler isn't able to prove that. Having the compiler waste code size on a ud2 (and worse, on the branch instructions to distinguish this case from other inputs) is a small cost.

So there is an optimization opportunity, but agreed that it should be opt-in with an explicit "we intentionally don't return" statement before the closing }.

@tbaederr wrote:

If that is true (and I think it is), the we should (additionally?) make that frontend warning (-Wreturn-type) an error.

Making it a runtime trap (x86 ud2) like at -O0 is sufficient to make it not a security vulnerability (other than potential DOS).

We could make it a compile-time error, but that would be inconsistent with how clang and other compilers handle other cases of compile-time-visible UB. Refusing to compile a valid C++ program is a big deal, not a step to be taken lightly. return (INT_MAX+1)/0 + *(int*)nullptr; in a function that's never called is not a problem for the C++ program containing that function, and neither is int never_called(){}.

What could perhaps be useful is a -Werror-security-relevant option that's like -Werror (promote warnings to errors) but only for a subset of warnings that someone decides are more likely to lead to security vulnerabilities than others. e.g. -Wreturn-type and some array index-range stuff, but maybe not operator-precedence parenthesis warnings.

The warning message for -Wreturn-type could perhaps mention std::unreachable if compiling with -std=gnu++23 or c++23, otherwise mention __builtin_unreachable? Probably not, that would clutter it further and most of the time that's not what the programmer wanted. And it's usually a pretty minor optimization.

github-actions bot added the clang Clang issues not falling into any other category label Dec 18, 2023

EugeneZelenko added clang:codegen IR generation bugs: mangling, exceptions, etc. and removed clang Clang issues not falling into any other category labels Dec 18, 2023

shafik added the confirmed Verified by a second party label Dec 18, 2023

thesamesam added the undefined behaviour label Dec 31, 2023

duk-37 linked a pull request Sep 23, 2024 that will close this issue

[WIP] [CodeGen] Enable TrapUnreachable by default for all targets. #109732

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

clang++ should always emit `@llvm.trap()` when flowing off the end of a non-void function #75797

clang++ should always emit `@llvm.trap()` when flowing off the end of a non-void function #75797

Eisenwave commented Dec 18, 2023 •

edited

Loading

llvmbot commented Dec 18, 2023

shafik commented Dec 18, 2023

tbaederr commented Jan 2, 2024

thesamesam commented Jan 2, 2024

pcordes commented Jan 30, 2024

clang++ should always emit @llvm.trap() when flowing off the end of a non-void function #75797

clang++ should always emit @llvm.trap() when flowing off the end of a non-void function #75797

Comments

Eisenwave commented Dec 18, 2023 • edited Loading

llvmbot commented Dec 18, 2023

shafik commented Dec 18, 2023

tbaederr commented Jan 2, 2024

thesamesam commented Jan 2, 2024

pcordes commented Jan 30, 2024

clang++ should always emit `@llvm.trap()` when flowing off the end of a non-void function #75797

clang++ should always emit `@llvm.trap()` when flowing off the end of a non-void function #75797

Eisenwave commented Dec 18, 2023 •

edited

Loading