Skip to content

clang fails to eliminate redundant bounds check from std::vector::push_back #87010

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
efriedma-quic opened this issue Mar 28, 2024 · 7 comments

Comments

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Mar 28, 2024

Consider the following:

#include <vector>
void f1(std::vector<int>&x) { x.push_back(1);}

Compile with -O2 -stdlib=libc++, and looks at the resulting assembly. (https://godbolt.org/z/d3GzvEGY4)

There are two bounds checks on the size of the vector: the first in the implementation of std::vector itself, which checks that the length of the vector isn't too large, and calls __throw_length_error if it fails. Then, there's a check inside the allocator that the allocation isn't too large, and calls __throw_bad_array_new_length if it fails. But the second check shouldn't be necessary: if the vector isn't too large, the allocation also isn't too large, since "too large" is defined exactly the same way for both checks.

(Briefly discussed in #64132, but I'm breaking it out into its own issue for clarity.)

CC @fhahn @hiraditya @philnik777

@github-actions github-actions bot added the clang Clang issues not falling into any other category label Mar 28, 2024
@EugeneZelenko EugeneZelenko added llvm:optimizations missed-optimization and removed clang Clang issues not falling into any other category labels Mar 28, 2024
@antoniofrighetto
Copy link
Contributor

antoniofrighetto commented Mar 29, 2024

Maybe I missed something while trying to reduce this, but the underlying case seems to be already handled by InstCombine: https://alive2.llvm.org/ce/z/5Ew7Fo.

@antoniofrighetto
Copy link
Contributor

Partially reduced to the following (https://llvm.godbolt.org/z/PxTons86M):

define i64 @src(ptr %p, i64 %C, i64 %D, i64 %E) {
entry:
  %cond = icmp ult i64 %D, 4611686018427387903
  tail call void @llvm.assume(i1 %cond)                  ; Assume %D always <u 4611686018427387903
  %1 = ptrtoint ptr %p to i64
  %2 = sub i64 %1, %C
  %cond2 = icmp ult i64 %2, 9223372036854775804
  %3 = ashr exact i64 %2, 1                              ; New upper bound for %2: <u 4611686018427387902
  %4 = tail call i64 @llvm.umax.i64(i64 %3, i64 %D)      ; %4 cannot be >u 4611686018427387903
  %5 = select i1 %cond2, i64 %4, i64 4611686018427387903 ; Either 4611686018427387903 or lower
  %6 = icmp eq i64 %5, 0
  br i1 %6, label %return, label %val_not_zero

val_not_zero:                                          ; preds = %entry
  %8 = icmp ugt i64 %5, 4611686018427387903              ; %8 cannot be >u 4611686018427387903, cmp always false
  br i1 %8, label %bad_array_length, label %add_and_call

bad_array_length:                                      ; preds = %val_not_zero
  call void @bad_array_new_length()
  unreachable

add_and_call:                                          ; preds = %val_not_zero
  %11 = add i64 %E, 2
  tail call void @opaque_2(i64 noundef %11)
  br label %return

return:                                                ; preds = %add_and_call, %entry
  %res = call i64 @opaque(i64 %D)
  ret i64 %res
}

As the comparison %8 in basic block val_not_zero could not be merged in entry, InstCombine fails at simplifying %8 into false. The optimization succeeds when add_and_call can be currently simplified (e.g., when having an unconditional jump to return: https://llvm.godbolt.org/z/53Gc8bMsE). I'm not exactly sure if the missed optimization is in SimplifyCFG (e.g., we may speculatively execute %8 in entry with a bunch of select, I don't think it's possible here though) or can be done elsewhere (e.g., in ConstraintElim). Any thoughts? cc/ @dtcxzyw

@dtcxzyw
Copy link
Member

dtcxzyw commented Apr 2, 2024

We can fix this by adding support for %2 <u 9223372036854775804 --> %4 <u 4611686018427387903 (at use) in LazyValueInfoImpl::getValueFromICmpCondition. However, it may be blocked by the following code :(

// If the value is undef, a different value may be chosen in
// the select condition.
if (isGuaranteedNotToBeUndef(Cond, AC)) {

@antoniofrighetto
Copy link
Contributor

We can fix this by adding support for %2 <u 9223372036854775804 --> %4 <u 4611686018427387903 (at use) in LazyValueInfoImpl::getValueFromICmpCondition. However, it may be blocked by the following code :(

Unfortunately that seems to be the case when the select instruction gets processed :( Guess there's no other way but to mark the parameters as noundef, would that still make sense?

@dtcxzyw
Copy link
Member

dtcxzyw commented Apr 3, 2024

Guess there's no other way but to mark the parameters as noundef

Yeah, we cannot mark the loaded ptr %p as noundef :(

@antoniofrighetto
Copy link
Contributor

OK my bad, still reproducible, the issue does not show up only when linking against GNU C++ stdlib.

@antoniofrighetto
Copy link
Contributor

Ping @nikic, any alternatives for this by any chance (as LVI doesn't seem to be an option)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants