Skip to content

Commit a6b4204

Browse files
author
Tomasz Kamiński
committed
[analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal
To illustrate our current understanding, let's start with the following program: https://godbolt.org/z/33f6vheh1 ```lang=c++ void clang_analyzer_printState(); struct C { int x; int y; int more_padding; }; struct D { C c; int z; }; C foo(D d, int new_x, int new_y) { d.c.x = new_x; // B1 assert(d.c.x < 13); // C1 C c = d.c; // L assert(d.c.y < 10); // C2 assert(d.z < 5); // C3 d.c.y = new_y; // B2 assert(d.c.y < 10); // C4 return c; // R } ``` In the code, we create a few bindings to subregions of root region `d` (`B1`, `B2`), a constrain on the values (`C1`, `C2`, ….), and create a `lazyCompoundVal` for the part of the region `d` at point `L`, which is returned at point `R`. Now, the question is which of these should remain live as long the return value of the `foo` call is live. In perfect a word we should preserve: # only the bindings of the subregions of `d.c`, which were created before the copy at `L`. In our example, this includes `B1`, and not `B2`. In other words, `new_x` should be live but `new_y` shouldn’t. # constraints on the values of `d.c`, that are reachable through `c`. This can be created both before the point of making the copy (`L`) or after. In our case, that would be `C1` and `C2`. But not `C3` (`d.z` value is not reachable through `c`) and `C4` (the original value of`d.c.y` was overridden at `B2` after the creation of `c`). The current code in the `RegionStore` covers the use case (1), by using the `getInterestingValues()` to extract bindings to parts of the referred region present in the store at the point of copy. This also partially covers point (2), in case when constraints are applied to a location that has binding at the point of the copy (in our case `d.c.x` in `C1` that has value `new_x`), but it fails to preserve the constraints that require creating a new symbol for location (`d.c.y` in `C2`). We introduce the concept of //lazily copied// locations (regions) to the `SymbolReaper`, i.e. for which a program can access the value stored at that location, but not its address. These locations are constructed as a set of regions referred to by `lazyCompoundVal`. A //readable// location (region) is a location that //live// or //lazily copied// . And symbols that refer to values in regions are alive if the region is //readable//. For simplicity, we follow the current approach to live regions and mark the base region as //lazily copied//, and consider any subregions as //readable//. This makes some symbols falsy live (`d.z` in our example) and keeps the corresponding constraints alive. The rename `Regions` to `LiveRegions` inside `RegionStore` is NFC change, that was done to make it clear, what is difference between regions stored in this two sets. Regression Test: https://reviews.llvm.org/D134941 Co-authored-by: Balazs Benics <[email protected]> Reviewed By: martong, xazax.hun Differential Revision: https://reviews.llvm.org/D134947
1 parent 586fc59 commit a6b4204

File tree

4 files changed

+81
-11
lines changed

4 files changed

+81
-11
lines changed

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -582,7 +582,12 @@ class SymbolReaper {
582582
SymbolMapTy TheLiving;
583583
SymbolSetTy MetadataInUse;
584584

585-
RegionSetTy RegionRoots;
585+
RegionSetTy LiveRegionRoots;
586+
// The lazily copied regions are locations for which a program
587+
// can access the value stored at that location, but not its address.
588+
// These regions are constructed as a set of regions referred to by
589+
// lazyCompoundVal.
590+
RegionSetTy LazilyCopiedRegionRoots;
586591

587592
const StackFrameContext *LCtx;
588593
const Stmt *Loc;
@@ -628,8 +633,8 @@ class SymbolReaper {
628633

629634
using region_iterator = RegionSetTy::const_iterator;
630635

631-
region_iterator region_begin() const { return RegionRoots.begin(); }
632-
region_iterator region_end() const { return RegionRoots.end(); }
636+
region_iterator region_begin() const { return LiveRegionRoots.begin(); }
637+
region_iterator region_end() const { return LiveRegionRoots.end(); }
633638

634639
/// Returns whether or not a symbol has been confirmed dead.
635640
///
@@ -640,13 +645,20 @@ class SymbolReaper {
640645
}
641646

642647
void markLive(const MemRegion *region);
648+
void markLazilyCopied(const MemRegion *region);
643649
void markElementIndicesLive(const MemRegion *region);
644650

645651
/// Set to the value of the symbolic store after
646652
/// StoreManager::removeDeadBindings has been called.
647653
void setReapedStore(StoreRef st) { reapedStore = st; }
648654

649655
private:
656+
bool isLazilyCopiedRegion(const MemRegion *region) const;
657+
// A readable region is a region that live or lazily copied.
658+
// Any symbols that refer to values in regions are alive if the region
659+
// is readable.
660+
bool isReadableRegion(const MemRegion *region);
661+
650662
/// Mark the symbols dependent on the input symbol as live.
651663
void markDependentsLive(SymbolRef sym);
652664
};

clang/lib/StaticAnalyzer/Core/RegionStore.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2838,6 +2838,10 @@ void RemoveDeadBindingsWorker::VisitBinding(SVal V) {
28382838
// Is it a LazyCompoundVal? All referenced regions are live as well.
28392839
if (Optional<nonloc::LazyCompoundVal> LCS =
28402840
V.getAs<nonloc::LazyCompoundVal>()) {
2841+
// TODO: Make regions referred to by `lazyCompoundVals` that are bound to
2842+
// subregions of the `LCS.getRegion()` also lazily copied.
2843+
if (const MemRegion *R = LCS->getRegion())
2844+
SymReaper.markLazilyCopied(R);
28412845

28422846
const RegionStoreManager::SValListTy &Vals = RM.getInterestingValues(*LCS);
28432847

clang/lib/StaticAnalyzer/Core/SymbolManager.cpp

Lines changed: 16 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -411,10 +411,14 @@ void SymbolReaper::markLive(SymbolRef sym) {
411411
}
412412

413413
void SymbolReaper::markLive(const MemRegion *region) {
414-
RegionRoots.insert(region->getBaseRegion());
414+
LiveRegionRoots.insert(region->getBaseRegion());
415415
markElementIndicesLive(region);
416416
}
417417

418+
void SymbolReaper::markLazilyCopied(const clang::ento::MemRegion *region) {
419+
LazilyCopiedRegionRoots.insert(region->getBaseRegion());
420+
}
421+
418422
void SymbolReaper::markElementIndicesLive(const MemRegion *region) {
419423
for (auto SR = dyn_cast<SubRegion>(region); SR;
420424
SR = dyn_cast<SubRegion>(SR->getSuperRegion())) {
@@ -437,8 +441,7 @@ bool SymbolReaper::isLiveRegion(const MemRegion *MR) {
437441
// is not used later in the path, we can diagnose a leak of a value within
438442
// that field earlier than, say, the variable that contains the field dies.
439443
MR = MR->getBaseRegion();
440-
441-
if (RegionRoots.count(MR))
444+
if (LiveRegionRoots.count(MR))
442445
return true;
443446

444447
if (const auto *SR = dyn_cast<SymbolicRegion>(MR))
@@ -454,6 +457,15 @@ bool SymbolReaper::isLiveRegion(const MemRegion *MR) {
454457
return isa<AllocaRegion, CXXThisRegion, MemSpaceRegion, CodeTextRegion>(MR);
455458
}
456459

460+
bool SymbolReaper::isLazilyCopiedRegion(const MemRegion *MR) const {
461+
// TODO: See comment in isLiveRegion.
462+
return LazilyCopiedRegionRoots.count(MR->getBaseRegion());
463+
}
464+
465+
bool SymbolReaper::isReadableRegion(const MemRegion *MR) {
466+
return isLiveRegion(MR) || isLazilyCopiedRegion(MR);
467+
}
468+
457469
bool SymbolReaper::isLive(SymbolRef sym) {
458470
if (TheLiving.count(sym)) {
459471
markDependentsLive(sym);
@@ -464,7 +476,7 @@ bool SymbolReaper::isLive(SymbolRef sym) {
464476

465477
switch (sym->getKind()) {
466478
case SymExpr::SymbolRegionValueKind:
467-
KnownLive = isLiveRegion(cast<SymbolRegionValue>(sym)->getRegion());
479+
KnownLive = isReadableRegion(cast<SymbolRegionValue>(sym)->getRegion());
468480
break;
469481
case SymExpr::SymbolConjuredKind:
470482
KnownLive = false;

clang/test/Analysis/trivial-copy-struct.cpp

Lines changed: 46 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -49,11 +49,53 @@ void deadCode(List orig) {
4949
if (c.value == 42)
5050
return;
5151
clang_analyzer_value(c.value);
52-
// expected-warning@-1 {{32s:{ [-2147483648, 2147483647] }}}
53-
// The symbol was garbage collected too early, hence we lose the constraints.
52+
// expected-warning@-1 {{32s:{ [-2147483648, 41], [43, 2147483647] }}}
53+
// Before symbol was garbage collected too early, and we lost the constraints.
5454
if (c.value != 42)
5555
return;
5656

57-
// Dead code should be unreachable
58-
clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}
57+
clang_analyzer_warnIfReached(); // no-warning: Dead code.
58+
};
59+
60+
void ptr1(List* n) {
61+
List* n2 = new List(*n); // ctor
62+
if (!n->next) {
63+
if (n2->next) {
64+
clang_analyzer_warnIfReached(); // unreachable
65+
}
66+
}
67+
delete n2;
68+
}
69+
70+
void ptr2(List* n) {
71+
List* n2 = new List(); // ctor
72+
*n2 = *n; // assignment
73+
if (!n->next) {
74+
if (n2->next) {
75+
clang_analyzer_warnIfReached(); // unreachable
76+
}
77+
}
78+
delete n2;
79+
}
80+
81+
struct Wrapper {
82+
List head;
83+
int count;
84+
};
85+
86+
void nestedLazyCompoundVal(List* n) {
87+
Wrapper* w = 0;
88+
{
89+
Wrapper lw;
90+
lw.head = *n;
91+
w = new Wrapper(lw);
92+
}
93+
if (!n->next) {
94+
if (w->head.next) {
95+
// FIXME: Unreachable, w->head is a copy of *n, therefore
96+
// w->head.next and n->next are equal
97+
clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}
98+
}
99+
}
100+
delete w;
59101
}

0 commit comments

Comments
 (0)