-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C++: IR Alias Analysis for smart pointers #5737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I mostly have nitpicky comments.
cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/Instruction.qll
Show resolved
Hide resolved
cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasAnalysis.qll
Outdated
Show resolved
Hide resolved
cpp/ql/src/semmle/code/cpp/models/implementations/SmartPointer.qll
Outdated
Show resolved
Hide resolved
`DataFlowFunction` models treat references a pointers - an explicit level of indirection. The AST dataflow library generally treats references as if they were the referred-to object. This commit removes a workaround in the dataflow model for unary `operator*` on smart pointers, and makes the AST dataflow library adjust the results of querying the model so that a returned reference only gets flow that was modeled as going to the dereference of the return value. This fixes some missing flow in IR dataflow, and recovers some (presumably) missing reverse taint flow in AST taint tracking as well.
Smart pointer constructors, assignments, and `reset()` can actually have fairly large side effects, especially with custom deleters, destructors for objects being destroyed, and so on. I've re-introduced `{AllAliased}` side effects for these functions. There was no immediate effect on analysis results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes LGTM!
If this PR only changed smart-pointer things I'd be happy to merge this now, but maybe it'd be good to run a CPP-differences to make sure that 3832100 didn't have any unfortunate consequences.
I've restarted the CPP-differences run: https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1945/ |
https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1948 failed again with an error:
I see that 97a13c9 is the commit that's being compared in the CPP-differences run, but it's not the HEAD of this PR (which might be why the job is failing). I've started another CPP-differences that compares the latest-merged |
CPP-Differences shows no changes, and I didn't really expect any. I think we're good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM then!
This PR brings IR alias analysis of smart pointers more-or-less up to parity with the analysis of raw pointers, at least when building aliased SSA. I recommend reviewing commit-by-commit.
CallInstruction::getAParameterSideEffect()
to get the side effects for the specified parameter index.memory.h
into a sharedinclude
directory in the test tree, and added the necessary parts ofutility.h
andtype_traits.h
as well.AliasFunction
model to allow specifying pointer flow between arbitrary inputs and outputs. Note that pointer flow differs from data flow in that pointer flow is "must" flow, but data flow is "may" flow. We may want to consider consolidating these into a single model with aboolean
parameter someday.std::weak_ptr
toString()
of dynamic allocations stable, so that they can be used in inline expectations.There are still some taint test failures, which are caused because I stopped excluding
operator*()
from the list of modeled unwrapper functions. This is being discussed on Slack.