Skip to content

C++: Fix indirect global-variable flow #14736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 10, 2023

Conversation

MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Nov 9, 2023

The logic for handling indirections of global variables was all wrong 😭. This PR hopefully fixes this logic 🤞.

The first commit adds a bunch of tests. I've then split the fix into a sequence of small commits. Finally, the last commit accepts the test changes.

There are a couple of new inconsistencies regarding missing toString on some esoteric syntax. I haven't investigated those yet, but I don't think they should block this PR.

@jketema I'm requesting a review from you because you initially identified these missing steps. Let me know what I can do to help the review process here!

@github-actions github-actions bot added the C++ label Nov 9, 2023
@MathiasVP MathiasVP force-pushed the fix-global-indirect-flow branch from 01c5c58 to 39b9d2e Compare November 9, 2023 20:29
@MathiasVP
Copy link
Contributor Author

MathiasVP commented Nov 10, 2023

Here's my investigations so far:

cpp/unbounded-write on vim looks good and are genuinely new results. The new results are caused by us recognizing flow from the output argument: https://github.com/vim/vim/blob/a5a1ec1826c0e43d0282ba4d35c155a97bab3e27/src/option.c#L2016

to this strncpy that stores to a global: https://github.com/vim/vim/blob/a5a1ec1826c0e43d0282ba4d35c155a97bab3e27/src/option.c#L2088

and then flows to another use of the global here: https://github.com/vim/vim/blob/a5a1ec1826c0e43d0282ba4d35c155a97bab3e27/src/memline.c#L2568

The new cpp/system-data-exposure results are also because we recognize more writes to this global variable. In this case we see that name[0] (and name[1]) are written to the global in this function: https://github.com/vim/vim/blob/a5a1ec1826c0e43d0282ba4d35c155a97bab3e27/src/term.c#L6229

... And the new cpp/tainted-format-string are also because of this global variable 😂 (and follows pretty much the same path as the cpp/unbounded-write ones).

The new Kamailio result is interesting! Unfortunately, the global variable of interest is in a generated file so I can't link to the code. Here are some screenshots, though:

We write to a global variable here: Screenshot 2023-11-10 at 11 45 29

yy_text_ptr is a macro that expands to the yy_text global that is defined here:
Screenshot 2023-11-10 at 11 45 44

That global is then used almost immediately by a write to another global yy_last_accepting_cpos here:
Screenshot 2023-11-10 at 11 46 19

Which then taints the yy_last_accepting_cpos global:
Screenshot 2023-11-10 at 11 46 31

The flows all look genuine, so I'm happy with all of these new results 🎉

@MathiasVP MathiasVP marked this pull request as ready for review November 10, 2023 11:54
@MathiasVP MathiasVP requested a review from a team as a code owner November 10, 2023 11:54
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Nov 10, 2023
@MathiasVP MathiasVP requested a review from jketema November 10, 2023 12:24
@jketema
Copy link
Contributor

jketema commented Nov 10, 2023

  • bb5a78d also seem to do a replacement at the very bottom, not just a refactoring. What's the purpose of that replacement?
  • fd26ae1 seems to do something different than its commit message advertises?

@MathiasVP
Copy link
Contributor Author

  • bb5a78d also seem to do a replacement at the very bottom, not just a refactoring. What's the purpose of that replacement?

Oops. You're right. The bottom line in that commit should actually have been part of fd26ae1.

  • fd26ae1 seems to do something different than its commit message advertises?

I think the commit message is accurate. The commit changes the implementation of getSourceVariable for the GlobalUse class so that it obtains the SSA variable (which can be thought of as a regular variable along with an indirection count) by calculating the indirection count using the getIndirection member predicate (instead of the indirection index).

For consistency this commit also implements a getIndirection member predicate for GlobalDefs and uses that to implement getSourceVariable the same way as GlobalUse does it, but since getIndirection() just returns the indirection index this doesn't actually change anything in this case.

Comment on lines 150 to 149
indirectionIndex = [0 .. defIndex] + 1
isDef(_, _, _, vai, indirection, indirectionIndex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change seems to offset the final argument of isDef by 1, is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. So after this change any write to a global variable with indirection index indirectionIndex generates a GlobalUse SSA read at the end of the enclosing function with indirection index indirectionIndex.

Copy link
Contributor

@jketema jketema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Note that I had some 1:1 discussion with @MathiasVP, which clarified most.

@MathiasVP MathiasVP merged commit 2ceb4cf into github:main Nov 10, 2023
bdrodes pushed a commit to microsoft/codeql that referenced this pull request Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants