Skip to content

python: enable summaries from model #12581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 26, 2023

Conversation

yoff
Copy link
Contributor

@yoff yoff commented Mar 20, 2023

This requires a change to the shared interface:
Making getNodeFromPath public.

This because Python is doing its own thing and identifying call-backs.

I am unsure if this constitutes a feature yet, or if we should add a CSV parser first?

@yoff yoff force-pushed the python/enable-summaries-from-models branch from c3e6819 to efa36d2 Compare March 24, 2023 08:53
yoff added 4 commits June 18, 2023 21:52
This requires a change to the shared interface:
Making `getNodeFromPath` public.

This because Python is doing its own thing and identifying call-backs.
@yoff yoff force-pushed the python/enable-summaries-from-models branch from efa36d2 to 2296410 Compare June 18, 2023 20:02
yoff added 2 commits June 19, 2023 11:41
`base` is already the `CallNode` we want.
@yoff yoff marked this pull request as ready for review June 20, 2023 08:52
@yoff yoff requested review from a team as code owners June 20, 2023 08:52
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 💪

Although a bit of a nitpick, can we please change Foo in the package name to foo? -- it just stands out as very non-standard.

I think it's a shame that the data-flow tests and taint-tracking tests are almost identical.

For taint-tracking we've mostly used InlineTaintTest.qll, so maybe we can use that alongside the data-flow test, and only have a single TestSummaries.qll file?

@RasmusWL
Copy link
Member

I looked a bit at where the existing NormalTaintTest was used, and the only case is https://github.com/github/codeql/blob/11c89adbe3b238e3a142256175f0003e19d7972b/python/ql/test/experimental/dataflow/summaries/summaries.py -- I think that could also benefit from being rewritten to have BOTH dataflow and taint-tracking tests, instead of only having taint tests.

also change `Foo` -> `foo`
@yoff
Copy link
Contributor Author

yoff commented Jun 20, 2023

I tried putting it in one file now, I agree we should consolidate all our summary tests at some point..

@yoff yoff requested a review from RasmusWL June 20, 2023 14:14
@calumgrant calumgrant requested a review from asgerf June 21, 2023 08:36
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the two code comments, I also have a stylistic recommendation for your Python code.

  1. put trailing commas after last argument (means diff for adding a new argument in the future is a bit more clean)
  2. do not indent closing parentheses (

Both follows black formatter (although it would try to put both arguments to ensure_tainted on one line 😮‍💨)

specifically it would change your code like this:

 ensure_tainted(
     tainted_list_el,  # $ tainted
-    tainted_list_el[0]  # $ tainted
-    )
+    tainted_list_el[0],  # $ tainted
+)

Comment on lines 38 to 40

tainted = MS_identity(TAINTED_STRING)
ensure_tainted(tainted) # $ tainted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you have just shown that we have data-flow, and we know all dataflow steps are also taint-flow steps, I think it's fine to only check taint-flow for the cases where there is NOT dataflow.

I think that will make the test-file a bit easier to read as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tried to make it more consistent now by not checking taint when dataflow is already established and when we do check taint, check both the collection and the expected element.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking some more about this, I think these tests should be about whether you can write flow summaries in CSV files that do the right thing.

Having all these extra taint steps are closer to what our current taint-tracking does in the end, but from my point of view, doesn't help us achieve the goal of the tests.

I removed the TAINTED_LIST parts locally, but thought it was a bit too controversial to just commit directly to your PR -- instead I've put the commits here: yoff#79 (if you agree, we can add these to main later on 👍)

yoff and others added 2 commits June 22, 2023 11:31
- do not test taint flow whne dataflow is established
- test taint of both the collection and the expected element
@yoff yoff requested a review from RasmusWL June 22, 2023 09:54
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment otherwise LGTM

@yoff yoff requested a review from asgerf June 23, 2023 08:15
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the tests pass.

There seem to be some test expectations that need updating, probably due to some of the recent changes to inline test expectations.

Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved for now, tests can be adjusted later 👍

@RasmusWL RasmusWL merged commit 9c5aff3 into github:main Jun 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants