Python: Regexp: Handle repetions {n} (with no ,) #3500

yoff · 2020-05-18T12:50:11Z

This fixes false-positive (FP) report #2403, tracked internally as https://github.com/github/codeql-python-team/issues/85.
The code currently handles only the {n,m} repetition syntax. This PR adds support for {n}.
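
For reference, both quantifier forms are valid in Python's re module. The snippet below is only an illustration with made-up patterns (they are not taken from the FP report), and it assumes, based on the branch name UnmatchableDollar, that the false positive involved a $ placed after a {n} repetition:

```python
import re

# {n}: exactly three x's, then end of string.
exact = re.compile(r"ax{3}$")
# {n,m}: three to five x's, then end of string.
ranged = re.compile(r"ax{3,5}$")

exact_hit = exact.match("axxx")        # the $ after {3} is reachable
exact_miss = exact.match("axxxx")      # four x's do not satisfy {3}$
ranged_hit = ranged.match("axxxx")     # four x's satisfy {3,5}$

print(exact_hit is not None, exact_miss is None, ranged_hit is not None)
```

In both patterns the trailing $ can match, so an analysis that reports it as unmatchable after {3} would be wrong.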

yoff · 2020-05-19T07:05:00Z

As Taus mentioned, we should check whether the dual problem exists for the caret (^) and, if so, whether fixing it belongs in this PR.

python/ql/test/query-tests/Expressions/Regex/test.py

As far as I can tell, all of these are improvements.

yoff · 2020-05-20T06:24:55Z

python/ql/test/library-tests/regex/Regex.expected

@@ -207,9 +207,9 @@
 | ax{3,} | sequence | 0 | 6 |
 | ax{3} | char | 0 | 1 |
 | ax{3} | char | 1 | 2 |
-| ax{3} | char | 2 | 3 |
 | ax{3} | char | 3 | 4 |
 | ax{3} | char | 4 | 5 |


I am not sure why 4-5 is in there, but both keeping it and removing 2-3 are consistent with how {n,m} is handled.
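
To make the offsets in the diff easier to read, here is a small sketch that lists every one-character slice of the regex. It assumes the last two columns of Regex.expected are start and end offsets into the regex string, which is consistent with the ax{3,} sequence row spanning 0 to 6:

```python
pattern = "ax{3}"

# (start, end, text) for every one-character slice of the pattern.
spans = [(i, i + 1, pattern[i:i + 1]) for i in range(len(pattern))]

for start, end, text in spans:
    print(start, end, repr(text))
```

Under that assumption, the removed 2-3 row corresponds to the opening brace, and the 4-5 row corresponds to the closing brace.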

tausbn · 2020-05-20T11:50:04Z

I was perusing the Python re module documentation just now (as you do), and I noticed the \A and \Z escapes, which match the start and end of the string respectively. Thus, something like

>>> import re
>>> re.match(r"(\Aab$|\Aba$)$\Z", "ab")
<_sre.SRE_Match object; span=(0, 2), match='ab'>

works as expected. I don't know that we're explicitly handling these cases here, so perhaps a few more tests would be in order? 🙂
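
A few cases that additional tests might cover, sketched in plain Python rather than as CodeQL test expectations (the variable names are illustrative only). The sketch shows where \A and \Z differ from ^ and $:

```python
import re

# \A and \Z anchor to the very start and end of the string.
whole = re.match(r"\Aab\Z", "ab")

# $ also matches just before a trailing newline; \Z does not.
dollar_before_newline = re.search(r"ab$", "ab\n")
z_before_newline = re.search(r"ab\Z", "ab\n")

# With re.MULTILINE, ^ and $ match at every line boundary,
# while \A and \Z still refer to the whole string.
text = "ab\ncd"
per_line = re.findall(r"^\w+$", text, re.MULTILINE)
first_only = re.findall(r"\A\w+$", text, re.MULTILINE)
last_only = re.findall(r"^\w+\Z", text, re.MULTILINE)

print(per_line, first_only, last_only)
```

Because \Z is stricter than $, a query that reasons about end anchors needs to recognise it as an anchor in its own right rather than as an ordinary escaped character.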

yoff · 2020-05-20T12:06:44Z

Looks like lastPart has an explicit case for \$ and would need one for \Z.
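
As a rough illustration of the extra case, here is a hypothetical Python helper. It is not the lastPart predicate from regex.qll, and the names trailing_end_anchor and _backslashes_before are invented for this sketch. It only inspects the very end of the pattern and ignores groups, character classes and flags:

```python
def _backslashes_before(pattern: str, index: int) -> int:
    """Count the consecutive backslashes immediately before `index`."""
    count = 0
    while index - count - 1 >= 0 and pattern[index - count - 1] == "\\":
        count += 1
    return count


def trailing_end_anchor(pattern: str):
    """Return '$' or '\\Z' if the pattern ends with that anchor, else None."""
    if pattern.endswith("\\Z"):
        slash = len(pattern) - 2
        # An even number of preceding backslashes means this backslash
        # really starts the \Z escape.
        if _backslashes_before(pattern, slash) % 2 == 0:
            return "\\Z"
    if pattern.endswith("$"):
        # An odd number of preceding backslashes means the $ is escaped,
        # i.e. a literal dollar sign rather than an anchor.
        if _backslashes_before(pattern, len(pattern) - 1) % 2 == 0:
            return "$"
    return None


print(trailing_end_anchor(r"ab$"), trailing_end_anchor(r"ab\Z"))
```

The point of the sketch is only that $ and \Z are two separate spellings of an end anchor, so code with a special case for one needs a matching case for the other.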

yoff · 2020-05-26T12:09:00Z

I made the test @tausbn suggested pass, though the code feels a bit ad hoc.

python/ql/src/semmle/python/regex.qll

RasmusWL

Besides autoformatting and maybe adding a test case, it all looks good to me 👍

python/ql/src/semmle/python/regex.qll

yoff · 2020-06-24T09:03:50Z

I added https://github.com/github/codeql-python-team/issues/114 to track the fact that, as @tausbn observed, we probably have the dual problem with carets.

to make CodeScan happy

RasmusWL

Still LGTM 🚀

Python: Regexp: Handle repetions {n} (with no ,)

b56545b

yoff added Python false-positive labels May 18, 2020

yoff requested a review from a team as a code owner May 18, 2020 12:50

RasmusWL reviewed May 19, 2020

python/ql/test/query-tests/Expressions/Regex/test.py (outdated, resolved)

Python: Update test expectations.

4d6ad32

As far as I can tell, all of these are improvements.

yoff commented May 20, 2020

yoff added 2 commits May 26, 2020 08:07

Python: re test with \Z

f1efdee

Python: re, handle \Z

6b168de

yoff commented Jun 10, 2020

python/ql/src/semmle/python/regex.qll (outdated, resolved)

yoff requested a review from RasmusWL June 10, 2020 17:50

Python: link to FP report in test file

b5703cd

RasmusWL requested changes Jun 11, 2020

python/ql/src/semmle/python/regex.qll (outdated, resolved)

python/ql/src/semmle/python/regex.qll (resolved)

yoff added 2 commits June 24, 2020 10:48

Python: format

226c295

Python: test zero iterations

6e9c48b

Merge branch 'master' of github.com:github/codeql into UnmatchableDollar

f6c59ab

to make CodeScan happy

yoff requested a review from RasmusWL June 24, 2020 10:17

RasmusWL approved these changes Jun 25, 2020

RasmusWL merged commit b36c23e into github:master Jun 25, 2020
