Skip to content

Commit a718ad6

Browse files
committed
Stop using Python's eval() for -m and -k
Previously, the expressions given to the `-m` and `-k` options were evaluated with `eval`. This causes a few issues: - Python keywords cannot be used. - Constants like numbers, None, True, False are not handled correctly. - Various syntax like numeric operators and `X if Y else Z` is supported unintentionally. - `eval()` is somewhat dangerous for arbitrary input. - Can fail in many ways so requires `except Exception`. The format we want to support is quite simple, so change to a custom parser. This fixes the issues above, and gives us full control of the format, so can be documented comprehensively and even be extended in the future if we wish.
1 parent be68496 commit a718ad6

File tree

6 files changed

+405
-75
lines changed

6 files changed

+405
-75
lines changed

changelog/7122.breaking.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
Expressions given to the ``-m`` and ``-k`` options are no longer evaluated using Python's ``eval()``.
2+
The format supports ``or``, ``and``, ``not``, parenthesis and general identifiers to match against.
3+
Python constants, keywords or other operators are no longer evaluated differently.

doc/en/example/markers.rst

Lines changed: 3 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -141,14 +141,14 @@ Or select multiple nodes:
141141
Using ``-k expr`` to select tests based on their name
142142
-------------------------------------------------------
143143

144-
.. versionadded: 2.0/2.3.4
144+
.. versionadded:: 2.0/2.3.4
145145

146146
You can use the ``-k`` command line option to specify an expression
147147
which implements a substring match on the test names instead of the
148148
exact match on markers that ``-m`` provides. This makes it easy to
149149
select tests based on their names:
150150

151-
.. versionadded: 5.4
151+
.. versionchanged:: 5.4
152152

153153
The expression matching is now case-insensitive.
154154

@@ -198,20 +198,8 @@ Or to select "http" and "quick" tests:
198198
199199
===================== 2 passed, 2 deselected in 0.12s ======================
200200
201-
.. note::
202-
203-
If you are using expressions such as ``"X and Y"`` then both ``X`` and ``Y``
204-
need to be simple non-keyword names. For example, ``"pass"`` or ``"from"``
205-
will result in SyntaxErrors because ``"-k"`` evaluates the expression using
206-
Python's `eval`_ function.
207-
208-
.. _`eval`: https://docs.python.org/3.6/library/functions.html#eval
209-
201+
You can use ``and``, ``or``, ``not`` and parentheses.
210202

211-
However, if the ``"-k"`` argument is a simple string, no such restrictions
212-
apply. Also ``"-k 'not STRING'"`` has no restrictions. You can also
213-
specify numbers like ``"-k 1.3"`` to match tests which are parametrized
214-
with the float ``"1.3"``.
215203

216204
Registering markers
217205
-------------------------------------

src/_pytest/mark/expression.py

Lines changed: 173 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,173 @@
1+
r"""
2+
Evaluate match expressions, as used by `-k` and `-m`.
3+
4+
The grammar is:
5+
6+
expression: expr? EOF
7+
expr: and_expr ('or' and_expr)*
8+
and_expr: not_expr ('and' not_expr)*
9+
not_expr: 'not' not_expr | '(' expr ')' | ident
10+
ident: (\w|:|\+|-|\.|\[|\])+
11+
12+
The semantics are:
13+
14+
- Empty expression evaluates to False.
15+
- ident evaluates to True of False according to a provided matcher function.
16+
- or/and/not evaluate according to the usual boolean semantics.
17+
"""
18+
import enum
19+
import re
20+
from typing import Callable
21+
from typing import Iterator
22+
from typing import Optional
23+
from typing import Sequence
24+
25+
import attr
26+
27+
from _pytest.compat import TYPE_CHECKING
28+
29+
if TYPE_CHECKING:
30+
from typing import NoReturn
31+
32+
33+
__all__ = [
34+
"evaluate",
35+
"ParseError",
36+
]
37+
38+
39+
class TokenType(enum.Enum):
40+
LPAREN = "left parenthesis"
41+
RPAREN = "right parenthesis"
42+
OR = "or"
43+
AND = "and"
44+
NOT = "not"
45+
IDENT = "identifier"
46+
EOF = "end of input"
47+
48+
49+
@attr.s(frozen=True, slots=True)
50+
class Token:
51+
type = attr.ib(type=TokenType)
52+
value = attr.ib(type=str)
53+
pos = attr.ib(type=int)
54+
55+
56+
class ParseError(Exception):
57+
"""The expression contains invalid syntax.
58+
59+
:param column: The column in the line where the error occurred (1-based).
60+
:param message: A description of the error.
61+
"""
62+
63+
def __init__(self, column: int, message: str) -> None:
64+
self.column = column
65+
self.message = message
66+
67+
def __str__(self) -> str:
68+
return "at column {}: {}".format(self.column, self.message)
69+
70+
71+
class Scanner:
72+
__slots__ = ("tokens", "current")
73+
74+
def __init__(self, input: str) -> None:
75+
self.tokens = self.lex(input)
76+
self.current = next(self.tokens)
77+
78+
def lex(self, input: str) -> Iterator[Token]:
79+
pos = 0
80+
while pos < len(input):
81+
if input[pos] in (" ", "\t"):
82+
pos += 1
83+
elif input[pos] == "(":
84+
yield Token(TokenType.LPAREN, "(", pos)
85+
pos += 1
86+
elif input[pos] == ")":
87+
yield Token(TokenType.RPAREN, ")", pos)
88+
pos += 1
89+
else:
90+
match = re.match(r"(:?\w|:|\+|-|\.|\[|\])+", input[pos:])
91+
if match:
92+
value = match.group(0)
93+
if value == "or":
94+
yield Token(TokenType.OR, value, pos)
95+
elif value == "and":
96+
yield Token(TokenType.AND, value, pos)
97+
elif value == "not":
98+
yield Token(TokenType.NOT, value, pos)
99+
else:
100+
yield Token(TokenType.IDENT, value, pos)
101+
pos += len(value)
102+
else:
103+
raise ParseError(
104+
pos + 1, 'unexpected character "{}"'.format(input[pos]),
105+
)
106+
yield Token(TokenType.EOF, "", pos)
107+
108+
def accept(self, type: TokenType, *, reject: bool = False) -> Optional[Token]:
109+
if self.current.type is type:
110+
token = self.current
111+
if token.type is not TokenType.EOF:
112+
self.current = next(self.tokens)
113+
return token
114+
if reject:
115+
self.reject((type,))
116+
return None
117+
118+
def reject(self, expected: Sequence[TokenType]) -> "NoReturn":
119+
raise ParseError(
120+
self.current.pos + 1,
121+
"expected {}; got {}".format(
122+
" OR ".join(type.value for type in expected), self.current.type.value,
123+
),
124+
)
125+
126+
127+
def expression(s: Scanner, matcher: Callable[[str], bool]) -> bool:
128+
if s.accept(TokenType.EOF):
129+
return False
130+
ret = expr(s, matcher)
131+
s.accept(TokenType.EOF, reject=True)
132+
return ret
133+
134+
135+
def expr(s: Scanner, matcher: Callable[[str], bool]) -> bool:
136+
ret = and_expr(s, matcher)
137+
while s.accept(TokenType.OR):
138+
rhs = and_expr(s, matcher)
139+
ret = ret or rhs
140+
return ret
141+
142+
143+
def and_expr(s: Scanner, matcher: Callable[[str], bool]) -> bool:
144+
ret = not_expr(s, matcher)
145+
while s.accept(TokenType.AND):
146+
rhs = not_expr(s, matcher)
147+
ret = ret and rhs
148+
return ret
149+
150+
151+
def not_expr(s: Scanner, matcher: Callable[[str], bool]) -> bool:
152+
if s.accept(TokenType.NOT):
153+
return not not_expr(s, matcher)
154+
if s.accept(TokenType.LPAREN):
155+
ret = expr(s, matcher)
156+
s.accept(TokenType.RPAREN, reject=True)
157+
return ret
158+
ident = s.accept(TokenType.IDENT)
159+
if ident:
160+
return matcher(ident.value)
161+
s.reject((TokenType.NOT, TokenType.LPAREN, TokenType.IDENT))
162+
163+
164+
def evaluate(input: str, matcher: Callable[[str], bool]) -> bool:
165+
"""Evaluate a match expression as used by -k and -m.
166+
167+
:param input: The input expression - one line.
168+
:param matcher: Given an identifier, should return whether it matches or not.
169+
Should be prepared to handle arbitrary strings as input.
170+
171+
Returns whether the entire expression matches or not.
172+
"""
173+
return expression(Scanner(input), matcher)

src/_pytest/mark/legacy.py

Lines changed: 25 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -2,44 +2,46 @@
22
this is a place where we put datastructures used by legacy apis
33
we hope to remove
44
"""
5-
import keyword
65
from typing import Set
76

87
import attr
98

109
from _pytest.compat import TYPE_CHECKING
1110
from _pytest.config import UsageError
11+
from _pytest.mark.expression import evaluate
12+
from _pytest.mark.expression import ParseError
1213

1314
if TYPE_CHECKING:
1415
from _pytest.nodes import Item # noqa: F401 (used in type string)
1516

1617

1718
@attr.s
18-
class MarkMapping:
19-
"""Provides a local mapping for markers where item access
20-
resolves to True if the marker is present. """
19+
class MarkMatcher:
20+
"""A matcher for markers which are present."""
2121

2222
own_mark_names = attr.ib()
2323

2424
@classmethod
25-
def from_item(cls, item):
25+
def from_item(cls, item) -> "MarkMatcher":
2626
mark_names = {mark.name for mark in item.iter_markers()}
2727
return cls(mark_names)
2828

29-
def __getitem__(self, name):
29+
def __call__(self, name: str) -> bool:
3030
return name in self.own_mark_names
3131

3232

3333
@attr.s
34-
class KeywordMapping:
35-
"""Provides a local mapping for keywords.
36-
Given a list of names, map any substring of one of these names to True.
34+
class KeywordMatcher:
35+
"""A matcher for keywords.
36+
37+
Given a list of names, matches any substring of one of these names. The
38+
string inclusion check is case-insensitive.
3739
"""
3840

3941
_names = attr.ib(type=Set[str])
4042

4143
@classmethod
42-
def from_item(cls, item: "Item") -> "KeywordMapping":
44+
def from_item(cls, item: "Item") -> "KeywordMatcher":
4345
mapped_names = set()
4446

4547
# Add the names of the current item and any parent items
@@ -62,12 +64,7 @@ def from_item(cls, item: "Item") -> "KeywordMapping":
6264

6365
return cls(mapped_names)
6466

65-
def __getitem__(self, subname: str) -> bool:
66-
"""Return whether subname is included within stored names.
67-
68-
The string inclusion check is case-insensitive.
69-
70-
"""
67+
def __call__(self, subname: str) -> bool:
7168
subname = subname.lower()
7269
names = (name.lower() for name in self._names)
7370

@@ -77,18 +74,17 @@ def __getitem__(self, subname: str) -> bool:
7774
return False
7875

7976

80-
python_keywords_allowed_list = ["or", "and", "not"]
81-
82-
83-
def matchmark(colitem, markexpr):
77+
def matchmark(colitem, markexpr: str) -> bool:
8478
"""Tries to match on any marker names, attached to the given colitem."""
8579
try:
86-
return eval(markexpr, {}, MarkMapping.from_item(colitem))
87-
except Exception:
88-
raise UsageError("Wrong expression passed to '-m': {}".format(markexpr))
80+
return evaluate(markexpr, MarkMatcher.from_item(colitem))
81+
except ParseError as e:
82+
raise UsageError(
83+
"Wrong expression passed to '-m': {}: {}".format(markexpr, e)
84+
) from None
8985

9086

91-
def matchkeyword(colitem, keywordexpr):
87+
def matchkeyword(colitem, keywordexpr: str) -> bool:
9288
"""Tries to match given keyword expression to given collector item.
9389
9490
Will match on the name of colitem, including the names of its parents.
@@ -97,20 +93,9 @@ def matchkeyword(colitem, keywordexpr):
9793
Additionally, matches on names in the 'extra_keyword_matches' set of
9894
any item, as well as names directly assigned to test functions.
9995
"""
100-
mapping = KeywordMapping.from_item(colitem)
101-
if " " not in keywordexpr:
102-
# special case to allow for simple "-k pass" and "-k 1.3"
103-
return mapping[keywordexpr]
104-
elif keywordexpr.startswith("not ") and " " not in keywordexpr[4:]:
105-
return not mapping[keywordexpr[4:]]
106-
for kwd in keywordexpr.split():
107-
if keyword.iskeyword(kwd) and kwd not in python_keywords_allowed_list:
108-
raise UsageError(
109-
"Python keyword '{}' not accepted in expressions passed to '-k'".format(
110-
kwd
111-
)
112-
)
11396
try:
114-
return eval(keywordexpr, {}, mapping)
115-
except Exception:
116-
raise UsageError("Wrong expression passed to '-k': {}".format(keywordexpr))
97+
return evaluate(keywordexpr, KeywordMatcher.from_item(colitem))
98+
except ParseError as e:
99+
raise UsageError(
100+
"Wrong expression passed to '-k': {}: {}".format(keywordexpr, e)
101+
) from None

0 commit comments

Comments
 (0)