Surprising Any in result type of re.Match.group() where None expected #12090
This is a duplicate of #10680, #9482, #11203, and many others (see also #9465 (comment), python/mypy#16441 (comment), #10526, etc.) |
So... a lot of existing code is doing something dumb, simply not handling the types right, and the decision was taken to make everything type unsafe rather than getting the types correct? That's an absolutely terrible decision. |
No, a lot of code is doing something that is clearly type-safe, and could be statically verified to be type-safe. Unfortunately, the rules that govern when this kind of pattern is actually type-safe are not currently (and possibly never will be) expressible solely using type annotations, and type annotations are the only thing we have control over here in typeshed. To avoid a truly vast number of false positives when type checkers analyse user code, we have chosen in this instance to go for more lenient annotations. |
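As a rough illustration of the kind of code this refers to, here is a minimal sketch. The function name parse_range and the pattern are hypothetical, not taken from the thread. It uses only mandatory groups, so once the match has succeeded each m.group(n) is a str at runtime. With a stricter "str or None" return annotation, a type checker would presumably flag both int(...) calls even though neither can receive None here.

```python
import re
from typing import Optional, Tuple


def parse_range(text: str) -> Optional[Tuple[int, int]]:
    """Parse 'LOW-HIGH' into a pair of ints, or return None if it does not match."""
    m = re.fullmatch(r"(\d+)-(\d+)", text)
    if m is None:
        return None
    # Both groups are mandatory parts of the pattern, so after a successful
    # match neither of them can be None.
    return int(m.group(1)), int(m.group(2))
```

The only None check the caller needs is the one on the match object itself.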
At the very least, it ought to be overloaded so that asking for a group by integer index says that it will always produce an
>>> m = re.search("([abc])d", "abcde")
>>> m
<re.Match object; span=(2, 4), match='cd'>
>>> m.groups()
('c',)
>>> m.group(1)
'c'
>>> m.group(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group
>>> m.group("foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group
A group might not match anything at all and become
>>> m = re.search("(([cde])f)?b", "ab")
>>> m
<re.Match object; span=(1, 2), match='b'>
>>> m.groups()
(None, None)
but at no point are you going to get, say, an integer or anything like that out of this. I understand that you'd like to be able to say "this particular case can't be a |
It doesn't sound like you've considered optional groups:
>>> import re
>>> m = re.search(r"([ab])?d", "abcde")
>>> m.group(1) is None
True |
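For reference, a small sketch of the same optional-group case as a script rather than a REPL session. It also uses the default parameter of Match.groups(), which substitutes a value for groups that did not take part in the match; the variable names are illustrative only.

```python
import re

m = re.search(r"([ab])?d", "abcde")
assert m is not None  # the overall pattern still matches, on "d"

# Group 1 is optional and did not take part in the match.
optional_group = m.group(1)

# groups(default=...) replaces non-participating groups with the default.
with_default = m.groups(default="")
```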
Next paragraph. 😉 |
Okay, then I'm not sure exactly what you're proposing. If you'd like to make a PR to gauge the possible impact of a change, feel free, but we're pretty unlikely to merge it if it has anything like the number of false positives that e.g. #11203 had :-) |
The problem is that if you say this, it forces everyone to check for
Ideally there would be a way to tell a type checker that:
This combination of 3 different requirements is not possible to achieve. If you pick any two of them, it is possible to achieve. We have decided to pick the first two. |
Sorry if I'm a bit irascible. I'm in the middle of trying to typecheck the YAML deserialization mechanism from Heck. |
It's okay. Types can be infuriating sometimes :-) |
I think it might help to show more concretely what we want the annotation to do. Here are our requirements:
def foo() -> ???:
    ...
# Requirement A: this should be an error
if foo().starswthi("bar"):
    ...
# Requirement B: this should succeed
if foo().startswith("bar"):
    ...
# Requirement C: this should succeed
if foo() is None:
    ...
If you annotate with If you annotate with If you annotate with If you annotate with |
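The three requirements can be made concrete with a small sketch. It assumes the candidate annotations under discussion are str, Optional[str], Any, and Union[str, Any]; that list is an assumption, and the foo_* names are hypothetical. The comments describe how a checker such as mypy is generally expected to treat each case. They are not verified checker output, and the details vary by checker and by flags.

```python
from typing import Any, Optional, Union


def foo_str() -> str:
    # A: typo is reported. B: accepted.
    # C: the "is None" branch looks impossible to a checker, and strict
    #    settings may report it as unreachable.
    return "barn"


def foo_optional() -> Optional[str]:
    # A: typo is reported. C: accepted.
    # B: rejected, because the value might be None.
    return None


def foo_any() -> Any:
    # B and C: accepted.
    # A: the typo is NOT reported, since anything goes on Any.
    return "barn"


def foo_str_or_any() -> Union[str, Any]:
    # A: typo is reported, because the str member has no such attribute.
    # B: accepted. C: accepted, because the Any member could be None.
    return None


# At runtime the typo from requirement A fails regardless of annotation.
typo_exists = hasattr(foo_str(), "starswthi")
```

Under these assumptions, only the last annotation satisfies A, B, and C together, at the cost of no longer forcing callers to handle None.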
Conceptually however, the property of whether group can be
Some user intervention will clearly be required; it's not reasonable for the type checker to analyse regular expressions from arbitrary sources. |
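One possible form of such user intervention, sketched under the assumption that the caller knows which groups are optional: a small helper (require_group is a hypothetical name, not a standard-library or typeshed function) that performs the None check once and returns a plain str.

```python
import re
from typing import Union


def require_group(m: "re.Match[str]", group: Union[int, str]) -> str:
    """Return the requested group, raising if it did not take part in the match."""
    value = m.group(group)
    if value is None:
        raise ValueError(f"group {group!r} did not participate in the match")
    return value
```

Callers that do expect an optional group to be missing would still call m.group() directly and test for None themselves.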
True, but then you're outside the reasonable reach of what a type checker should do. Only crazy people like the TypeScript developers try to go further! |
I've been trying to typecheck some code that uses re.match but getting some unexpected Anys in it that came from these typeshed definitions (especially the second and third ones):
Under what circumstance is producing an Any at this point a good plan? The AnyStr is fine, but surely the Any should be a None? After all, right now it's saying "Oh, we could spontaneously return a lock object from the concurrency module because we're feeling bored; no promises!" which is rather at variance with the documentation which says effectively "returns a substring if there is a match group, otherwise None".