Skip to content

mypy not coping well with a variable changing type #2589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rhettinger opened this issue Dec 17, 2016 · 6 comments
Closed

mypy not coping well with a variable changing type #2589

rhettinger opened this issue Dec 17, 2016 · 6 comments

Comments

@rhettinger
Copy link

Sometimes, it is necessary to accumulate data using a mutable structure and then recast the data into an immutable object suitable for use as a dictionary key. If the transformation switching data types while using the same variable name, mypy emits an inscrutable error message that doesn't hint at what the actual problem is.

The solution was to use another variable name for the transmuted data.

Recommendation: Let variables be rebound to different types as needed. Barring that, provide a better error message.

Simplified example:

from typing import List, DefaultDict, Tuple, Dict
from collections import defaultdict
from pprint import pprint

scores = [95, 86, 64, 92, 71, 68, 79, 92]

Decile = int
Score = int

deciles = defaultdict(list)                       # type: DefaultDict[Decile, List[Score]]
for score in scores:
    deciles[score // 10].append(score)

# turn deciles into a plain dict mapping to tuples
deciles = {decile : tuple(group) for decile, group in deciles.items()}  # type: Dict[Decile, Tuple[Score]]

pprint(deciles)

The error message is:

tmp.py:10: error: Argument 1 to "defaultdict" has incompatible type List[_T]; expected Callable[[], Tuple[int]]
tmp.py:12: error: "Tuple[int]" has no attribute "append"
tmp.py:15: error: Name 'deciles' already defined
tmp.py:15: error: Value expression in dictionary comprehension has incompatible type Tuple[int, ...]; expected type "Tuple[int]"

Note, the message for line 10 makes no sense until you see the later message for line 15. In my client's code base, these two we very far apart and made it difficult to figure out why line 10 gets an error message even though it is correct code.

The solution was to change the last lines to use new variable names:

# turn deciles into a plain dict mapping to tuples
newdeciles = {decile : tuple(group) for decile, group in deciles.items()}  # type: Dict[Decile, Tuple[Score]]

pprint(newdeciles)

It would be nice if the error message applied only to the redefinition line rather than the initial correct declaration. It would be even nicer if mypy supported patterns that allowed a tranformation in-place or re-use of a variable name for cases like converting a list of lists into a list of tuples. The latter example is not uncommon in my client's code base (because the inner lists need to be used as dict keys).

@rwbarton
Copy link
Contributor

Regardless of whether we allow changing the type of a variable, the errors reported on lines 10 and 12 are a bug. (And I agree it's especially confusing when the extraneous errors appear before the real issue.)

Changing the declared type of a variable can get tricky. What if your type-changing assignment to deciles appeared inside one branch of an if statement, or inside a loop?

There's also an argument that it's bad style to have a variable hold different types of value over its lifetime; though I tend to think that's less true if the change in type is marked by an explicit new type annotation.

@ambv
Copy link
Contributor

ambv commented Dec 19, 2016

@rwbarton, the errors on lines 10 and 12 are misleading but the bug is that we need a better message, not that they appear.

@rhettinger, the issue here is that when Mypy is parsing the source to assign types to names, it only ever holds a single slot for a name within a given scope. It reads from top to bottom so the last type declared wins. Type checking is performed later, thus the confusing error message.

Mutating the type of a name within a single scope is considered an error. As you admit, if changes to the type are very far apart, the reader of the code might get the wrong idea about a name's type when making changes or using an API later.

The simplest example of this is:

a = 1  # type: int
a = 'a'  # type: str
a = []  # type: List[int]

Mypy will currently output the following errors:

f:1: error: Incompatible types in assignment (expression has type "int", variable has type List[int])
f:2: error: Name 'a' already defined
f:2: error: Incompatible types in assignment (expression has type "str", variable has type List[int])
f:3: error: Name 'a' already defined

Note what's going on: it correctly warns that the type was redefined within a single scope. But it also confusingly warns that lines 1 and 2 have incorrect types, since at this point the recorded type for the name a is List[str] (last one wins). I think it would be a worthwhile improvement to Mypy to have better error messages in this case.

I have two suggestions:

  1. Either: consider using a separate name for the other type, making things more explicit and easier to audit by readers and reviewers (and automated tools, wink wink nudge nudge!).

  2. Or: admit that within the given scope, the type of the variable is "either mutable or immutable" and use a Union in the first declaration. So the example above becomes:

from typing import List, Union, cast

a = 1  # type: Union[int, str, List[int]]
a = cast(int, a)
...
a = 'a'
...
a = []

This doesn't produce any Mypy errors. Why bother with casting? Well, usually Mypy is smart enough to infer specializations of the types within the flow of the code but in the a = 1 we explicitly told it that it can be a Union so it believes us. The cast after the first assignment clears up the confusion, also for the reader.

Consider usage of the types:

from typing import List, Union, cast

a = 1  # type: Union[int, str, List[int]]
a = cast(int, a)
a.bit_length()  # only valid on ints, no error
...
a = 'a'
a.title()  # only valid on strings, no error
...
a = []
a.append('1')  # only valid on lists but wrong argument type, error!
a.title()  # only valid on strings, error!

As annotated, Mypy will produce the following output here:

f:11: error: Argument 1 to "append" of "list" has incompatible type "str"; expected "int"
f:12: error: List[int] has no attribute "title"

All of those are valid and helpful error messages! Note that if you comment out the cast(), Mypy will believe this is some magic dynamic code that really can be "either-or" for now, so it will also produce the following error:

f:5: error: Some element of union has no attribute "bit_length"

Does that clear things up?

@rwbarton
Copy link
Contributor

@rwbarton, the errors on lines 10 and 12 are misleading but the bug is that we need a better message, not that they appear.

I don't agree, because...

@rhettinger, the issue here is that when Mypy is parsing the source to assign types to names, it only ever holds a single slot for a name within a given scope. It reads from top to bottom so the last type declared wins. Type checking is performed later, thus the confusing error message.

... the subsequent type declarations are erroneous, so mypy ought to just ignore them.

It's very odd if lines 10 and 12 are okay, but then after adding a wrong line on line 15, line 12 is no longer okay. And there's no need for it; we should just report the error on line 15.

@ambv
Copy link
Contributor

ambv commented Dec 19, 2016

@rwbarton, so you're essentially suggesting that in cases where "Name already defined" errors are raised, Mypy should not record the new type and stick to the first type given? This is interesting, I'd need to look closer at the wider consequences of a change like this.

@ilevkivskyi
Copy link
Member

... the subsequent type declarations are erroneous, so mypy ought to just ignore them.

I agree with @rwbarton. If we prohibit the redefinition/re-declaration then we should keep the first declared type, not the last one.

@ambv
Copy link
Contributor

ambv commented Dec 19, 2016

Suggested fix for the spurious warnings is in PR #2591. My advice on how to solve the "Name already defined" errors (the two suggestions) still stands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants