Skip to content

Incremental bug involving aliases and/or nested modules or classes and/or import cycles #2133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gvanrossum opened this issue Sep 14, 2016 · 14 comments · Fixed by #2237
Closed

Comments

@gvanrossum
Copy link
Member

gvanrossum commented Sep 14, 2016

I'm encountering a complex bug in incremental. One of the ways it appears is spurious errors complaining about some types in the requests stubs, e.g.

/Users/guido/src/mypy/typeshed/third_party/2.7/requests/__init__.pyi:1: note: In module imported here,
(somewhere)/main.py:1: note: ... from here:
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi: note: At top level:
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi:59: error: Invalid type "requests.sessions.Request"
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi:61: error: Invalid type "requests.sessions.Request"
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi:68: error: Invalid type "requests.sessions.RequestsCookieJar"
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi: note: In function "request":
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi:75: error: Invalid type "requests.sessions.RequestsCookieJar"
/Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi:75: error: Invalid type "requests.sessions.Request"

There's no need to panic but I'd like to have an issue I can point to.

@gvanrossum gvanrossum added this to the 0.4.x milestone Sep 14, 2016
@gvanrossum
Copy link
Member Author

The issue appears elusive. I wonder if it's related to #2126 (the second bullet)? I may have run mypy with strict optional and that may have corrupted the cache.

@gvanrossum
Copy link
Member Author

(Issue #2126 looking less likely now that I've seen the exact same set of errors from two different repos.)

@gvanrossum
Copy link
Member Author

This is almost certainly related to import cycles; the error "Invalid type" almost exclusively stems from those.

@gvanrossum
Copy link
Member Author

I've got a repro!

bash-3.2$ my -2 -i -m requests.sessions
bash-3.2$ touch typeshed/third_party/2.7/requests/sessions.pyi 
bash-3.2$ my -2 -i -m requests.sessions
typeshed/third_party/2.7/requests/sessions.pyi:59: error: Invalid type "requests.sessions.Request"
typeshed/third_party/2.7/requests/sessions.pyi:61: error: Invalid type "requests.sessions.Request"
typeshed/third_party/2.7/requests/sessions.pyi:68: error: Invalid type "requests.sessions.RequestsCookieJar"
typeshed/third_party/2.7/requests/sessions.pyi: note: In function "request":
typeshed/third_party/2.7/requests/sessions.pyi:75: error: Invalid type "requests.sessions.RequestsCookieJar"
typeshed/third_party/2.7/requests/sessions.pyi:75: error: Invalid type "requests.sessions.Request"

When run with -v, the second run contains the log line

LOG:  Metadata abandoned for requests.sessions: file /Users/guido/src/mypy/typeshed/third_party/2.7/requests/sessions.pyi is modified

(which is correct, since that file was touched). This is a clue that something not quite right has ended up in the cache. I'll investigate...

@gvanrossum
Copy link
Member Author

gvanrossum commented Oct 7, 2016

Shorter repro, involving a package with just three minimal files:

==> requests/__init__.pyi <==
from . import sessions

==> requests/models.pyi <==
class Request:
    pass

==> requests/sessions.pyi <==
from . import models
Request = models.Request
a = ... # type: Request

Then run

rm -rf .mypy_cache
mypy -i -m requests.sessions
touch requests/sessions.pyi
mypy -i -m requests.sessions

This produces one error message on the second run:

requests/sessions.pyi:3: error: Invalid type "requests.sessions.Request"

@gvanrossum
Copy link
Member Author

Some more of the story:

  • With a cold cache, when SemanticAnalyzer.visit_import_from() processes the line "from . import models" in sessions.py, it finds self.modules['requests'] existing, and makes the models variable a MODULE_REF to requests.models.
  • But with a warm cache, at that point self.modules doesn't have the key 'requests' yet, and it skips this code, instead calling self.add_unknown_symbol(). This then cascades into the error message.
  • The reason appears that with a cold cache, requests/__init__.pyi gets parsed and modules['requests'] gets updated early (right after requests/sessions.pyi is parsed). But with a warm cache modules['requests'] gets updated too late, after requests.sessions is already fully processed.

gvanrossum pushed a commit that referenced this issue Oct 9, 2016
This introduces None entries in modules for those modules known to
exist in the graph.  This fixes the issue at the cost of making the
type of modules Dict[str, Optional[MypyFile]].

I have to think about whether I like this enough; there may be another
solution.  Also, this still needs tests.
@gvanrossum
Copy link
Member Author

I've got a possible fix at https://github.com/python/mypy/tree/fix2133. Not sure whether I like it that much. The solution inserts None entries into the modules dict just so that SemanticAnalyzer.visit_import_from() can tell whether a module is supposed to exist or not.

An alternative fix would be to add the parent package to the submodule's dependencies with a low priority, rather than the current approach of maintaining a separate list of ancestors which are taken into account sometimes but not other times. I haven't tried to code that up yet to see how it would go.

@gvanrossum
Copy link
Member Author

@Michael0x2a Any thoughts about this? It's taken me all day Friday to get this far...

@Michael0x2a
Copy link
Collaborator

@gvanrossum --ah, sorry. I meant to respond sooner, but got side-tracked by grading :(

I'll try and take a look tonight/tomorrow morning.

gvanrossum pushed a commit that referenced this issue Oct 9, 2016
This takes the case "from P import M" (where P is a package and M is a
submodule) and adds a dependency on P to the current module with a
new, super-low priority.

The effect on the example from #2133 is that the requests package gets
forced into an SCC with requests.sessions.  This then makes
SemanticAnalyzer.visit_import_from() take the intended path because
modules['requests'] is populated -- this happens because within an
SCC, if any member is stale, all are processed as stale, and this
causes requests to be parsed before the semantic analysis of
requests.sessions happens.

The cost is that packages are more likely to be treated as cycles.
Then again the cause of the problem was in a sense that the requests
package is one big cycle (the parent package imports many of its
submodules) but we had artificially excluded the dependency from the
children on the parent (see the old version of the long comment
updated by this commit).
@gvanrossum
Copy link
Member Author

The alt fix also misses tests; I am still meditating on which fix I like better.

Note that there would also be a cop-out way to fix this, by making some subtle changes to requests/sessions.pyi. But that would just keep the bug in incremental until the next time someone has a package with a similar structure.

@gvanrossum
Copy link
Member Author

The cost of increased circularity is real. The alt fix puts requests and 7 submodules in a single SCC (requests.cookies requests.utils requests.auth requests.models requests.api requests.adapters requests.sessions requests) where there were previously 8 independent singletons.

@gvanrossum
Copy link
Member Author

There's yet another fix I'd like to try. It's more similar to the first version, making subtle changes to SemanticAnalyzer.visit_import_from() in case a module is deemed to exist but hasn't been processed yet. But instead of pre-populating the modules dict with None values, it should use a separate data structure to indicate whether the module exists or not (e.g. derived from graph.keys()).

gvanrossum pushed a commit that referenced this issue Oct 9, 2016
This takes the case "from P import M" (where P is a package and M is a
submodule) and adds a dependency on P to the current module with a
new, super-low priority.

The effect on the example from #2133 is that the requests package gets
forced into an SCC with requests.sessions.  This then makes
SemanticAnalyzer.visit_import_from() take the intended path because
modules['requests'] is populated -- this happens because within an
SCC, if any member is stale, all are processed as stale, and this
causes requests to be parsed before the semantic analysis of
requests.sessions happens.

The cost is that packages are more likely to be treated as cycles.
Then again the cause of the problem was in a sense that the requests
package is one big cycle (the parent package imports many of its
submodules) but we had artificially excluded the dependency from the
children on the parent (see the old version of the long comment
updated by this commit).
gvanrossum pushed a commit that referenced this issue Oct 10, 2016
Fixes #2133.

This is the third fix for this issue.  I think I've finally nailed it.
The solution is not to bail out early when the parent module is
unavailable -- if all the imports are submodules of a package that's
acceptable (the bug was that in incremental mode it's possible that
the parent is fresh and has not been entered into the modules dict
yet, but the bail-out caused the import to be marked as missing).
@gvanrossum
Copy link
Member Author

I like this fix best! https://github.com/python/mypy/tree/fix2133alt2 (It still needs a test.)

@gvanrossum
Copy link
Member Author

Now with test. I think I'll just merge this.

gvanrossum added a commit that referenced this issue Oct 10, 2016
Fixes #2133.

The solution is not to bail out early when the parent module is
unavailable -- if all the imports are submodules of a package that's
acceptable (the bug was that in incremental mode it's possible that
the parent is fresh and has not been entered into the modules dict
yet, but the bail-out caused the import to be marked as missing).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants