
Document how plugins can support the new semantic analyzer #7135


Merged
merged 3 commits into from
Jul 3, 2019
27 changes: 27 additions & 0 deletions docs/source/extending_mypy.rst
@@ -235,3 +235,30 @@ insert some entries there) before the class body is analyzed.
module. It is called before semantic analysis. For example, this can
be used if a library has dependencies that are dynamically loaded
based on configuration information.

Supporting the new semantic analyzer
************************************

Support for the new semantic analyzer (enabled through
``--new-semantic-analyzer``) requires some changes to plugins. Here is
a short summary of the most important changes:

* The order of processing AST nodes is different. Code outside
functions is processed first, and functions and methods are
processed afterwards.

* Each AST node can be processed multiple times to resolve forward
references. The same plugin hook may be called multiple times, so
hooks need to be idempotent.

* The ``anal_type()`` API method returns ``None`` if some part of
the type is not available yet, for example because of forward references.

* When looking up symbols, you may encounter *placeholder nodes* that
are used for names that haven't been fully processed yet. You'll
generally want to request another semantic analysis iteration by
*deferring* in that case.

See the docstring at the top of
`mypy/plugin.py <https://github.com/python/mypy/blob/master/mypy/plugin.py>`_
for more details.
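
To illustrate the idempotency requirement from the list above, here is a minimal, self-contained sketch. It does not import mypy: `FakeClassInfo` and `add_generated_method` are hypothetical stand-ins for a class symbol table and a plugin hook body, chosen only to show that running the same hook a second time leaves the result unchanged.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeClassInfo:
    """Hypothetical stand-in for a class symbol table (not a mypy class)."""

    names: Dict[str, str] = field(default_factory=dict)


def add_generated_method(info: FakeClassInfo, name: str, signature: str) -> None:
    """Hook body that is safe to run on every semantic analysis iteration.

    It overwrites the entry for `name` instead of appending to a list or
    raising on duplicates, so repeated calls converge to the same state.
    """
    info.names[name] = signature


info = FakeClassInfo()
# Simulate the new semantic analyzer calling the same hook twice.
add_generated_method(info, "__lt__", "def (self, other: C) -> bool")
first_pass = dict(info.names)
add_generated_method(info, "__lt__", "def (self, other: C) -> bool")
assert info.names == first_pass  # The second call changed nothing.
```

A hook that appended to a list or raised on an existing name would behave differently on the second pass, which is the failure mode the idempotency rule guards against.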
141 changes: 112 additions & 29 deletions mypy/plugin.py
@@ -1,40 +1,123 @@
"""Plugin system for extending mypy.

At large scale the plugin system works as follows:
* Plugins are collected from the corresponding config option
(either a paths to Python files, or installed Python modules)
and imported using importlib

* Plugins are collected from the corresponding mypy config file option
(either via paths to Python files, or installed Python modules)
and imported using importlib.

* Every module should get an entry point function (called 'plugin' by default,
but may be overridden in the config file), that should accept a single string
argument that is a full mypy version (includes git commit hash for dev versions)
and return a subclass of mypy.plugins.Plugin
but may be overridden in the config file) that should accept a single string
argument that is a full mypy version (includes git commit hash for dev
versions) and return a subclass of mypy.plugin.Plugin.

* All plugin class constructors should match the signature of mypy.plugin.Plugin
(i.e. should accept an mypy.options.Options object), and *must* call super().__init__
* At several steps during semantic analysis and type checking mypy calls special `get_xxx`
methods on user plugins with a single string argument that is a full name of a relevant
node (see mypy.plugin.Plugin method docstrings for details)
* The plugins are called in the order they are passed in the config option. Every plugin must
decide whether to act on a given full name. The first plugin that returns non-None object
will be used
(i.e. should accept a mypy.options.Options object), and *must* call
super().__init__().

* At several steps during semantic analysis and type checking mypy calls
special `get_xxx` methods on user plugins with a single string argument that
is a fully qualified name (full name) of a relevant definition
(see mypy.plugin.Plugin method docstrings for details).

* The plugins are called in the order they are passed in the config option.
Every plugin must decide whether to act on a given full name. The first
plugin that returns a non-None object will be used.

* The above decision should be made using the limited common API specified by
mypy.plugin.CommonPluginApi
* The callback returned by the plugin will be called with a larger context that includes
relevant current state (e.g. a default return type, or a default attribute type) and
a wider relevant API provider (e.g. SemanticAnalyzerPluginInterface or
CheckerPluginInterface)
* The result of this is used for further processing. See various `XxxContext` named tuples
for details about which information is given to each hook.
mypy.plugin.CommonPluginApi.

* The callback returned by the plugin will be called with a larger context that
includes relevant current state (e.g. a default return type, or a default
attribute type) and a wider relevant API provider (e.g.
SemanticAnalyzerPluginInterface or CheckerPluginInterface).

* The result of this is used for further processing. See various `XxxContext`
named tuples for details about which information is given to each hook.

Plugin developers should ensure that their plugins work well in incremental and
daemon modes. In particular, plugins should not hold global state, and should always call
add_plugin_dependency() in plugin hooks called during semantic analysis, see the method
docstring for more details.

There is no dedicated cache storage for plugins, but plugins can store per-TypeInfo data
in a special .metadata attribute that is serialized to cache between incremental runs.
To avoid collisions between plugins they are encouraged to store their state
under a dedicated key coinciding with plugin name in the metadata dictionary.
Every value stored there must be JSON-serializable.
daemon modes, and with both the old and new semantic analyzers (the old semantic
analyzer will be removed soon). In particular, plugins should not hold global
state, and should always call add_plugin_dependency() in plugin hooks called
during semantic analysis. See the method docstring for more details.

There is no dedicated cache storage for plugins, but plugins can store
per-TypeInfo data in a special .metadata attribute that is serialized to the
mypy caches between incremental runs. To avoid collisions between plugins, they
are encouraged to store their state under a dedicated key coinciding with
the plugin name in the metadata dictionary. Every value stored there must be
JSON-serializable.

## New semantic analyzer

The new semantic analyzer (enabled through the --new-semantic-analyzer flag)
changes how plugins are expected to work in several notable ways:

1. The order of processing AST nodes in modules is different. The old semantic
analyzer processes modules in textual order, one module at a time. The new
semantic analyzer first processes the module top levels, including bodies of
any top-level classes and classes nested within classes. ("Top-level" here
means "not nested within a function/method".) Functions and methods are
processed only after module top levels have been finished. If there is an
import cycle, all module top levels in the cycle are processed before
processing any functions or methods. Each unit of processing (a module top
level or a function/method) is called a *target*.

This also means that function signatures in the same module have not been
analyzed yet when analyzing the module top level. If you need access to
a function signature, you'll need to explicitly analyze the signature first
using `anal_type()`.

2. Each target can be processed multiple times. This may happen if some forward
references are not ready yet, for example. This means that semantic analyzer
related plugin hooks can be called multiple times for the same full name.
These plugin methods must thus be idempotent.

3. The `anal_type` API function returns None if some part of the type is not
available yet. If this happens, the current target being analyzed will be
*deferred*, which means that it will be processed again soon, in the hope
that additional dependencies will be available. This may happen if there are
forward references to types or inter-module references to types within an
import cycle.

Note that if there is a circular definition, mypy may decide to stop
processing to avoid an infinite number of iterations. When this happens,
`anal_type` will generate an error and return an `AnyType` type object
during the final iteration (instead of None).

4. There is a new API method `defer()`. This can be used to explicitly request
the current target to be reprocessed one more time. You don't need
to call this if `anal_type` returns None, however.

5. There is a new API property `final_iteration`, which is true once mypy
has detected no progress during the previous iteration or if the maximum
semantic analysis iteration count has been reached. You must never
defer during the final iteration, as it will cause a crash.

6. The `node` attribute of SymbolTableNode objects may contain a reference to
a PlaceholderNode object. This object means that this definition has not
been fully processed yet. If you encounter a PlaceholderNode, you should
defer unless it's the final iteration. If it's the final iteration, you
should generate an error message. It usually means that there's a cyclic
definition that cannot be resolved by mypy. PlaceholderNodes can only refer
to references inside an import cycle. If you are looking up things from
another module, such as the builtins, that is outside the current module or
import cycle, you can safely assume that you won't receive a placeholder.

When testing your plugin with the new semantic analyzer, you should have a test
case that forces a module top level to be processed multiple times. The easiest
way to do this is to include a forward reference to a class in a top-level
annotation. Example:

    c: C  # Forward reference causes second analysis pass
    class C: pass

Note that a forward reference in a function signature won't trigger another
pass, since all functions are processed only after the top level has been fully
analyzed.

You can use `api.options.new_semantic_analyzer` to check whether the new
semantic analyzer is enabled.
Review comment (from a project member): Should this added info also be added to docs/source/extending_mypy.rst?

"""
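
The placeholder handling described in points 4 to 6 of the docstring above can be sketched as follows. This is a self-contained simulation, not mypy's implementation: `PlaceholderNode`, `FakeSymbol`, and `FakeApi` are hypothetical stand-ins defined here so the snippet runs without mypy, and `resolve` is an illustrative helper name. Only `defer()`, `final_iteration`, and `node` mirror names from the docstring; the `fail()` stand-in is simplified, and the real plugin interface is expected to take additional arguments.

```python
from typing import List, Optional


class PlaceholderNode:
    """Hypothetical stand-in for mypy's PlaceholderNode."""


class FakeSymbol:
    """Hypothetical stand-in for a SymbolTableNode; only `node` is modeled."""

    def __init__(self, node: object) -> None:
        self.node = node


class FakeApi:
    """Hypothetical stand-in exposing only the names this sketch relies on."""

    def __init__(self, final_iteration: bool) -> None:
        self.final_iteration = final_iteration
        self.deferred = False
        self.errors: List[str] = []

    def defer(self) -> None:
        self.deferred = True

    def fail(self, msg: str) -> None:
        self.errors.append(msg)


def resolve(api: FakeApi, sym: FakeSymbol, name: str) -> Optional[object]:
    """Return the symbol's node, or None if it is not ready or unresolvable."""
    if isinstance(sym.node, PlaceholderNode):
        if api.final_iteration:
            # Deferring now would crash, so report the problem instead.
            api.fail(f'Cannot resolve "{name}" (possible cyclic definition)')
        else:
            # Ask for the current target to be processed again.
            api.defer()
        return None
    return sym.node


early = FakeApi(final_iteration=False)
assert resolve(early, FakeSymbol(PlaceholderNode()), "mod.C") is None
assert early.deferred and not early.errors
```

Returning None in both placeholder branches keeps the caller's logic simple: it bails out of the hook either way, and the difference is only whether another iteration was requested or an error was recorded.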

import types
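
The per-TypeInfo metadata convention described in the module docstring (one dedicated key per plugin, JSON-serializable values only) can be sketched as follows. This is a minimal illustration under stated assumptions: `FakeTypeInfo`, `PLUGIN_KEY`, and `store_state` are hypothetical names defined here, and only the `metadata` dictionary mirrors the attribute the docstring mentions.

```python
import json
from typing import Any, Dict

PLUGIN_KEY = "my_plugin"  # Hypothetical plugin name used as the metadata key.


class FakeTypeInfo:
    """Hypothetical stand-in for a TypeInfo; only `metadata` is modeled."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Any] = {}


def store_state(info: FakeTypeInfo, state: Dict[str, Any]) -> None:
    """Store plugin state under the plugin's own key.

    json.dumps raises TypeError for values that cannot be serialized,
    so a bad value is caught here rather than when the cache is written.
    """
    json.dumps(state)
    info.metadata[PLUGIN_KEY] = state


info = FakeTypeInfo()
store_state(info, {"frozen": True, "fields": ["x", "y"]})
# The stored value survives a JSON round trip unchanged.
assert json.loads(json.dumps(info.metadata)) == info.metadata
```

Validating at store time is a design choice for this sketch; it makes a non-serializable value (such as a set) fail close to the code that produced it.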