-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Add a new, lower priority for imports inside "if MYPY" #2167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
fc2f577
8bab47b
a078583
9a3308d
bb26391
dc1f8c2
fa707f6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,7 +24,7 @@ | |
from typing import (AbstractSet, Dict, Iterable, Iterator, List, | ||
NamedTuple, Optional, Set, Tuple, Union) | ||
|
||
from mypy.nodes import (MypyFile, Import, ImportFrom, ImportAll) | ||
from mypy.nodes import (MypyFile, Node, ImportBase, Import, ImportFrom, ImportAll) | ||
from mypy.semanal import FirstPass, SemanticAnalyzer, ThirdPass | ||
from mypy.checker import TypeChecker | ||
from mypy.indirection import TypeIndirectionVisitor | ||
|
@@ -306,10 +306,23 @@ def default_lib_path(data_dir: str, pyversion: Tuple[int, int]) -> List[str]: | |
PRI_HIGH = 5 # top-level "from X import blah" | ||
PRI_MED = 10 # top-level "import X" | ||
PRI_LOW = 20 # either form inside a function | ||
PRI_MYPY = 25 # inside "if MYPY" or "if TYPE_CHECKING" | ||
PRI_INDIRECT = 30 # an indirect dependency | ||
PRI_ALL = 99 # include all priorities | ||
|
||
|
||
def import_priority(imp: ImportBase, toplevel_priority: int) -> int: | ||
"""Compute import priority from an import node.""" | ||
if not imp.is_top_level: | ||
# Inside a function | ||
return PRI_LOW | ||
if imp.is_mypy_only: | ||
# Inside "if MYPY" or "if typing.TYPE_CHECKING" | ||
return max(PRI_MYPY, toplevel_priority) | ||
# A regular import; priority determined by argument. | ||
return toplevel_priority | ||
|
||
|
||
# TODO: Get rid of all_types. It's not used except for one log message. | ||
# Maybe we could instead publish a map from module ID to its type_map. | ||
class BuildManager: | ||
|
@@ -395,20 +408,21 @@ def correct_rel_imp(imp: Union[ImportFrom, ImportAll]) -> str: | |
for imp in file.imports: | ||
if not imp.is_unreachable: | ||
if isinstance(imp, Import): | ||
pri = PRI_MED if imp.is_top_level else PRI_LOW | ||
pri = import_priority(imp, PRI_MED) | ||
ancestor_pri = import_priority(imp, PRI_LOW) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand the story of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's only used once, for the ancestors. The reason is that if you have "import x.y" you depend on both x.y and on x, but the priority for x should be less than for x.y. There are many other ways we could potentially improve the heuristics here but they're still just heuristics, so I don't really care. The change here is really just that if we decide an import is inside |
||
for id, _ in imp.ids: | ||
ancestor_parts = id.split(".")[:-1] | ||
ancestors = [] | ||
for part in ancestor_parts: | ||
ancestors.append(part) | ||
res.append((PRI_LOW, ".".join(ancestors), imp.line)) | ||
res.append((ancestor_pri, ".".join(ancestors), imp.line)) | ||
res.append((pri, id, imp.line)) | ||
elif isinstance(imp, ImportFrom): | ||
cur_id = correct_rel_imp(imp) | ||
pos = len(res) | ||
all_are_submodules = True | ||
# Also add any imported names that are submodules. | ||
pri = PRI_MED if imp.is_top_level else PRI_LOW | ||
pri = import_priority(imp, PRI_MED) | ||
for name, __ in imp.names: | ||
sub_id = cur_id + '.' + name | ||
if self.is_module(sub_id): | ||
|
@@ -421,10 +435,10 @@ def correct_rel_imp(imp: Union[ImportFrom, ImportAll]) -> str: | |
# cur_id is also a dependency, and we should | ||
# insert it *before* any submodules. | ||
if not all_are_submodules: | ||
pri = PRI_HIGH if imp.is_top_level else PRI_LOW | ||
pri = import_priority(imp, PRI_HIGH) | ||
res.insert(pos, ((pri, cur_id, imp.line))) | ||
elif isinstance(imp, ImportAll): | ||
pri = PRI_HIGH if imp.is_top_level else PRI_LOW | ||
pri = import_priority(imp, PRI_HIGH) | ||
res.append((pri, correct_rel_imp(imp), imp.line)) | ||
|
||
return res | ||
|
@@ -1703,8 +1717,8 @@ def order_ascc(graph: Graph, ascc: AbstractSet[str], pri_max: int = PRI_ALL) -> | |
each SCC thus found. The recursion is bounded because at each | ||
recursion the spread in priorities is (at least) one less. | ||
|
||
In practice there are only a few priority levels (currently | ||
N=3) and in the worst case we just carry out the same algorithm | ||
In practice there are only a few priority levels (less than a | ||
dozen) and in the worst case we just carry out the same algorithm | ||
for finding SCCs N times. Thus the complexity is no worse than | ||
the complexity of the original SCC-finding algorithm -- see | ||
strongly_connected_components() below for a reference. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -84,10 +84,20 @@ | |
T = TypeVar('T') | ||
|
||
|
||
# Inferred value of an expression. | ||
ALWAYS_TRUE = 0 | ||
ALWAYS_FALSE = 1 | ||
TRUTH_VALUE_UNKNOWN = 2 | ||
# Inferred truth value of an expression. | ||
ALWAYS_TRUE = 1 | ||
MYPY_TRUE = 2 # True in mypy, False at runtime | ||
ALWAYS_FALSE = 3 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Minor comments: uniform prefix will make it slightly easier to search and possibly replace, as will happen eventually when it will become an Enum. Not that TRUTH_VALUE_UNKNOWN is a hard name to search :) How about giving this type a dedicated alias until then? making it under the namespace There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, but that would just cause more code churn and the benefit would be minuscule. |
||
MYPY_FALSE = 4 # False in mypy, True at runtime | ||
TRUTH_VALUE_UNKNOWN = 5 | ||
|
||
inverted_truth_mapping = { | ||
ALWAYS_TRUE: ALWAYS_FALSE, | ||
ALWAYS_FALSE: ALWAYS_TRUE, | ||
TRUTH_VALUE_UNKNOWN: TRUTH_VALUE_UNKNOWN, | ||
MYPY_TRUE: MYPY_FALSE, | ||
MYPY_FALSE: MYPY_TRUE, | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you consider this numbering more readable than the range [-2, 2], with the obvious negation? An explicit mapping can still be used just to be sure. TRUTH_UNKNOWN should be zero this way or the other, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The numbers are entirely uninteresting. I intentionally changed them around a bit to show that this is so. |
||
|
||
# Map from obsolete name to the current spelling. | ||
obsolete_name_mapping = { | ||
|
@@ -3082,12 +3092,16 @@ def infer_reachability_of_if_statement(s: IfStmt, | |
platform: str) -> None: | ||
for i in range(len(s.expr)): | ||
result = infer_if_condition_value(s.expr[i], pyversion, platform) | ||
if result == ALWAYS_FALSE: | ||
# The condition is always false, so we skip the if/elif body. | ||
if result in (ALWAYS_FALSE, MYPY_FALSE): | ||
# The condition is considered always false, so we skip the if/elif body. | ||
mark_block_unreachable(s.body[i]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to make sure, the intention is that In this case if TYPE_CHECKING:
import expensive_mod
else:
# I don't care about runtime annotations
expensive_mod = CleverPseudoModule('SomeClass')
def a_func(arg: expensive_mod.SomeClass) -> None:
... The else branch will not be checked at all. Again, it's fine, but I think it should be documented. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Even though PEP 484 only gives a positive example, it clearly states that TYPE_CHECKING
I don't see any ambiguity there. |
||
elif result == ALWAYS_TRUE: | ||
# This condition is always true, so all of the remaining | ||
# elif/else bodies will never be executed. | ||
elif result in (ALWAYS_TRUE, MYPY_TRUE): | ||
# This condition is considered always true, so all of the remaining | ||
# elif/else bodies should not be checked. | ||
if result == MYPY_TRUE: | ||
# This condition is false at runtime; this will affect | ||
# import priorities. | ||
mark_block_mypy_only(s.body[i]) | ||
for body in s.body[i + 1:]: | ||
mark_block_unreachable(body) | ||
if s.else_body: | ||
|
@@ -3099,7 +3113,8 @@ def infer_if_condition_value(expr: Expression, pyversion: Tuple[int, int], platf | |
"""Infer whether if condition is always true/false. | ||
|
||
Return ALWAYS_TRUE if always true, ALWAYS_FALSE if always false, | ||
and TRUTH_VALUE_UNKNOWN otherwise. | ||
MYPY_TRUE if true under mypy and false at runtime, MYPY_FALSE if | ||
false under mypy and true at runtime, else TRUTH_VALUE_UNKNOWN. | ||
""" | ||
name = '' | ||
negated = False | ||
|
@@ -3123,12 +3138,9 @@ def infer_if_condition_value(expr: Expression, pyversion: Tuple[int, int], platf | |
elif name == 'PY3': | ||
result = ALWAYS_TRUE if pyversion[0] == 3 else ALWAYS_FALSE | ||
elif name == 'MYPY' or name == 'TYPE_CHECKING': | ||
result = ALWAYS_TRUE | ||
result = MYPY_TRUE | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I prefer There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, we don't really care about other type checkers in this codebase, do we? :-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No. My concern was greppability, but it is a minor one (it's only the connection between MYPY_TRUE and TYPE_CHECKING that is not immediate; each is perfectly greppable on its own). |
||
if negated: | ||
if result == ALWAYS_TRUE: | ||
result = ALWAYS_FALSE | ||
elif result == ALWAYS_FALSE: | ||
result = ALWAYS_TRUE | ||
result = inverted_truth_mapping[result] | ||
return result | ||
|
||
|
||
|
@@ -3303,6 +3315,23 @@ def visit_import_all(self, node: ImportAll) -> None: | |
node.is_unreachable = True | ||
|
||
|
||
def mark_block_mypy_only(block: Block) -> None: | ||
block.accept(MarkImportsMypyOnlyVisitor()) | ||
|
||
|
||
class MarkImportsMypyOnlyVisitor(TraverserVisitor): | ||
"""Visitor that sets is_mypy_only (which affects priority).""" | ||
|
||
def visit_import(self, node: Import) -> None: | ||
node.is_mypy_only = True | ||
|
||
def visit_import_from(self, node: ImportFrom) -> None: | ||
node.is_mypy_only = True | ||
|
||
def visit_import_all(self, node: ImportAll) -> None: | ||
node.is_mypy_only = True | ||
|
||
|
||
def is_identity_signature(sig: Type) -> bool: | ||
"""Is type a callable of form T -> T (where T is a type variable)?""" | ||
if isinstance(sig, CallableType) and sig.arg_kinds == [ARG_POS]: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (Completely off topic but such comparisons assumes that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, we have a bug open about that already. |
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1313,3 +1313,27 @@ pass | |
[file b] | ||
pass | ||
[out] | ||
|
||
[case testTypeCheckPrio] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this test case fail without the changes in this PR? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes |
||
# cmd: mypy -m part1 part2 part3 part4 | ||
|
||
[file part1.py] | ||
from part3 import Thing | ||
class FirstThing: pass | ||
|
||
[file part2.py] | ||
from part4 import part4_thing as Thing | ||
|
||
[file part3.py] | ||
from part2 import Thing | ||
reveal_type(Thing) | ||
|
||
[file part4.py] | ||
from typing import TYPE_CHECKING | ||
if TYPE_CHECKING: | ||
from part1 import FirstThing | ||
def part4_thing(a: int) -> str: pass | ||
|
||
[builtins fixtures/bool.pyi] | ||
[out] | ||
tmp/part3.py:2: error: Revealed type is 'def (a: builtins.int) -> builtins.str' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only a single additional priority? Looks like the concerns are pretty independent. I would think a more detailed set of priorities, e.g. lexicographical ordering of the form
might help the algorithm breaking cycles less arbitrarily, Alternatively,
perhaps even more so (the actual tuple is of course unimportant; I only talk about the implied ordering).
I don't know if all this will have any effect on any real codebase, but the current system feels somewhat arbitrary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would still be arbitrary. There are ideas for something better in the tracker, but I'd like to see how things work out with this small improvement first.