Skip to content

Show code snippets on demand #7440

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Sep 4, 2019
Merged

Conversation

ilevkivskyi
Copy link
Member

This essentially fixes #7411

Couple ideas I think may make the much larger amount of output more readable are:

  • Use dim color for the source code snippet
  • Wrap long error messages so that they are not messed up with file names etc.

An example output (dark terminal on Linux):

Screenshot from 2019-08-31 18:26:49

@ilevkivskyi
Copy link
Member Author

I just discovered that this doesn't work well with blocking errors and/or daemon. I have a fix but it requires mypy.errors depend on mypy.fscache. Which is probably not so bad.

@ilevkivskyi ilevkivskyi requested a review from JukkaL September 2, 2019 11:20
@JukkaL
Copy link
Collaborator

JukkaL commented Sep 2, 2019

Some general comments, most of which are copied from #7411:

I think that we should fix at least the most obvious problems with column numbers before documenting this as a feature (it's okay to have it as a hidden option though). I have a half-finished PR that fixes column numbers of arguments and may be able to finish that up soon. Also, assignments should probably point to the rvalue.

Other comments:

  • I don't like the new location for the error code. I'd keep that in the original location after the message.
  • It might look better if we'd only underline the target token, i.e. the length of the underlining would vary. Maybe the underlining should cover a simple member expression (say, underline foo.bar in foo.bar.zar).
  • If the column number is missing, we should perhaps point to the first non-space token on the line. There should be a test for this.
  • The indentation when splitting error message may be too much, since the path can be very long. Maybe use a fixed level of indent (8 spaces?).

@JukkaL
Copy link
Collaborator

JukkaL commented Sep 2, 2019

I have a fix but it requires mypy.errors depend on mypy.fscache.

Interesting, why does is need fscache?

@ilevkivskyi
Copy link
Member Author

@JukkaL

I don't like the new location for the error code. I'd keep that in the original location after the message.

I actually like the new position more, since we are already taking much more vertical space, why not save a bit of horizontal space? Otherwise the line with wiggly arrow will have few information on it.

It might look better if we'd only underline the target token, i.e. the length of the underlining would vary.

It is hard to define what is "token". It seems to me you don't realize how hard would be implementing this to a reasonable degree of reliability and how misleading the positions can be if we will do this in ad-hoc manner. Like we may end up underlining the first item of a tuple while the problem is actually in the type of the second one. I would rather focus on making the start column better first.

If the column number is missing, we should perhaps point to the first non-space token on the line. There should be a test for this.

OK.

The indentation when splitting error message may be too much, since the path can be very long. Maybe use a fixed level of indent (8 spaces?).

OK.

Interesting, why does is need fscache?

I want to pass the cache as an argument to the Errors constructor (literally once), rather that "threading" the error line content to the every place it is needed. Therefore I need to annotate the constructor argument. I however just realized I can use a tiny protocol.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a more detailed review of the code (my previous comments were about the formatting of the output). I'm excited that this will be improve usability by making it easier to figure out why mypy is complaining about something.

mypy/errors.py Outdated
decode_python_encoding, DecodeError, trim_source_line, DEFAULT_SOURCE_OFFSET,
WIGGLY_LINE, DEFAULT_COLUMNS, MINIMUM_WIDTH
)
from mypy.fscache import FileSystemCache
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice if there was no import from mypy.fscache in this file, and instead we'd accept a callback that reads file contents, for example (dependency injection). We could probably also get rid of the decode_python_encoding and DecodeError imports.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I was thinking about a protocol, but callback is even better.

mypy/errors.py Outdated
self.show_column_numbers = show_column_numbers
self.show_error_codes = show_error_codes
def __init__(self, fscache: Optional[FileSystemCache] = None,
options: Optional[Options] = None) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a big dependency, which we previously avoided by providing the required information explicitly. It would arguably be cleaner if you'd just add the required options as arguments here instead of passing the whole Options object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will split this back.

mypy/errors.py Outdated
source_lines = None
if self.options and self.options.show_source_code:
if self.fscache:
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This try statement could be provided as a callback to avoid a depending on fscache.

@@ -20,6 +21,15 @@
ENCODING_RE = \
re.compile(br'([ \t\v]*#.*(\r\n?|\n))??[ \t\v]*#.*coding[:=][ \t]*([-\w.]+)') # type: Final

PLAIN_ANSI_DIM = '\x1b[2m' # type: Final
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this portable? Could we get this from curses or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it in couple terminals and it works (because it is ANSI standard). The problem is that although it is a basic ANSI "feature", terminfo files for most default terminals don't have dim termcap entry, so curses doesn't report it. I was thinking about choosing a grey color that would look good on both white and black background, but it is actually not easy, and again most default terminals are 8-color, not 256-color, so we can't get the color code from curses.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sounds reasonable. Maybe add a comment about this?

Since this is not on by default, we can iterate on this afterwards even if some things don't work in every possible environment.

@@ -110,6 +119,24 @@ def decode_python_encoding(source: bytes, pyversion: Tuple[int, int]) -> str:
return source_text


def trim_source_line(line: str, max_len: int, col: int, min_width: int) -> Tuple[str, int]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is a great candidate for regular unit tests.

mypy/util.py Outdated
return start + self.colors[color] + text + self.NORMAL

def colorize(self, error: str) -> str:
"""Colorize an output line by highlighting the status and error code."""
if ': error:' in error:
loc, msg = error.split('error:', maxsplit=1)
if not self.show_error_codes:
if self.show_source_code:
# Improve readability by wrapping lines when showing source code.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's can be surprising that showing source code also affects how long messages are wrapped. Maybe the option should be named --pretty or something, as it controls a few things.

mypy/util.py Outdated
@@ -401,10 +492,18 @@ def colorize(self, error: str) -> str:
elif ': note:' in error:
loc, msg = error.split('note:', maxsplit=1)
return loc + self.style('note:', 'blue') + self.underline_link(msg)
elif self.show_source_code and error.startswith(' ' * DEFAULT_SOURCE_OFFSET):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detecting source code highlights through an indent prefix can be surprising. I wonder if there's a more general way to do this. If there's no alternative simple approach, this should be documented carefully somewhere.

mypy/util.py Outdated
return res


def get_term_columns() -> int:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about renaming this to get_terminal_width?

mypy/errors.py Outdated
if add_snippets:
if severity == 'error' and source_lines and line > 0:
source_line, offset = trim_source_line(source_lines[line - 1],
DEFAULT_COLUMNS, column, MINIMUM_WIDTH)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use the terminal width?

mypy/util.py Outdated
if self.show_source_code:
# Improve readability by wrapping lines when showing source code.
pad = len(loc) + len('error: ')
max_len = get_term_columns() - pad - 1 # compensate for space after 'error:'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of suspicious -- how will this work in the daemon? Should the client pass the terminal width to the server as part of each request, since the terminal may change between invocations?

Anyway, it might be better if this was provided by the caller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of suspicious -- how will this work in the daemon?

I was going to fix this after the other PR lands. Essentially yes, this should be passed together with is_tty from the other PR.

@ilevkivskyi
Copy link
Member Author

@JukkaL thanks for review, I agree with all comments I didn't reply above.

@JukkaL
Copy link
Collaborator

JukkaL commented Sep 2, 2019

I actually like the new position more, since we are already taking much more vertical space, why not save a bit of horizontal space? Otherwise the line with wiggly arrow will have few information on it.

I think it's okay that the wiggly arrow is on its own line without anything else. The savings in horizontal space will be pretty minor, since error codes tend to be short.

Here's some reasoning why I'd rather not add the error code after the wiggly arrow (besides that it subjectively looks odd to me):

  • It makes the error code more prominent, even though most of the time it's the least interesting bit of information in the output. Now it kind of jumps at you, since it's the only alpha-numeric information on the line with the wiggly arrow.
  • It arguably changed the error formatting more than is necessary from the default mode. If we keep the error code in the original location, the new mode primarily adds some extra lines to the output but otherwise doesn't change things, which I like since it's simple and consistent. (Okay, it also does line wrapping, but that can be argued for on the basis of making it easier to see where the source line starts.)
  • Other tools tend to have the underlining/caret alone on a separate line, at least most of the time, so this seems to be a popular way of doing things. We don't need to follow what others do, but if we are unsure about something, falling back to common conventions may be a reasonable default.
  • We might want to put some other information in the location after/below the underlining in the future. See https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come.html for some relevant Rust examples. Clang can also add some notes near the underlining (https://clang.llvm.org/diagnostics.html). I'd prefer to leave the space empty for future experiments. We could always move the error code back in the future, but it's a bit awkward if we have to move back and forth on an issue.

It is hard to define what is "token". [...]

Yes, there is a risk of doing something confusing. However, I think that the current approach is also confusing sometimes, especially if the error is at the end of the line, or if the target is a name that is much longer/shorter than the squiggle. I initially thought that it underlines a span of code, and was confused when this wasn't the case. A conservative option would be to only show a caret (^) and no squiggles, similar to what gcc does (or did), in the first implementation. We can always iterate on this later.

Here are other heuristics that could be built on top of the "caret only" approach that seem reasonably safe bets to me (but I haven't thought about this very carefully):

  • If the target is an identifier or a keyword, underline the identifier/keyword.
  • If the target starts an int/float/string/bytes literal, underline the literal.
  • Otherwise just show a caret.

As long as we can't always find the end span of an AST node, we may want to allow some extra options to be passed along with error messages that may affect how the length of the underlining is determined. For example, if the target expression is an attribute expression foo.bar, we can pass this as a flag, and the highlighter can try to find such an expression at the location using some heuristics. Yes, this will get increasingly hard to do if we want this to work always.

@ilevkivskyi
Copy link
Member Author

After addressing some comments it looks like this. I will continue tomorrow.

Screenshot 2019-09-02 22 07 47

@JukkaL
Copy link
Collaborator

JukkaL commented Sep 3, 2019

Thanks for the updates! I like how it looks now.

@emmatyping
Copy link
Member

Wow! This looks really good. Thank you for working on it Ivan!

@ilevkivskyi
Copy link
Member Author

@ethanhs I am glad you like it :-)

@JukkaL This is ready for another review.

@ilevkivskyi
Copy link
Member Author

Opened follow-up issues #7453 and #7454

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! There doesn't seem to be any tests for the dmypy client change -- maybe add at least one? Even if the change is trivial, it might get more complex in the future.

"Long[Type, Names]" are never split.
^^^^--------------------------------------------------
num_indent max_len
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the detailed comment!

@ilevkivskyi ilevkivskyi merged commit 91adf61 into python:master Sep 4, 2019
@ilevkivskyi ilevkivskyi deleted the show-code-snippet branch September 4, 2019 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Print source line with error and underline target of error
3 participants