Skip to content

bpo-43950: Specialize tracebacks for subscripts/binary ops #27037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jul 12, 2021

Conversation

isidentical
Copy link
Member

@isidentical isidentical commented Jul 5, 2021

Examples;

 $ ./python t.py
Traceback (most recent call last):
  File "/home/isidentical/cpython/cpython/t.py", line 10, in <module>
    add_values(1, 2, 'x', 3, 4)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/isidentical/cpython/cpython/t.py", line 2, in add_values
    return a + b + c + d + e
           ~~~~~~^~~
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Traceback (most recent call last):
  File "/home/isidentical/cpython/cpython/t2.py", line 10, in <module>
    response['data']['segment1']['segment2']['segment3']
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

https://bugs.python.org/issue43950

@isidentical
Copy link
Member Author

For the traceback.py changes (and tests) I guess we should wait for @ammaraskar's changes on the traceback.py module.

@isidentical isidentical force-pushed the traceback-specialization branch from cc31c2d to 413d601 Compare July 5, 2021 18:06
@@ -512,8 +516,143 @@ _Py_DisplaySourceLine(PyObject *f, PyObject *filename, int lineno, int indent, i
return err;
}

/* AST based Traceback Specialization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought from the title of the PR that it's related to Mark's bytecode specializations. Are you settled on using this word? Seems a bit overloaded now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not relevant at all, but I can see the confusion. Any suggestions on alternative renaming?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's how clang's documentation talks about their carets: https://clang.llvm.org/diagnostics.html

So following from there maybe something like Precision traceback or Pointing traceback because we're pointing to a specific part in the expression or trying to make it more precise?

@pablogsal
Copy link
Member

@isidentical this may need rebasing now that the other PR is merged

@pablogsal
Copy link
Member

Let's focus on land this one next @ammaraskar @isidentical

@isidentical
Copy link
Member Author

@ammaraskar after your _format_traceback (or even before) can you try to port this to the traceback.py? I think it is going to look something like this but haven't integrated printing logic / tests yet.

+def _extract_anchors_from_segment(segment):
+    import ast
+
+    tree = ast.parse(segment)
+    if len(tree.body) == 1:
+        statement = tree.body[0]
+        match statement:
+            case ast.Expr(expr):
+                match expr:
+                    case ast.BinOp():
+                        operator_str = segment[expr.left.end_col_offset:expr.right.col_offset]
+                        operator_offset = len(operator_str) - len(operator_str.lstrip())
+                        return operator_offset, operator_offset + 1
+                    case ast.Subscript():
+                        return expr.value.end_col_offset, expr.slice.col_offset
+    return -1, -1

@ammaraskar ammaraskar force-pushed the traceback-specialization branch from 94801f4 to 26e900e Compare July 8, 2021 18:11
@isidentical isidentical marked this pull request as ready for review July 8, 2021 19:12
@isidentical isidentical force-pushed the traceback-specialization branch from d77011d to d79d365 Compare July 10, 2021 09:07
Copy link
Member

@ammaraskar ammaraskar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some basic stylistic suggestions mostly, looks good to me.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@isidentical isidentical force-pushed the traceback-specialization branch from 4a9a967 to 26430d4 Compare July 10, 2021 17:54
@isidentical isidentical added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jul 10, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @isidentical for commit 26430d4 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jul 10, 2021
@ammaraskar
Copy link
Member

Ping @pablogsal

@isidentical
Copy link
Member Author

Apparently ASAN is failing, though not sure if it is relevant.

@isidentical isidentical reopened this Jul 11, 2021
err = PyFile_WriteString(" ", f);
if (err < 0) {
goto done;
if (source_line) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this guarded by source_line? How we can correctly calculate the offsets if we cannot get the line to calculate the characters? How can tHe code path when this is false and we continue with the bytes be corrent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isidentical This still applies, could you please check out what's going on here?

@pablogsal
Copy link
Member

@isidentical I have pushed 081738d to greatly simplify the code structure. I think it reads better now. Please, check it out

@isidentical
Copy link
Member Author

Thanks @pablogsal, it LGTM. Feel free to merge!

@pablogsal
Copy link
Member

@isidentical Please, check out #27037 (comment)

@isidentical
Copy link
Member Author

Ah, I think that is not relevant. I just checked again, and it seems like source_line should be always non-NULL, though since we are dealing with value propagation over functions it might be still wiser to just replace the if (source_line) { ... } with assert(source_line); ....

@isidentical
Copy link
Member Author

Also I am not really sure if we get any cases where primary/secondary will be different, but if so I think it would be nicer to make amendments to the traceback.py logic to do the same.

@pablogsal
Copy link
Member

Also I am not really sure if we get any cases where primary/secondary will be different, but if so I think it would be nicer to make amendments to the traceback.py logic to do the same.

In this PR or in a different one?

@pablogsal
Copy link
Member

Ah, I think that is not relevant. I just checked again, and it seems like source_line should be always non-NULL, though since we are dealing with value propagation over functions it might be still wiser to just replace the if (source_line) { ... } with assert(source_line); ....

We should also add some tests where we fail to parse the source. Is this covered currently?

@isidentical
Copy link
Member Author

In this PR or in a different one?

This PR.

@pablogsal
Copy link
Member

Also I am not really sure if we get any cases where primary/secondary will be different, but if so I think it would be nicer to make amendments to the traceback.py logic to do the same.

Also, I am not very sure what you refer to with "will be different". Can you elaborate?

We should also add some tests where we fail to parse the source. Is this covered currently?

I don't see this covered. Could you add a test for this?

@isidentical
Copy link
Member Author

isidentical commented Jul 12, 2021 via email

@isidentical isidentical force-pushed the traceback-specialization branch from 5d2036a to a67c8aa Compare July 12, 2021 17:28
@isidentical isidentical force-pushed the traceback-specialization branch from a67c8aa to 045c491 Compare July 12, 2021 18:03

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@ammaraskar
Copy link
Member

We now have a test for when parsing fails and the source_line conditional is gone. Anything else to do here?

@pablogsal
Copy link
Member

We now have a test for when parsing fails and the source_line conditional is gone. Anything else to do here?

Yes, the most important step: celebrate 🎉

(also pray for the buildbots to not fail 😉 )

@pablogsal pablogsal deleted the traceback-specialization branch July 12, 2021 19:33
jeanas added a commit to jeanas/pygments that referenced this pull request Dec 28, 2021
Since python/cpython#27037, they can include
tildes in addition to the carets.
Anteru pushed a commit to pygments/pygments that referenced this pull request Dec 29, 2021
Since python/cpython#27037, they can include
tildes in addition to the carets.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants