bpo-36829: Add sys.unraisablehook() #13187

Merged
merged 3 commits into from
May 22, 2019

Member

@vstinner vstinner commented May 8, 2019

Add new sys.unraisablehook() function which can be overridden to
control how "unraisable exceptions" are handled. It is called when an
exception has occurred but there is no way for Python to handle it.
For example, when a destructor raises an exception or during garbage
collection (gc.collect()).

https://bugs.python.org/issue36829

@serhiy-storchaka
Member

The original proposal was about conditionally aborting the current process. Raising an exception (in particular by os.exit()) does not work. Should not we add os.abort()? Or are there better alternatives for post-mortem investigation?

Python/errors.c Outdated
args[1] = (exc_value ? exc_value : Py_None);
args[2] = (exc_tb ? exc_tb : Py_None);
args[3] = (obj ? obj : Py_None);
res = _PyObject_FastCall(hook, args, Py_ARRAY_LENGTH(args));
Member

Would it not be better to call _PyErr_WriteUnraisable() if the hook set an exception, instead of silently ignoring it?

Member Author

Hum, it's hard to know if the hook started to log something or not.

Member Author

Maybe we could do something like logging: log a special message when the custom hook fails? We should just prevent recursive calls 😁

Member

There is no recursion. I propose to call _PyErr_WriteUnraisable(), not PyErr_WriteUnraisable(). And pass hook as the obj argument.

Member Author

PyErr_WriteUnraisable() now logs an exception raised in a custom unraisablehook using _PyErr_WriteUnraisable().

PyObject *exc_value, PyObject *exc_tb, PyObject *obj)
/*[clinic end generated code: output=f83f725d576ea3d3 input=2aba571133d31a0f]*/
{
return _PyErr_WriteUnraisable(exc_type, exc_value, exc_tb, obj);
Member

Why not set sys.unraisablehook to None by default? Or not set it at all? What is the use case for exposing it at Python level?

Member Author

@vstinner vstinner May 8, 2019

I tried to use a design similar to sys.breakpointhook and sys.excepthook. It seems useful to be able to call sys.__unraisablehook__ from a custom hook for example.

Member

There is a difference between sys.breakpointhook and sys.unraisablehook. The former is only called when you explicitly call breakpoint(). The latter can be called implicitly at an arbitrary place and time. It is out of your control. sys.excepthook is also called implicitly, but in known circumstances: when an uncaught exception reaches the top level. I am afraid that calling sys.unraisablehook at an arbitrary place may be unsafe.

Member Author

sys.unraisablehook takes (exc_type, exc_value, exc_tb, obj) arguments, it doesn't rely on the current exception. Why would it be called to call it?

Member Author

Sorry: Why would it be unsafe to call it?

Member

For example because it can be called deep in the C stack.

@graingert
Contributor

@vstinner PyErr_WriteUnraisable can get called during __del__ or GC, so most of the Python environment could have already been deleted by the time the hook is called.

Or the hook could be called after the hook is removed by GC and then we're back to the same problem

@graingert
Contributor

the other PR for reference: #13175

@vstinner
Member Author

vstinner commented May 8, 2019

@vstinner PyErr_WriteUnraisable can get called during __del__ or GC so most of the python environment could have already been deleted by the time the hook is called, Or the hook could be called after the hook is removed by GC and then we're back to the same problem

It works as expected on my tests: https://bugs.python.org/issue36829#msg341868

@vstinner
Member Author

vstinner commented May 8, 2019

Should not we add os.abort()?

That sounds like a reasonable addition, expose C abort() as os.abort().

@vstinner
Member Author

vstinner commented May 8, 2019

Without os.abort(), you can call abort() using "import signal; signal.raise_signal(signal.SIGABRT)". But it only works if the SIGABRT signal handler has not been overridden. Or maybe you can call "signal.signal(signal.SIGABRT, signal.SIG_DFL)" to restore the original signal handler.
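
The idea above could be sketched as a hook that logs the exception and then raises SIGABRT. This is only an illustration, assuming the merged single-argument API and `signal.raise_signal()` (both Python 3.8 or newer). It runs the aborting code in a child interpreter so that the current process survives. `child_source`, `abort_hook` and `Broken` are hypothetical names.

```python
import subprocess
import sys
import textwrap

# Source code for a child interpreter that aborts on the first
# unraisable exception (hypothetical example program).
child_source = textwrap.dedent('''
    import signal
    import sys

    def abort_hook(unraisable):
        # Log with the default hook first, then abort the process.
        sys.__unraisablehook__(unraisable)
        # Restore the default handler in case SIGABRT was overridden.
        signal.signal(signal.SIGABRT, signal.SIG_DFL)
        signal.raise_signal(signal.SIGABRT)

    sys.unraisablehook = abort_hook

    class Broken:
        def __del__(self):
            raise ValueError("boom")

    Broken()  # the temporary object is destroyed immediately in CPython
    print("not reached")
''')

result = subprocess.run(
    [sys.executable, "-c", child_source],
    capture_output=True,
    text=True,
)
```

The child exits abnormally, so `result.returncode` is non-zero; the exact value is platform dependent.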

@vstinner
Member Author

vstinner commented May 8, 2019

Or the hook could be called after the hook is removed by GC and then we're back to the same problem

My implementation uses the default built-in hook if sys.unraisablehook doesn't exist or has been set to None.

I am not sure if we can trick PyImport_Cleanup to ensure that custom hook remains alive as long as possible.

Python/errors.c Outdated
void
PyErr_WriteUnraisable(PyObject *obj)
PyObject*
_PyErr_WriteUnraisable(PyObject *exc_type, PyObject *exc_value,
Member

Make it static.

Member Author

It is called by sys_unraisablehook_impl() in sysmodule.c.

@vstinner
Member Author

vstinner commented May 9, 2019

I modified my PR to log the exception if a custom sys.unraisablehook itself raises an exception: see the new dedicated unit test.

@@ -98,6 +98,12 @@ PyAPI_FUNC(_PyInitError) _Py_PreInitializeFromCoreConfig(
const _PyCoreConfig *coreconfig,
const _PyArgv *args);

PyAPI_FUNC(PyObject*) _PyErr_WriteUnraisable(
Member

Why is it exposed in the header? Should it not be a static function?

Member Author

sys.unraisablehook() calls _PyErr_WriteUnraisable().

Python/errors.c Outdated
goto done;
if (PyFile_WriteObject(obj, f, 0) < 0) {
if (hook_failed) {
if (PyFile_WriteString("sys.unraisablehook failed:", file) < 0) {
Member

Why not show what exception was raised?

Member Author

It is logged (see my comment below).

Member

I often wish PyErr_WriteUnraisable() took an optional C string argument describing the context. In many cases there is no appropriate callable which can be passed as obj, so you pass NULL and lose the context.

Member Author

I added a new _PyErr_WriteUnraisableMsg() function exactly for that. I agree that PyErr_WriteUnraisable(NULL) is hard to debug. I only used it twice to show how it can be used. We can bikeshed later on converting existing PyErr_WriteUnraisable(NULL) to _PyErr_WriteUnraisableMsg() :-)

Python/errors.c Outdated
PyObject *res = _PyObject_FastCall(hook, args, Py_ARRAY_LENGTH(args));
Py_XDECREF(res);
if (res == NULL) {
unraisable_hook_failed();
Member

I suggest to just fall back to the standard handler called below.

if (hook != NULL && hook != Py_None) {
    PyErr_Fetch(&exc_type, &exc_value, &exc_tb);
    ... // call the hook
    Py_XDECREF(exc_type);
    Py_XDECREF(exc_value);
    Py_XDECREF(exc_tb);
    if (res != NULL) {
        goto done;
    }
    obj = hook;
}
write_unraisable(...);
done:

Member Author

I inlined unraisable_hook_failed() to make the fallback more explicit.

Member

Did you forget to push your changes?

Member Author

I didn't write exactly what you proposed, but what I wrote is basically the same without goto. I dislike goto and "obj = hook" here. I prefer a more explicit fallback.

Member Author

Well. I modified my PR to write an explicit fallback as you asked me for :-)

@jdemeyer
Contributor

jdemeyer commented May 9, 2019

I'm not sure that it's safe to run arbitrary Python code any time that PyErr_WriteUnraisable is called. There might be occasions where PyErr_WriteUnraisable is called but where CPython is internally in an inconsistent state where calling Python code would crash the interpreter.

@vstinner
Member Author

vstinner commented May 9, 2019

When a custom hook fails, the exception is logged. For example, test_custom_unraisablehook_fail() checks for "Exception: hook_func failed" in the output.

Another example:

import _testcapi
import sys

def hook_func(*args):
    raise Exception("hook_func failed")

sys.unraisablehook = hook_func
_testcapi.write_unraisable(ValueError(42), None)

Output:

sys.unraisablehook failed:
Traceback (most recent call last):
  File "x.py", line 5, in hook_func
    raise Exception("hook_func failed")
Exception: hook_func failed

@graingert
Contributor

graingert commented May 9, 2019 via email

@vstinner
Member Author

vstinner commented May 9, 2019

I'm not sure that it's safe to run arbitrary Python code any time that PyErr_WriteUnraisable is called. There might be occasions where PyErr_WriteUnraisable is called but where CPython is internally in an inconsistent state where calling Python code would crash the interpreter.

Do you have an example? I'm not aware of such issue.

Python must always remain consistent. Usually, Py_FatalError() is called when an inconsistency is detected.

@vstinner
Member Author

vstinner commented May 9, 2019

If the hook fails, will it abort the process with a non-zero exit code?

No, the second new error is logged. Python should not be aborted when something goes wrong: it should be possible to continue the execution.

Another complete solution for https://bugs.python.org/issue36829 would be to add a new -X command line option to use a specific exit code if at least one "unraisable" exception is logged. So far, nobody showed an example where my hook cannot be used to fix https://bugs.python.org/issue36829

@vstinner
Member Author

vstinner commented May 9, 2019

See https://bugs.python.org/issue36829#msg342000 for PyErr_WriteUnraisable() called very late during Python shutdown.

@vstinner
Member Author

I added "msg" to sys.unraisablehook.

Ok, so I rebased my PR on top of master to reduce the risk of conflicts, I squashed my commits to be able to update the main commit message, and I added a new "msg" parameter to sys.unraisablehook.

The new "msg" parameter allows passing an arbitrary error message rather than the hardcoded "Exception ignored in: ". I added _PyErr_WriteUnraisableMsg() to log an unraisable exception with an error message.

I only used _PyErr_WriteUnraisableMsg() twice where it was appropriate to show how it can be used. If the PR is accepted, a following PR can be written to add a more accurate error message (than the hardcoded error message). I prefer to keep this PR as small as possible.

I also cleaned up the implementation and added more comments.

@vstinner vstinner requested review from pablogsal and methane May 10, 2019 23:29
@vstinner
Member Author

cc @methane @pablogsal

@vstinner
Member Author

@serhiy-storchaka: Would you mind reviewing the updated PR? Do you like the new API?

@vstinner
Member Author

I plan to merge this PR at the end of the week. I addressed all remarks/requests.

The only remaining complaint about this PR is the fact that it doesn't allow catching exceptions raised very late during Python shutdown: https://bugs.python.org/issue36829#msg342001

I confirm that it's a known limitation, but I am not comfortable with killing (SIGABRT) Python when PyErr_WriteUnraisable() is called late during Python finalization: there is no "easy" way to debug such issue. Only a low-level debugger like gdb can maybe provide some info about what happened. Or maybe not since Python finalization already destroyed too many things: modules, thread state, etc.

Python finalization code is very fragile and has been a work in progress for 5 years. I would love to see enhancements, but multiple attempts failed badly, adding new corner cases and triggering new crashes, and so had to be reverted.

My PR is a compromise for the most common cases. IMHO it's a far more generic solution than only allowing the process to be killed with SIGABRT. You are free to log "unraisable exceptions" into a file, raise a signal, send exceptions into a network socket, etc. Basically, whatever you want ;-)

@jdemeyer
Contributor

Do you have an example? I'm not aware of such issue.

I don't know any concrete example, it was just a general comment.

Python must always remain consistent.

I meant temporary inconsistency, when you are in the middle of updating some data structure.

@vstinner
Member Author

I meant temporary inconsistency, when you are in the middle of updating some data structure.

Ok, I see what you mean. If we have such code, it's a bug that should be fixed.

The default hook implementation, current PyErr_WriteUnraisable() implementation, already executes "arbitrary" Python code:

  • call repr(exception_value), which can be an arbitrary Python function
  • get the exception_type.__module__ attribute
  • call sys.stderr.write(), where sys.stderr can be a custom object
  • display the traceback, which calls io.open(filename, "rb"), parses the encoding cookie, creates an io.TextIOWrapper, and calls fp.readline() and then fp.close()
  • etc.

Moreover, the GC is not disabled and can be triggered anytime. So it's important that everything remains consistent when PyErr_WriteUnraisable() is called.
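
One item from the list above, the call to a custom sys.stderr object, can be illustrated from Python. This is a sketch only, assuming Python 3.8 or newer where `sys.__unraisablehook__` exists; `RecordingStream` and `Broken` are hypothetical names.

```python
import sys


class RecordingStream:
    """Hypothetical stand-in for sys.stderr that records every write."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        # Arbitrary Python code, executed by the default hook.
        self.chunks.append(text)
        return len(text)

    def flush(self):
        pass


class Broken:
    def __del__(self):
        raise ValueError("boom")


stream = RecordingStream()
previous_stderr = sys.stderr
previous_hook = sys.unraisablehook
sys.stderr = stream
sys.unraisablehook = sys.__unraisablehook__  # make sure the default hook runs
try:
    obj = Broken()
    del obj
finally:
    sys.unraisablehook = previous_hook
    sys.stderr = previous_stderr

written = "".join(stream.chunks)
```

Every piece of the report passes through `RecordingStream.write()`, which is ordinary Python code running while the exception is being handled.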

@jdemeyer
Contributor

OK, I agree. Like I said, it was just a comment, not a complaint.

@vstinner
Member Author

OK, I agree. Like I said, it was just a comment, not a complaint.

Thanks for sharing your concerns. This issue is very tricky, I wouldn't be surprised if we discover a corner case tomorrow ;-) So having more people looking into the code and how it's used is very helpful!

@vstinner
Member Author

New version of my API. sys.unraisablehook now gets a single argument which has 4 fields:

  • exc_type
  • exc_value
  • exc_tb
  • obj

It becomes possible to extend this object later to add new attributes without breaking backward compatibility (existing hooks used in the wild).

I also reverted unrelated changes.

I removed _PyErr_WriteUnraisableMsg() and the "err_msg" parameter / field. I tried to write a PR as small as possible. Once this PR is merged, I will work on a second PR to add a new err_msg field to the hook and add back the _PyErr_WriteUnraisableMsg() function.

@zooba zooba self-requested a review May 16, 2019 15:48
Member

@zooba zooba left a comment

Minor comments here, I put my bigger thoughts on python-dev :)

@vstinner
Member Author

_PySys_GetObjectId() returns a borrowed reference.

To be clear, I trust no one, especially myself: since I started to work on this PR, I have frequently been running "./python -m test test_sys -R 3:3" to make sure that nothing leaks :-) I added multiple tests in test_sys, and it happened multiple times that I introduced a giant leak because of an obvious bug in my code :-D

@vstinner
Member Author

@pablogsal: I addressed your comments.

Add new sys.unraisablehook() function which can be overridden to
control how "unraisable exceptions" are handled. It is called when an
exception has occurred but there is no way for Python to handle it.
For example, when a destructor raises an exception or during garbage
collection (gc.collect()).

Changes:

* The default hook now ignores exception on writing the traceback.
* Add an internal UnraisableHookArgs type used to pass arguments to
  sys.unraisablehook.
* Add _PyErr_WriteUnraisableDefaultHook().
* test_sys now uses unittest.main() to automatically discover tests:
  remove test_main().
* Add _PyErr_Init().
* Fix PyErr_WriteUnraisable(): hold a strong reference to sys.stderr
  while using it
@vstinner
Member Author

I squashed my commits, rebased them on master and fixed a merge conflict.

vstinner added 2 commits May 20, 2019 01:15
* Rename 'exc_tb' field to 'exc_traceback'
* Rename 'obj' field to 'object'
* Fix PyErr_WriteUnraisable(): don't call sys.unraisablehook if
  exc_type is NULL (this case should not happen)
* Documentation: add links between sys.excepthook
  and sys.unraisablehook
* Fix typo in the doc
* Update PyErr_WriteUnraisable() documentation
* Regenerate importlib.h
@vstinner
Member Author

@serhiy-storchaka @pablogsal @njsmith: Would you mind reviewing my PR adding sys.unraisablehook?

I renamed UnraisableHookArgs fields to (exc_type, exc_value, exc_traceback, object). I also fixed a few more bugs in the implementation to fix a few more corner cases like exc_type == NULL.
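
A sketch of a hook written against the renamed fields. It reads only attributes by name, so it keeps working if fields are added later. Reading `err_msg` through `getattr()` is an assumption made for illustration, since that field is only planned as a follow-up in this conversation. `format_unraisable`, `logging_hook`, the logger name and the `sample` object are hypothetical.

```python
import logging
import traceback
import types

logger = logging.getLogger("unraisable")  # hypothetical logger name


def format_unraisable(unraisable):
    # err_msg is read defensively: it may be missing or None.
    message = getattr(unraisable, "err_msg", None) or "Exception ignored in"
    if unraisable.exc_value is None:
        detail = unraisable.exc_type.__name__ + "\n"
    else:
        detail = "".join(
            traceback.format_exception(
                unraisable.exc_type,
                unraisable.exc_value,
                unraisable.exc_traceback,
            )
        )
    return f"{message}: {unraisable.object!r}\n{detail}"


def logging_hook(unraisable):
    # Candidate value for sys.unraisablehook.
    logger.error(format_unraisable(unraisable))


# Stand-in argument object: the hook depends only on attribute names.
sample = types.SimpleNamespace(
    exc_type=ValueError,
    exc_value=ValueError("boom"),
    exc_traceback=None,
    object="<demo object>",
)
text = format_unraisable(sample)
```

Installing it would be a matter of `sys.unraisablehook = logging_hook`; the formatting function is kept separate so it can be exercised without triggering a real unraisable exception.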

I also completed the documentation.

@vstinner
Member Author

I plan to merge this change next Wednesday (in 2 days). I didn't see any strong opinion on the API, and I'm not convinced by the other proposed APIs.

Once this PR is merged, I'm interested in working on follow-ups (I plan to write one PR per bullet):

  • Add an optional error message to UnraisableHookArgs to override the default "Exception ignored in:"
  • Try to normalize the exception
  • Try to create a traceback object and attach it to the exception
  • Add a helper function to test.support to catch unraisable exception and modify unit tests generating such unraisable exception
  • Maybe also experiment to modify regrtest to log these unraisable exceptions to display them again in the summary at the end

@vstinner vstinner merged commit ef9d9b6 into python:master May 22, 2019
@vstinner vstinner deleted the unraisablehook branch May 22, 2019 09:28
@vstinner
Member Author

Thanks for the reviews, I merged my PR!

@vstinner
Member Author

Follow-up:
