-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-33550: Warn not to set SIGPIPE to SIG_DFL #6773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. When your account is ready, please add a comment in this pull request Thanks again to your contribution and we look forward to looking at it! |
I have signed the CLA and made an account under bug.python.org as "splbio".
thank you. |
Doc/library/signal.rst
Outdated
for x in range(10000): | ||
print("y") | ||
|
||
if __name__ == '__main__': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two issues:
- this combines example code with test code: the flushes are there to force the issue, while the Exception handler is there solve the issue. A user might think they need the flushes.
- It is more typical, in code I've seen, to have nothing under the
if __name__
check other than a call to main and to have setup code happen inmain
or similar. Note that this might not be standard python style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this combines example code with test code: the flushes are there to force the issue, while the Exception handler is there solve the issue. A user might think they need the flushes.
The user does need the flushes, otherwise some stdout may be buffered until after the python program exits the try/except block. You can demonstrate this by removing the flush and piping this program into head -1
.
That said the flush on stderr is probably not needed and I can remove it.
I've spun another version moving the code into main() and removing the flush of stderr. The flush of stdout must remain as far as I can tell.
Diff feedback taken, moved all code into main(). |
@grimreaper Do I need to open an issue in the bug tracker for this and/or create a news entry? Or are small doc changes such as this exempt from both or one of these requirements? ty. |
I am not a core developer and am not fully aware of the customs here. From what I've seen even more trivial changes get a bug and a news entry (via "blurb") so I'd expect this one to as well. |
Doc/library/signal.rst
Outdated
Note on SIGPIPE | ||
--------------- | ||
|
||
Do not set set :const:`SIGPIPE`'s disposition to :const:`SIGDFL` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set is doubled.
I think SIG_DFL has an underscore, both in Python and in C.
Doc/library/signal.rst
Outdated
|
||
This will cause your program to exit in unexpected ways because Python | ||
will exit if a write to any socket which is the remote side has closed | ||
or dropped occurs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would make more sense without is. Also may be easier to understand with occurs closer to write: “a write occurs to any socket which the remote side has closed or dropped.”
@vadmium The issues noted are cleaned up. Thank you. |
651e623
to
51bb421
Compare
@vadmium, is this ready to go? |
a5a7f0f
to
3240b8a
Compare
@@ -0,0 +1 @@ | |||
Warn not to set SIGPIPE to SIG_DFL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A NEWS entry is not required for small documentation patches.
88a057b
to
e307258
Compare
@berkerpeksag I have removed the unneeded news entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn’t read the proposal thoroughly before, sorry. I agree that Python’s documentation or library could support better handling of interrupts and broken pipes in end-user programs. But I have a few concerns.
One is that you seem to confuse the signal SIGPIPE with the BrokenPipeError exception. Yes, outputting to a broken pipe can cause a SIGPIPE signal, but so can os.kill. If Python ignores this signal, it has no effect on its own. It is the OS setting errno to EPIPE that triggers BrokenPipeError.
The other is about the second paragraph. You discourage setting SIGPIPE to SIG_DFL, because that may “cause your program to exit unexpectedly”. But the alternative seems to be calling os._exit without even an error message. This is subtler than the default Python exception message, and the SIG_DFL behaviour, where you may at least discover that Python was terminated by SIGPIPE by checking the equivalent of os.WTERMSIG.
A better solution is often to ignore BrokenPipeError or skip the output and continue to the next stage in the program.
Doc/library/signal.rst
Outdated
# while inside this try block. | ||
sys.stdout.flush() | ||
except BrokenPipeError: | ||
os._exit(1) # Python exits with error code 1 on EPIPE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t understand why you suggest os._exit. Usually people use SystemExit or sys.exit, which triggers try / finally handlers, flushes sys.__stderr__
, calls atexit handlers, and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use os.exit() the following output happens:
22:56:46 ~ % cat t2.py
import os
import sys
def main():
try:
for x in range(10000):
print("y")
# flush output here to force SIGPIPE to be triggered
# while inside this try block.
sys.stdout.flush()
except BrokenPipeError:
sys.exit(1) # Python exits with error code 1 on EPIPE
if __name__ == '__main__':
main()
22:56:50 ~ % python3 t2.py | head -1
y
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
22:56:52 ~ %
Since I can't seem to find a way for python NOT to flush on sys.exit(), this code works, but is very ugly, however it's safer since you are correct about os._exit() skipping shutdown handlers....
This works, so maybe we should amend the code to be like this (removing the unneeded comments and writes to stderr):
22:59:21 ~ % cat t.py
import os
import sys
def main():
try:
for x in range(10000):
print("y")
# flush output here to force SIGPIPE to be triggered
# while inside this try block.
sys.stdout.flush()
except BrokenPipeError:
sys.stderr.write("BPE 1\n")
try:
sys.stdout.close()
#sys.stdout.flush()
pass
except BrokenPipeError:
sys.stderr.write("BPE 2\n")
pass
#os._exit(1) # Python exits with error code 1 on EPIPE
sys.exit(1) # Python exits with error code 1 on EPIPE
finally:
sys.stdout.close()
if __name__ == '__main__':
main()
22:59:26 ~ % python3 t.py | head -1
y
BPE 1
BPE 2
22:59:47 ~ %
The problem here is that python doesn't seem to have a method to "fpurge(3)" a stream which would work better than double closing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to work... I wish there was a way to "fpurge(3)" a python file object instead.
import os
import sys
def main():
try:
for x in range(10000):
print("y")
# flush output here to force SIGPIPE to be triggered
# while inside this try block.
sys.stdout.flush()
except BrokenPipeError:
devnull = os.open(os.devnull, os.O_WRONLY)
os.dup2(devnull, sys.stdout.fileno())
sys.exit(1) # Python exits with error code 1 on EPIPE
if __name__ == '__main__':
main()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also python doesn't exit with the correct status code when it catches sigpipe, it returns with exit code 1. (I tested this with Lukasz)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the latest iteration. We just need to comment on why we're duping stdout to devnull. As simple as "Python flushes standard streams on exit; redirect remaining output to devnull to avoid another BrokenPipeError at shutdown".
print("y") | ||
# flush output here to force SIGPIPE to be triggered | ||
# while inside this try block. | ||
sys.stdout.flush() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it would be clearer if you made it an indefinite while True loop; then you wouldn’t need the flush
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to give a recipe which covers common uses. Showing that it's important to use sys.stdout.flush() because not all programs are an infinite loop, a significant portion do a single task outputting some output and that flush is required to avoid the pipe error.
@@ -503,3 +503,32 @@ be sent, and the handler raises an exception. :: | |||
|
|||
signal.alarm(0) # Disable the alarm | |||
|
|||
Note on SIGPIPE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I don't think this is suitable for the signal module documentation. Maybe we can create a HOWTO document for signals? A HOWTO document would also allow us adding more best practices in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Until we have such document, I'm fine putting this in the documentation as is so it's searchable.
Doc/library/signal.rst
Outdated
--------------- | ||
|
||
Piping output of your program to tools like :manpage:`head(1)` will | ||
cause a SIGPIPE signal to be sent to your process when the receiver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SIGPIPE -> :const:`SIGPIPE`
7a0511b
to
f9c05c4
Compare
Updates from @berkerpeksag (const for SIGPIPE) and @ambv (comments why devnull dup2'd over stdout) taken. Also rebased onto master. I wonder if there should be a standard decorator for this since it's pretty problematic pattern? I may make a package shortly. Advice please? |
I've inlined here a program that illustrates why setting SIGPIPE to SIG_DFL is insidious and the subtle nature of the bugs it introduces. Please run it with:
And you will see it exit when trying to do network communications. This is the "unexpected behavior/exit" that I am warning against and why os._exit() while not optimal is actually not as bad as SIGPIPE->SIG_DFL. I understand that this is a complex corner case, but really a python program should almost never, ever, set SIGPIPE to SIG_DFL or it risks the entire program shutting down unexpectedly if it does interactions with sockets as part of its lifecycle.
|
f9c05c4
to
f129008
Compare
@ambv: Please replace |
GH-8798 is a backport of this pull request to the 3.7 branch. |
(cherry picked from commit a251073) Co-authored-by: Alfred Perlstein <[email protected]>
* master: (107 commits) bpo-22057: Clarify eval() documentation (pythonGH-8812) bpo-34318: Convert deprecation warnings to errors in assertRaises() etc. (pythonGH-8623) bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (pythonGH-8741) bpo-34415: Updated logging.Formatter docstring. (pythonGH-8811) bpo-34432: doc Mention complex and decimal.Decimal on str.format not about locales (pythonGH-8808) bpo-34381: refer to 'Running & Writing Tests' in README.rst (pythonGH-8797) Improve error message when mock.assert_has_calls fails (pythonGH-8205) Warn not to set SIGPIPE to SIG_DFL (python#6773) bpo-34419: selectmodule.c does not compile on HP-UX due to bpo-31938 (pythonGH-8796) bpo-34418: Fix HTTPErrorProcessor documentation (pythonGH-8793) bpo-34391: Fix ftplib test for TLS 1.3 (pythonGH-8787) bpo-34217: Use lowercase for windows headers (pythonGH-8472) bpo-34395: Fix memory leaks caused by incautious usage of PyMem_Resize(). (pythonGH-8756) bpo-34405: Updated to OpenSSL 1.1.0i for Windows builds. (pythonGH-8775) bpo-34384: Fix os.readlink() on Windows (pythonGH-8740) closes bpo-34400: Fix undefined behavior in parsetok(). (pythonGH-4439) bpo-34399: 2048 bits RSA keys and DH params (python#8762) Make regular expressions in test_tasks.py raw strings. (pythonGH-8759) smtplib documentation fixes (pythonGH-8708) Fix misindented yaml in logging how to example (pythonGH-8604) ...
(cherry picked from commit a251073) Co-authored-by: Alfred Perlstein <[email protected]>
https://bugs.python.org/issue33550
An anti-pattern often seen in python programs is reset SIGPIPE handling to SIG_DFL (process should terminate) for cli tools in order to avoid "messy" output on stderr when the cli tool is piped into a program such as head(1).
The problem with doing this is that if your program happens to talk to anything via sockets or pipes (even via a library) setting SIGPIPE handling to SIG_DFL means that if the remote side closes the connection, or the network drops, your program will instantly terminate instead of following any retry logic. This results in difficult to debug "why is my tool exiting for no reason?" bugs for cli tools.
Since over the years I've seen this anti-pattern often enough, and often enough it's caused problems in the various environments I've worked in, it deserves some form of documentation somewhere.
https://bugs.python.org/issue33550