-
Notifications
You must be signed in to change notification settings - Fork 25
Improve ExitCodeException
Show
instance
#83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It seem like this uses But I don't really understand why we strip anyway. Isn't that misleading? It seems to me it would be better to add a newline after printing the output regardless of whether it also ended with a newline. |
I pushed a commit that, I think, corrects a test (I didn't really understand why that was broken) and another one that does no stripping. Personally I prefer the no stripping behavior, because of the principle of least surprise. However, if you want to add |
(But preferably stripping done using ASCII, not |
@tomjaguarpaw How about only stripping the end of the output? This would normalize tools writing zero, one, or more newlines but keep leading whitespace intact. |
Before, the arrangement of newlines in the `ExitCodeException` `Show` instance grouped stdout closer to the stderr header than the stdout header: ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr If there was no trailing newline for the stdout, the output would be formatted with no newline between the end of the stdout and the start of the stderr header: ghci> readProcess_ $ proc "sh" ["-c", "nix path-info --json nixpkgs#agda && false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "nix path-info --json nixpkgs#agda && false" Standard output: [{"path":"/nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3","valid":false}]Standard error: these 5 paths will be fetched (18.30 MiB download, 133.19 MiB unpacked): /nix/store/5q0kb0nqnqcfs7a0ncsjq4fdppwirpxa-Agda-2.6.4.3-bin /nix/store/xmximjjnkn0hm4gw7akc9f20ydz6msmk-Agda-2.6.4.3-data /nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3 /nix/store/b49sa2q0yb3fd14ppzh6j6rm8vvgr9n6-ghc-9.6.6-with-packages /nix/store/vharimf7f2glj4fyhiglzws0qyv4xrry-libraries Now, the output is grouped more consistently and displays nicely regardless of trailing or leading newlines in the output: ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr ghci> readProcess_ $ proc "sh" ["-c", "nix path-info --json nixpkgs#agda && false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "nix path-info --json nixpkgs#agda && false" Standard output: [{"path":"/nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3","valid":false}] Standard error: these 5 paths will be fetched (18.30 MiB download, 133.19 MiB unpacked): /nix/store/5q0kb0nqnqcfs7a0ncsjq4fdppwirpxa-Agda-2.6.4.3-bin /nix/store/xmximjjnkn0hm4gw7akc9f20ydz6msmk-Agda-2.6.4.3-data /nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3 /nix/store/b49sa2q0yb3fd14ppzh6j6rm8vvgr9n6-ghc-9.6.6-with-packages /nix/store/vharimf7f2glj4fyhiglzws0qyv4xrry-libraries The `Show` instance for `ProcessConfig` has also been touched up, removing edge cases like an empty "Modified environment" header: ghci> putStrLn $ show $ setEnv [] $ proc "sh" [] Raw command: sh Modified environment: Extraneous trailing newlines in `Show` instances have also been removed.
2a56450
to
dcadf7f
Compare
Alright, I've pushed a commit to remove the whitespace stripping behavior from the Here's the normal case, where standard output ends with a newline.
Meanwhile, if we have a command that includes both stdout and stderr and doesn't output a newline at the end of its stdout, the blank line separating the
Also,
|
dcadf7f
to
189aa42
Compare
Thanks. I appreciate this version doesn't do everything you want, but I'm much more comfortable with it, so if you consider it an improvement let's go for it. You are welcome to subsequently continue to advocate for your desired end goal. However, this still uses |
What encoding does This To (hopefully) give some weight to my opinion here, I'm a co-author for the L2/21-235 proposal which added the Symbols for Legacy Computing Unicode block. |
And in fact
This is a bug, this is almost always the wrong behavior on any computer newer than the 1990s, and it's easy to fix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No comments on the code itself, but some ideas on how to make the tests easier on the eye.
Co-authored-by: Simon Hengel <[email protected]>
56fcb70
to
c05ee1f
Compare
[Sorry, pressed Enter too early. Comment to follow.] |
This is a fair point. I take the point that if we're going to choose anything then UTF-8 is the most inclusive choice. The current version doesn't strip terminal control codes, for example! My personal preference would to be to make the choice explicit by not displaying stdout and stderr in the I went back to look at where the choice of Since he is the primary maintainer of the repository I'll leave the final call to him. Thank you for your patience so far @9999years. |
Any updates here? |
@tomjaguarpaw I don't have any strong opinions here, I'm totally comfortable with you making whichever decision you feel best about. |
I'm finding this all quite confusing, because the library supports a variety of sorts of behaviour, and I don't see any principles on which to judge current behaviour "right" (should be preserved) or "wrong" (needs changing). I agree that when you use
However, if you run the
Furthermore, if you use So I guess if one doesn't like the formatting of the exception in @9999years, please let me know what you think. |
@tomjaguarpaw I'm not sure what you're asking. Is there an ambiguity in my previous comments? There's a couple small changes here:
I believe I've explained why I believe all three of these changes are good defaults that improve the status quo of the library. In all cases, escape hatches are available (as you've noted) for users that need more specific behavior, e.g. for legacy computing applications that require non-standard character encodings. This is ultimately a very small change to improve the ergonomics for the majority of users. |
I'm not trying to suggest your previous comments are ambiguous. I'm asking what you think of my analysis in my most recent comment. I personally don't like the existing design of including I also don't particularly see the point of including the This is all to say, I don't understand the design of Regarding the change to the printing of environment of |
This misses the point, in my opinion. As I've said before, In reality, writing to
This is fair, and if you'd like to default to WTF-16 on Windows I don't have any objections. However, even on Windows, UTF-8 is more and more common these days.
Absolutely not! Remember, this patch is about useful defaults. The point is for the error message to be helpful. You do not generally want Haskell programs to fail with "Exit code 1" without any additional information about which process exited with code 1 (or what that process wrote to
Frankly, I'm not really sure what there is to understand: it runs a process and throws an exception if the process fails. The exception should include as much information as possible to assist users with debugging failures.
That's correct. (I'd like to improve the situation with a fix to #82.) Happy to change it if you'd like, but I really just want to get this merged. If you have any concrete objections, please let me know, but given that we've been running in circles for nearly a year I think it's time to merge this PR. |
As an example, I regularly get annoyed at Haskell programs like |
@9999years, I appreciate that from your point of view this PR may look like a obvious slamdunk to improve the library and help users. However, I want to ensure I understand
I apologise if these all seem terribly obvious. Regrettably, that they are not obvious to me, and I request your continued patience. Exhortations to "just get this merged" because "it's time to merge this PR" are likely to have the opposite of the effect you desire. To help set expectations, I am not yet in the "just get this merged" stage of the PR review process, I'm in the "what is the current design even for, and what alternative approach may exist" stage. The purpose, from my point of view, of this back and forth is to make progress on those above three goals. I'm sorry if the back and forth seems not to directly serve your purpose.
I sort of agree. I think the library "should do the right thing as often as possible" and "need not be perfect". However, I don't see that applies to
I agree, but I don't see why There are three components to this PR:
Item 1 seems important! Output is garbled for anyone using non-ASCII. Item 2 seems like a step backwards to me. I'm happy to discuss it further, but it seems like it warrants it's own PR or issue. Item 3 seems fine, but I think it should be discussed separately, because otherwise it's just going to get lost amongsts the weeds of 1 and 2. Now, the continued discussion has been very helpful to me because I can now understand the purpose of do
(out1, err1) <- readProcess_ ...
(out2, err2) <- readProcess_ ...
(out3, err3) <- readProcess_ ... whilst exiting the Haskell process with an informative error (including the process configuration, stdout and stderr) on subprocess exit failure. It also supports other behaviour: the The problem with the design is that it's impossible to store stdout and stderr as (Ultimately this is a consequence of Haskell's default exception ecosystem, which is very poor in my opinion, and why I recommend wrapped-IO effect systems such as Bluefin or effectful instead. If we had been using one of those this problem would not have arisen! So if we insist on continuing to use an exception in this way we are forced to choose encoding up front.
There is an option which allows us to use an exception yet at least have a hope of determining the correct encoding: determine the encoding at the time of throwing the exception! We could inspect the terminal settings and store the encoding in the If we don't insist on using an exception at all we can take a different approach, for example, we could define So, the choice before us seems to be between doing either, both, or neither of changing
I would welcome further thoughts on this analysis (I am not attempting to suggest that anything anyone has written to this point is ambiguous in any way). |
I don't think that's even necessary, I think you could look at LC_CTYPE/LANG/etc of the run process (and thus during exception creation). But that's not any guarantee that's what actually is output into the terminal. However, I have no idea what the encoding conversion situation is like in Haskell, and I especially have no idea if people using these alternate encodings are actually setting the libc encoding settings (I expect often not). However, this is besides the point. Improving the display to not corrupt it by default in the majority of cases is a strict improvement over the status quo which corrupts it much more often; the display here is for user facing purposes, and it is a rare state of affairs that people are not using UTF-8 in practice. Making this change does not make any new promises that might be broken later when further incremental improvements are done, such as supporting non-UTF-8 encodings. I am pretty sure that e.g. Python and Rust don't bother doing any non-UTF-8 encodings for their equivalent features. The more likely case of it getting garbled is just processes outputting non-textual output. That just is what it is. Not much to be done about that, besides using a unicode replacement character on invalid characters so it doesn't break anything too badly.
This is really not a usable solution if anyone uses it from a web app or a library used by one, since it detaches the terminal output from the exception context. Or, basically anything else that sends exceptions as telemetry off the machine. Having a library unexpectedly write to stderr is highly surprising behaviour and I don't think it's a good option. If someone cares a lot about non-UTF-8 encodings, they can implement a custom exception handler or similar today, so they are not blocked from accomplishing what they want by a new default exception rendering.
This is valid to object to, but I must also point out that e.g. my work project has 20kb of process environment, and in general the process environment is full of secrets a lot of the time in a lot of projects so can be problematic to log. Perhaps it should be split into a separate change. |
I agree, that's why I wrote "probably to UTF-8, because although that can technically break, and an ASCII+hex can't, it's most likely to actually be what's wanted".
Right, this falls under my caveat:
I don't particularly like that use case. If telemetry logs exceptions, why not also log stderr? (And why take the risk of unhandled exceptions in any case? My view is that exception handling should be done in a well-scoped way with a wrapped-IO effect system such as Bluefin or effectful, but sadly that's getting into dreamland ...). But, if people are using
I agree, but what we're debating here is the default. I don't particularly like defaults that work in most cases. I prefer defaults that work in all cases. That's why I use Haskell in the first place. However, I can sense practicality is going to beat out purity in this particular case. So I am strongly leaning to assuming UTF-8 as a solution to #86, and leaving the newline and environment stuff for later discussion.
This PR doesn't address that! Both before and after this PR, if you've used |
The use case is that someone does this in a library and throws an exception at us, and maybe we miss it and have to debug it. The stderr of basically every web app is going into a logger, yes, but it is likely a completely different logger (system logs) than the exceptions, which might, e.g. go into Bugsnag, Honeycomb, etc. If random stuff is printed into stderr, it immediately loses the context of the web request and any other context, and will likely be interleaved with random other logs, so it is going to be nearly impossible to find. I've had to debug glibc assertion failures in prod before, which print into stderr, and finding the stderr message in the haystack of other stuff on the box is really hard. Printing to stderr really does not work except in tiny systems.
The secrets are coming from outside the process, and |
When I said "exception handling should be done in a well-scoped way" I really meant it, it "should" be done in the libraries you use! But, I acknowledge that's unrealistic and people have exception handlers in place to deal with unhandled exceptions from libraries. I'll use the opportunity to soapbox a bit. We're in a terrible local optimum. In this MR it is proposed to improve a function that throws untracked exceptions for convenience. In fact it's convenient because people already have exception handlers in place to catch and log untracked exceptions, so if the function goes wrong the error message already goes to the right place. And the reason people have those handlers in place is because they're useful when calling code that throws untracked exceptions. So we have a horrible vicious cycle: untracked exceptions proliferate because there's already infrastructure to deal with them, and the infrastructure to deal with them proliferates because there are so many untracked exceptions. Anyway, that's other people's problem! I don't have that problem because I use well-scoped exceptions. But I do want to come up with better designs when possible. In the case of this MR, no better design immediately presents itself.
Actually, this is wrong.
But this PR doesn't change behaviour on inherit either! Both before and after this PR, if you inherit your environment you don't see it in the
The only difference is when
After this MR:
Personally I prefer the explicit approach of the status quo. In any case, it will have to be done as a separate change. |
Split off of fpco#83. Before, `ProcessConfig`'s `Show` output would include a trailing newline. This has been fixed, so that derived `Show` output does not include newlines in weird places. Before: ghci> data Foo = Foo { a :: Int, b :: ProcessConfig () () (), c :: String } deriving Show ghci> Foo 1 (proc "echo" ["puppy"]) "doggy" Foo {a = 1, b = Raw command: echo puppy , c = "doggy"} After ghci> Foo 1 (proc "echo" ["puppy"]) "doggy" Foo {a = 1, b = Raw command: echo puppy, c = "doggy"} Whitespace for the `ExitCodeException` `Show` instance has also been adjusted, to place the output closer to the relevant headers. Before: ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr After: *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr Before:
Split off of fpco#83. Before, `ProcessConfig`'s `Show` output would include a trailing newline. This has been fixed, so that derived `Show` output does not include newlines in weird places. Before: ghci> data Foo = Foo { a :: Int, b :: ProcessConfig () () (), c :: String } deriving Show ghci> Foo 1 (proc "echo" ["puppy"]) "doggy" Foo {a = 1, b = Raw command: echo puppy , c = "doggy"} After ghci> Foo 1 (proc "echo" ["puppy"]) "doggy" Foo {a = 1, b = Raw command: echo puppy, c = "doggy"} Whitespace for the `ExitCodeException` `Show` instance has also been adjusted, to place the output closer to the relevant headers. Before: ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr After: *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr Note that because trailing whitespace is not accounted for, it is still possible to get unintuitive results depending on what exactly the subprocess prints: ghci> readProcess_ $ proc "sh" ["-c", "echo -n this is stdout; echo -n this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo -n this is stdout; echo -n this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr
Split off of fpco#83. Before, `ProcessConfig`'s `Show` output would include a trailing newline. This has been fixed, so that derived `Show` output does not include newlines in weird places. Before: ghci> data Foo = Foo { a :: Int, b :: ProcessConfig () () (), c :: String } deriving Show ghci> Foo 1 (proc "echo" ["puppy"]) "doggy" Foo {a = 1, b = Raw command: echo puppy , c = "doggy"} After ghci> Foo 1 (proc "echo" ["puppy"]) "doggy" Foo {a = 1, b = Raw command: echo puppy, c = "doggy"} Whitespace for the `ExitCodeException` `Show` instance has also been adjusted, to place the output closer to the relevant headers. Before: ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr After: *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr Note that because trailing whitespace is not accounted for, it is still possible to get unintuitive results depending on what exactly the subprocess prints: ghci> readProcess_ $ proc "sh" ["-c", "echo -n this is stdout; echo -n this is stderr >&2; false"] *** Exception: Received ExitFailure 1 when running Raw command: sh -c "echo -n this is stdout; echo -n this is stderr >&2; false" Standard output: this is stdout Standard error: this is stderr
Between my submission of this PR last August and three days ago, you made no objections to the design of these features as they currently exist, so I'm not sure why you seem to want to block the PR over these concerns now. If you felt so uncomfortable with the design of I will be splitting up this PR in the hope that you will give me more actionable feedback. |
Before, the arrangement of newlines in the
ExitCodeException
Show
instance grouped stdout closer to the stderr header than the stdout header:If there was no trailing newline for the stdout, the output would be formatted with no newline between the end of the stdout and the start of the stderr header:
Now, the output is grouped more consistently and displays nicely regardless of trailing or leading newlines in the output:
The
Show
instance forProcessConfig
has also been touched up, removing edge cases like an empty "Modified environment" header:Extraneous trailing newlines in
Show
instances have also been removed.