Improve ExitCodeException Show instance #83

Closed
wants to merge 3 commits into from

9999years
Contributor

Before, the arrangement of newlines in the ExitCodeException Show instance grouped stdout closer to the stderr header than the stdout header:

ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"
Standard output:

this is stdout
Standard error:

this is stderr

If there was no trailing newline for the stdout, the output would be formatted with no newline between the end of the stdout and the start of the stderr header:

ghci> readProcess_ $ proc "sh" ["-c", "nix path-info --json nixpkgs#agda && false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "nix path-info --json nixpkgs#agda && false"
Standard output:

[{"path":"/nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3","valid":false}]Standard error:

these 5 paths will be fetched (18.30 MiB download, 133.19 MiB unpacked):
  /nix/store/5q0kb0nqnqcfs7a0ncsjq4fdppwirpxa-Agda-2.6.4.3-bin
  /nix/store/xmximjjnkn0hm4gw7akc9f20ydz6msmk-Agda-2.6.4.3-data
  /nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3
  /nix/store/b49sa2q0yb3fd14ppzh6j6rm8vvgr9n6-ghc-9.6.6-with-packages
  /nix/store/vharimf7f2glj4fyhiglzws0qyv4xrry-libraries

Now, the output is grouped more consistently and displays nicely regardless of trailing or leading newlines in the output:

ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"

Standard output:
this is stdout

Standard error:
this is stderr

ghci> readProcess_ $ proc "sh" ["-c", "nix path-info --json nixpkgs#agda && false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "nix path-info --json nixpkgs#agda && false"

Standard output:
[{"path":"/nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3","valid":false}]

Standard error:
these 5 paths will be fetched (18.30 MiB download, 133.19 MiB unpacked):
  /nix/store/5q0kb0nqnqcfs7a0ncsjq4fdppwirpxa-Agda-2.6.4.3-bin
  /nix/store/xmximjjnkn0hm4gw7akc9f20ydz6msmk-Agda-2.6.4.3-data
  /nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3
  /nix/store/b49sa2q0yb3fd14ppzh6j6rm8vvgr9n6-ghc-9.6.6-with-packages
  /nix/store/vharimf7f2glj4fyhiglzws0qyv4xrry-libraries

The Show instance for ProcessConfig has also been touched up, removing edge cases like an empty "Modified environment" header:

ghci> putStrLn $ show $ setEnv [] $ proc "sh" []
Raw command: sh
Modified environment:

Extraneous trailing newlines in Show instances have also been removed.

@tomjaguarpaw
Collaborator

It seems like this uses text but it has not been added to the dependencies. (If you made changes to the .cabal file, please note that the changes should be made to package.yaml and the .cabal file then regenerated, I think using hpack.)

But I don't really understand why we strip anyway. Isn't that misleading? It seems to me it would be better to add a newline after printing the output regardless of whether it also ended with a newline.

@tomjaguarpaw
Collaborator

I pushed a commit that, I think, corrects a test (I didn't really understand why that was broken) and another one that does no stripping. Personally I prefer the no stripping behavior, because of the principle of least surprise. However, if you want to add showExitCodeExceptionStripped then that's fine by me.

@tomjaguarpaw
Collaborator

(But preferably stripping done using ASCII, not text)

@9999years
Contributor Author

@tomjaguarpaw How about only stripping the end of the output? This would normalize tools writing zero, one, or more newlines but keep leading whitespace intact.
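
A minimal sketch of what end-only stripping could look like, using plain `String` functions from `base` rather than `text`. The names `stripTrailingNewlines`, `renderSection` and `renderOutputs` are made up for illustration and are not the library's API; skipping empty sections is also just this sketch's choice.

```haskell
import Data.List (dropWhileEnd, intercalate)

-- | Drop every trailing newline but leave leading whitespace untouched.
stripTrailingNewlines :: String -> String
stripTrailingNewlines = dropWhileEnd (== '\n')

-- | Render one labelled section, or nothing when the output is empty.
-- (Hypothetical helper, not the library's implementation.)
renderSection :: String -> String -> [String]
renderSection header body =
  case stripTrailingNewlines body of
    "" -> []
    trimmed -> [header ++ ":\n" ++ trimmed]

-- | Join the non-empty sections with exactly one blank line between them.
renderOutputs :: String -> String -> String
renderOutputs out err =
  intercalate "\n\n"
    (renderSection "Standard output" out ++ renderSection "Standard error" err)
```

With this shape, zero, one, or several trailing newlines in either stream all produce the same layout, and the rendered string itself never ends in a newline.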

@9999years 9999years force-pushed the fix-exitcodeexception-show branch 2 times, most recently from 2a56450 to dcadf7f Compare September 7, 2024 00:14
@9999years
Contributor Author

Alright, I've pushed a commit to remove the whitespace stripping behavior from the Show instance, but I do think it makes the exceptions harder to read in a lot of cases and less consistent across the board.

Here's the normal case, where standard output ends with a newline. print doesn't expect a Show instance to end with a newline, so it outputs a blank line at the end:

ghci> e stdout stderr = ExitCodeException { ... }
ghci> print $ e "<STDOUT>\n" ""
Received ExitFailure 1 when running
Raw command: echo

Standard output:
<STDOUT>

ghci>

Meanwhile, if we have a command that includes both stdout and stderr and doesn't output a newline at the end of its stdout, the blank line separating the Standard output: and Standard error: sections disappears:

ghci> print $ e "<STDOUT>" "<STDERR>"
Received ExitFailure 1 when running
Raw command: echo

Standard output:
<STDOUT>
Standard error:
<STDERR>

Also, Show instances that end with newlines make values that contain them print quite clumsily (see the line break before the comma here):

ghci> data Foo = Foo { a :: Int, b :: ExitCodeException, c :: String } deriving Show
ghci> Foo 1 (e "<STDOUT>\n" "") "hello"
Foo {a = 1, b = Received ExitFailure 1 when running
Raw command: echo

Standard output:
<STDOUT>
, c = "hello"}

@9999years 9999years force-pushed the fix-exitcodeexception-show branch from dcadf7f to 189aa42 Compare September 7, 2024 00:25
@tomjaguarpaw
Collaborator

Thanks. I appreciate this version doesn't do everything you want, but I'm much more comfortable with it, so if you consider it an improvement let's go for it. You are welcome to subsequently continue to advocate for your desired end goal.

However, this still uses text and assumes UTF-8, which I am not comfortable with. I don't understand why it makes this assumption. It's essentially doing decodeUtf8 and then immediately T.unpacking into a String. Why not just keep the L8.unpack?

@9999years
Contributor Author

9999years commented Sep 7, 2024

Why not just keep the L8.unpack?

What encoding does L8.unpack presume? It doesn't appear to be documented. Picking an encoding is not optional — the semantics of converting bytes to codepoints needs to be defined! I believe UTF-8 is the most reasonable choice here, and I believe it's much better to use UTF-8 explicitly than whatever L8.unpack does implicitly. UTF-8 firmly won the encoding war, both on the web and on macOS and Linux, where it is the default encoding (and is used by many many programs regardless of locale and encoding settings).

This Show ExitCodeException instance is optimistic — it does not need to always be correct (there will always be niche programs which use different encodings and need specialized logic) but should instead provide the best results for the most cases possible. UTF-8 is (in my opinion) the obvious correct choice here.

To (hopefully) give some weight to my opinion here, I'm a co-author for the L2/21-235 proposal which added the Symbols for Legacy Computing Unicode block.

@9999years
Contributor Author

9999years commented Sep 7, 2024

And in fact L8.unpack does do something obviously wrong and mangles any codepoint higher than U+007F, leading to mojibake errors equivalent to decodeLatin1 . encodeUtf8:

ghci> import Data.ByteString.Lazy.Char8 (unpack)
ghci> import Data.Text.Lazy.Encoding (encodeUtf8)
ghci> import Data.Text.Lazy (pack)
ghci> write text = putStrLn $ unpack $ encodeUtf8 $ pack text
ghci> write "café"
cafÃ©
ghci> write "hello 🥺"
hello �

This is a bug, this is almost always the wrong behavior on any computer newer than the 1990s, and it's easy to fix.
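
For comparison, here is a small sketch of the two conversions side by side. It assumes the `bytestring` and `text` packages; `unpackBytes`, `unpackUtf8` and `sample` are names invented for this example. Using `lenientDecode` is one possible policy for invalid input (it substitutes U+FFFD), not necessarily what the PR does.

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (lenientDecode)
import Data.Text.Lazy.Encoding (decodeUtf8With, encodeUtf8)

-- | Byte-per-character conversion: every byte becomes the code point with
-- the same number, which matches Latin-1 rather than UTF-8.
unpackBytes :: L.ByteString -> String
unpackBytes = L8.unpack

-- | Decode as UTF-8, substituting U+FFFD for invalid byte sequences
-- instead of throwing.
unpackUtf8 :: L.ByteString -> String
unpackUtf8 = TL.unpack . decodeUtf8With lenientDecode

-- | The UTF-8 bytes of "café" ('\233' is é).
sample :: L.ByteString
sample = encodeUtf8 (TL.pack "caf\233")
```

Here `unpackUtf8 sample` gives back the original four characters, while `unpackBytes sample` yields five, because the two bytes of the encoded é are treated as separate characters.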

Contributor

@sol sol left a comment

No comments on the code itself, but some ideas on how to make the tests easier on the eye.

Co-authored-by: Simon Hengel <[email protected]>
@9999years 9999years force-pushed the fix-exitcodeexception-show branch from 56fcb70 to c05ee1f Compare September 9, 2024 16:40
@tomjaguarpaw
Collaborator

tomjaguarpaw commented Sep 12, 2024

[Sorry, pressed Enter too early. Comment to follow.]

@tomjaguarpaw
Collaborator

in fact L8.unpack does do something obviously wrong

This is a fair point. I take the point that if we're going to choose anything then UTF-8 is the most inclusive choice. The current version doesn't strip terminal control codes, for example!

My personal preference would be to make the choice explicit by not displaying stdout and stderr in the Show instance at all, and only showing them through specific functions showErrorCodeExceptionUtf8 / showErrorCodeExceptionUtf16 etc.

I went back to look at where the choice of L8.unpack was made, and it was by @snoyberg eight years ago: 84dac77

Since he is the primary maintainer of the repository I'll leave the final call to him. Thank you for your patience so far @9999years.

@9999years
Contributor Author

Any updates here?

@snoyberg
Member

snoyberg commented Apr 3, 2025

@tomjaguarpaw I don't have any strong opinions here, I'm totally comfortable with you making whichever decision you feel best about.

@tomjaguarpaw
Collaborator

I'm finding this all quite confusing, because the library supports a variety of sorts of behaviour, and I don't see any principles on which to judge current behaviour "right" (should be preserved) or "wrong" (needs changing).

I agree that when you use readProcess_ to run a ProcessConfig (for example, created by proc) it seems natural to include stdout and stderr in the ExitCodeException because otherwise you don't get to see them. That seems to be exactly what readProcess_ is designed for.

ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"
Standard output:

this is stdout
Standard error:

this is stderr

However, if you run the ProcessConfig using runProcess_ then stdout/stderr are not attached to the ExitCodeException. I guess the reason is that they are already attached to the Haskell process's stdout/stderr.

ghci> runProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
this is stdout
this is stderr
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"

Furthermore, if you use readProcess or runProcess you don't get an exception at all and you can check the exit code and format the outputs however you like. This would be my suggested workaround.
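
A rough sketch of that workaround, kept pure so it does not depend on how the process is run. `describeFailure` is a hypothetical helper; the message layout and the lenient UTF-8 decoding are choices made for this example only.

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (lenientDecode)
import Data.Text.Lazy.Encoding (decodeUtf8With)
import System.Exit (ExitCode (..))

-- | Hypothetical helper: turn an exit code plus captured output into an
-- error message, or Nothing when the process succeeded. Decoding as lenient
-- UTF-8 is this sketch's choice; a caller could pick another encoding.
describeFailure :: String -> (ExitCode, L.ByteString, L.ByteString) -> Maybe String
describeFailure _ (ExitSuccess, _, _) = Nothing
describeFailure cmd (ExitFailure n, out, err) =
  Just $ unlines
    [ "Command failed with exit code " ++ show n ++ ": " ++ cmd
    , "stdout: " ++ decode out
    , "stderr: " ++ decode err
    ]
  where
    decode = TL.unpack . decodeUtf8With lenientDecode
```

As far as I know, `readProcess` returns exactly such an `(ExitCode, ByteString, ByteString)` triple, so its result could be passed straight to a function like this.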


So I guess if one doesn't like the formatting of the exception in readProcess_ one can just use one of the other functions. Its behaviour of adding the stdout and stderr seems strange to me. I don't plan to change it but I also am not yet convinced to change the display of ExitCodeException to try to improve it. I think the improvement should come from running the process another way.

@9999years, please let me know what you think.

@9999years
Contributor Author

@tomjaguarpaw I'm not sure what you're asking. Is there an ambiguity in my previous comments?

There's a couple small changes here:

  • Changing the newlines in the Show instance.
  • Making the whitespace trimming more consistent in the Show instance.
  • Decoding the output as UTF-8 in the Show instance.

I believe I've explained why I believe all three of these changes are good defaults that improve the status quo of the library. In all cases, escape hatches are available (as you've noted) for users that need more specific behavior, e.g. for legacy computing applications that require non-standard character encodings.

This is ultimately a very small change to improve the ergonomics for the majority of users.

@tomjaguarpaw
Collaborator

I'm not trying to suggest your previous comments are ambiguous. I'm asking what you think of my analysis in my most recent comment.

I personally don't like the existing design of including stderr and stdout in an exception. I think it's doomed to failure since we don't know the encoding of stderr and stdout (nor whether they are text at all), yet the show method of Show and the displayException method of Exception must make an assumption about encoding. (I'm wary about the consequences of assuming UTF-8 on Windows.)

I also don't particularly see the point of including the ProcessConfig in the ExitCodeException, because we just had the ProcessConfig when we ran the process! If we want to show it, we can show it ourselves. We don't need it to be shown indirectly through the ExitCodeException.

This is all to say, I don't understand the design of readProcess_, and therefore I'm reluctant to change it. If you think its current behaviour is not good and you have some better behaviour to propose then I suggest we discuss adding it under a function with a new name. Then we can "soft deprecate" readProcess_.


Regarding the change to the printing of environment of ProcessConfig, it seems that it no longer distinguishes between an inherited environment and an environment set to be empty. Is that right?
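
To make the distinction concrete, here is one way it could be preserved. This sketch assumes the environment override is modelled as a `Maybe [(String, String)]`, with `Nothing` meaning "inherited". `renderEnv` and the "(empty)" marker are invented for illustration and are not taken from the PR.

```haskell
-- | Hypothetical renderer for an environment override.
--   Nothing   : the child inherits the parent's environment, print nothing
--   Just []   : the environment was explicitly emptied
--   Just vars : the environment was replaced by vars
renderEnv :: Maybe [(String, String)] -> [String]
renderEnv Nothing = []
renderEnv (Just []) = ["Modified environment: (empty)"]
renderEnv (Just vars) =
  "Modified environment:" : [k ++ "=" ++ v | (k, v) <- vars]
```

This avoids the dangling header with nothing under it while still showing that the environment was explicitly cleared.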

@9999years
Contributor Author

I think it's doomed to failure since we don't know the encoding of stderr and stdout (nor whether they are text at all)

This misses the point, in my opinion. As I've said before, ExitCodeException should do the right thing as often as possible. It doesn't need to be perfect. If it's not the correct behavior for a particular use-case, users can implement their own exceptions, display functions, etc.

In reality, writing to stdout/stderr with encodings other than UTF-8 is vanishingly rare.

I'm wary about the consequences of assuming UTF-8 on Windows

This is fair, and if you'd like to default to WTF-16 on Windows I don't have any objections. However, even on Windows, UTF-8 is more and more common these days.

I also don't particularly see the point of including the ProcessConfig in the ExitCodeException, because we just had the ProcessConfig when we ran the process! If we want to show it, we can show it ourselves. We don't need it to be shown indirectly through the ExitCodeException.

Absolutely not! Remember, this patch is about useful defaults. The point is for the error message to be helpful. You do not generally want Haskell programs to fail with "Exit code 1" without any additional information about which process exited with code 1 (or what that process wrote to stdout/stderr to explain why it failed). Users calling these functions should not have to manually attach information about the ProcessConfig every time they run a process.

This is all to say, I don't understand the design of readProcess_, and therefore I'm reluctant to change it.

Frankly, I'm not really sure what there is to understand: it runs a process and throws an exception if the process fails. The exception should include as much information as possible to assist users with debugging failures.

Regarding the change to the printing of environment of ProcessConfig, it seems that it no longer distinguishes between an inherited environment and an environment set to be empty. Is that right?

That's correct. (I'd like to improve the situation with a fix to #82.) Happy to change it if you'd like, but I really just want to get this merged. If you have any concrete objections, please let me know, but given that we've been running in circles for nearly a year I think it's time to merge this PR.

@lf-

lf- commented Apr 7, 2025

Absolutely not! Remember, this patch is about useful defaults. The point is for the error message to be helpful. You do not generally want Haskell programs to fail with "Exit code 1" without any additional information about which process exited with code 1 (or what that process wrote to stdout/stderr to explain why it failed). Users calling these functions should not have to manually attach information about the ProcessConfig every time they run a process.

As an example, I regularly get annoyed at Haskell programs like cabal2nix which have exactly this bug of not showing the command that failed, usually in the middle of a CI pipeline or somewhere else that makes sure I have no idea what command failed. Having the low-effort thing have the right behaviour is really important. It shouldn't require strace to find out why a Haskell program is failing to run a command.

@tomjaguarpaw
Collaborator

@9999years, I appreciate that from your point of view this PR may look like an obvious slam dunk to improve the library and help users. However, I want to ensure I understand

  1. the use case of the original feature
  2. the problem with the original feature
  3. how the PR solves the problem whilst not compromising the use case

I apologise if these all seem terribly obvious. Regrettably, they are not obvious to me, and I request your continued patience.

Exhortations to "just get this merged" because "it's time to merge this PR" are likely to have the opposite of the effect you desire. To help set expectations, I am not yet in the "just get this merged" stage of the PR review process, I'm in the "what is the current design even for, and what alternative approach may exist" stage. The purpose, from my point of view, of this back and forth is to make progress on those above three goals. I'm sorry if the back and forth seems not to directly serve your purpose.


ExitCodeException should do the right thing as often as possible. It doesn't need to be perfect

I sort of agree. I think the library "should do the right thing as often as possible" and "need not be perfect". However, I don't see that applies to ExitCodeException, which is just a means to an end. It doesn't even need to exist, per se. It only exists to support particular behaviours. If we can find more reliable ways of supporting the same behaviours we should implement those ways instead. That's what I'd like to explore, that is, maybe there's a better design to be found.

Having the low-effort thing have the right behaviour is really important

I agree, but I don't see why readProcess_ ought to be that low-effort thing. It seems plausible that there's a design that achieves our goals in a better way. I'd like to see if we can find that design. Maybe we can, maybe we can't and adjusting readProcess_ is indeed the best way to proceed. Let's see.


There are three components to this PR:

  1. Decoding assuming UTF-8
  2. No longer displaying the environment if it is set to be empty
  3. Trimming whitespace

Item 1 seems important! Output is garbled for anyone using non-ASCII. Item 2 seems like a step backwards to me. I'm happy to discuss it further, but it seems like it warrants its own PR or issue. Item 3 seems fine, but I think it should be discussed separately, because otherwise it's just going to get lost amongst the weeds of 1 and 2.


Now, the continued discussion has been very helpful to me because I can now understand the purpose of readProcess_. It is for writing "happy path" code like this:

do
    (out1, err1) <- readProcess_ ...
    (out2, err2) <- readProcess_ ...
    (out3, err3) <- readProcess_ ...

whilst exiting the Haskell process with an informative error (including the process configuration, stdout and stderr) on subprocess exit failure. It also supports other behaviour: the ExitCodeException can be caught elsewhere, inspected and manipulated. My guess is that this is not the purpose of the original design, but perhaps it is, or even if not, it's a use case we want to continue to support.

The problem with the design is that it's impossible to store stdout and stderr as ByteStrings in the ExitCodeException, without knowing the encoding, and then write them to the terminal later correctly in all cases. That's because the show and displayException methods must convert them, purely, to Strings.

(Ultimately this is a consequence of Haskell's default exception ecosystem, which is very poor in my opinion, and why I recommend wrapped-IO effect systems such as Bluefin or effectful instead. If we had been using one of those this problem would not have arisen! ExitCodeException could have been an opaque type.)

So if we insist on continuing to use an exception in this way we are forced to choose encoding up front.

  • Current behaviour is a sort of "implicit choice" of encoding through the ByteString unpack function. That's a weird encoding that no one uses and that is terribly unhelpful!
  • We could choose ASCII, and convert non-ASCII to hex encoding or something equally unreadable. That's very sad for anyone using anything beyond ASCII.
  • We could choose UTF-8 (which also supports all ASCII cases) and is going to be the "correct" choice most of the time in practice.
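
To illustrate the ASCII option, a sketch of what hex escaping might look like. `escapeNonAscii` is a made-up name and the `\xNN` notation is just one possible convention. It can never fail on any input, but anything beyond ASCII becomes unreadable.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Char (chr)
import Numeric (showHex)

-- | Keep printable ASCII, tab and newline as they are; show every other
-- byte as \xNN with two lowercase hex digits.
escapeNonAscii :: L.ByteString -> String
escapeNonAscii = concatMap escape . L.unpack
  where
    escape w
      | w == 0x0A || w == 0x09 || (w >= 0x20 && w < 0x7F) = [chr (fromIntegral w)]
      | otherwise = "\\x" ++ pad (showHex w "")
    pad s = replicate (2 - length s) '0' ++ s
```

Applied to the UTF-8 bytes of "café", this prints `caf\xc3\xa9`.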

There is an option which allows us to use an exception yet at least have a hope of determining the correct encoding: determine the encoding at the time of throwing the exception! We could inspect the terminal settings and store the encoding in the ExitCodeException. This may work.
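
A sketch of the capture step only, under the assumption that the encoding of the Haskell process's own stderr handle (or, failing that, the locale encoding) is a reasonable proxy. `CapturedOutput` and `captureOutput` are hypothetical names. Note that this records the parent's settings; the child process may still have written bytes in some other encoding.

```haskell
import qualified Data.ByteString.Lazy as L
import GHC.IO.Encoding (getLocaleEncoding)
import System.IO (hGetEncoding, stderr)

-- | Hypothetical record: captured output plus the name of the encoding
-- that was in effect when the failure was recorded.
data CapturedOutput = CapturedOutput
  { capturedEncoding :: String
  , capturedStdout :: L.ByteString
  , capturedStderr :: L.ByteString
  } deriving (Eq, Show)

-- | Prefer the encoding set on this process's stderr handle; fall back to
-- the locale encoding when the handle is in binary mode.
captureOutput :: L.ByteString -> L.ByteString -> IO CapturedOutput
captureOutput out err = do
  handleEnc <- hGetEncoding stderr
  enc <- maybe getLocaleEncoding pure handleEnc
  pure (CapturedOutput (show enc) out err)
```

Turning the stored bytes back into a `String` with the recorded encoding is deliberately left out; that conversion is the hard part and this sketch does not attempt it.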

If we don't insist on using an exception at all we can take a different approach, for example, we could define readProcess2_ (if we do this let's choose a better name!) with the same type as readProcess_, but on failure, instead of attaching stdout and stderr to the ExitCodeException, it prints them to the terminal right then and there. Then there is no confusion about the choice of encoding! We get the correct encoding automatically. Then we throw the ExitCodeException with the stdout and stderr fields empty. readProcess2_ would have a downside compared to readProcess_: you can't suppress the printing of stdout and stderr by catching the ExitCodeException. But maybe that's fine. I'd need to hear more from people who are actually using this in practice to be able to decide.
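
A sketch of the print-then-throw idea, written against an already-collected result so that it only needs `base` and `bytestring`. `ProcessFailed`, `reportThenThrow` and `catchFailure` are stand-ins invented here; the real ExitCodeException has more fields.

```haskell
import Control.Exception (Exception, throwIO, try)
import qualified Data.ByteString.Lazy as L
import System.Exit (ExitCode (..))
import System.IO (stderr, stdout)

-- | Stand-in exception carrying only the exit code (hypothetical; the
-- library's ExitCodeException has more fields).
newtype ProcessFailed = ProcessFailed Int
  deriving (Eq, Show)

instance Exception ProcessFailed

-- | On failure, forward the captured bytes unchanged to this process's own
-- stdout and stderr, then throw an exception that carries no output.
reportThenThrow :: (ExitCode, L.ByteString, L.ByteString) -> IO (L.ByteString, L.ByteString)
reportThenThrow (ExitSuccess, out, err) = pure (out, err)
reportThenThrow (ExitFailure n, out, err) = do
  L.hPut stdout out
  L.hPut stderr err
  throwIO (ProcessFailed n)

-- | Catching the exception still works, but by then the output has already
-- been written; that is the downside described above.
catchFailure :: IO a -> IO (Either ProcessFailed a)
catchFailure = try
```

Because the bytes are written as-is, no encoding decision is made in Haskell at all; the terminal (or whatever is attached) interprets them.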

So, the choice before us seems to be between doing either, both, or neither of changing readProcess_ and adding readProcess2_ (with a better name!). If we change readProcess_ we can:

  • Change the up front encoding assumption when show/displayExceptioning an ExitCodeException (probably to UTF-8, because although that can technically break, and an ASCII+hex can't, it's most likely to actually be what's wanted)
  • Determine the encoding at exception throw time, and use it when converting the exception to a String.

I would welcome further thoughts on this analysis (I am not attempting to suggest that anything anyone has written to this point is ambiguous in any way).

@lf-

lf- commented Apr 8, 2025

We could inspect the terminal settings and store the encoding in the ExitCodeException

I don't think that's even necessary, I think you could look at LC_CTYPE/LANG/etc of the run process (and thus during exception creation). But that's not any guarantee that's what actually is output into the terminal. However, I have no idea what the encoding conversion situation is like in Haskell, and I especially have no idea if people using these alternate encodings are actually setting the libc encoding settings (I expect often not).

However, this is beside the point. Improving the display to not corrupt it by default in the majority of cases is a strict improvement over the status quo which corrupts it much more often; the display here is for user facing purposes, and it is a rare state of affairs that people are not using UTF-8 in practice. Making this change does not make any new promises that might be broken later when further incremental improvements are done, such as supporting non-UTF-8 encodings. I am pretty sure that e.g. Python and Rust don't bother doing any non-UTF-8 encodings for their equivalent features.

The more likely case of it getting garbled is just processes outputting non-textual output. That just is what it is. Not much to be done about that, besides using a unicode replacement character on invalid characters so it doesn't break anything too badly.

we could define readProcess2_ (if we do this let's choose a better name!) with the same type as readProcess_, but on failure, instead of attaching stdout and stderr to the ExitCodeException, it prints them to the terminal right then and there

This is really not a usable solution if anyone uses it from a web app or a library used by one, since it detaches the terminal output from the exception context. Or, basically anything else that sends exceptions as telemetry off the machine. Having a library unexpectedly write to stderr is highly surprising behaviour and I don't think it's a good option.


If someone cares a lot about non-UTF-8 encodings, they can implement a custom exception handler or similar today, so they are not blocked from accomplishing what they want by a new default exception rendering.


No longer displaying the environment if it is set to be empty

This is valid to object to, but I must also point out that e.g. my work project has 20kb of process environment, and in general the process environment is full of secrets a lot of the time in a lot of projects so can be problematic to log. Perhaps it should be split into a separate change.

@tomjaguarpaw
Collaborator

However, this is beside the point. Improving the display to not corrupt it by default in the majority of cases is a strict improvement over the status quo

I agree, that's why I wrote "probably to UTF-8, because although that can technically break, and an ASCII+hex can't, it's most likely to actually be what's wanted".

readProcess2_ ... instead of attaching stdout and stderr to the ExitCodeException, it prints them to the terminal right then and there

This is really not a usable solution ... since it detaches the terminal output from the exception context

Right, this falls under my caveat:

the ExitCodeException can be caught elsewhere, inspected and manipulated. My guess is that this is not the purpose of the original design, but perhaps it is, or even if not, it's a use case we want to continue to support

I don't particularly like that use case. If telemetry logs exceptions, why not also log stderr? (And why take the risk of unhandled exceptions in any case? My view is that exception handling should be done in a well-scoped way with a wrapped-IO effect system such as Bluefin or effectful, but sadly that's getting into dreamland ...). But, if people are using readProcess_ like that then so be it.

If someone cares a lot about non-UTF-8 encodings, they can implement a custom exception handler or similar today, so they are not blocked from accomplishing what they want by a new default exception rendering.

I agree, but what we're debating here is the default. I don't particularly like defaults that work in most cases. I prefer defaults that work in all cases. That's why I use Haskell in the first place. However, I can sense practicality is going to beat out purity in this particular case.


So I am strongly leaning to assuming UTF-8 as a solution to #86, and leaving the newline and environment stuff for later discussion.


No longer displaying the environment if it is set to be empty

This is valid to object to, but I must also point out that e.g. my work project has 20kb of process environment, and in general the process environment is full of secrets a lot of the time in a lot of projects so can be problematic to log. Perhaps it should be split into a separate change.

This PR doesn't address that! Both before and after this PR, if you've used setEnv to put secrets in the environment, readProcess_ will dump them on process failure.

@lf-

lf- commented Apr 8, 2025

I don't particularly like that use case. If telemetry logs exceptions, why not also log stderr? (And why take the risk of unhandled exceptions in any case? My view is that exception handling should be done in a well-scoped way with a wrapped-IO effect system such as Bluefin or effectful, but sadly that's getting into dreamland ...). But, if people are using readProcess_ like that then so be it.

The use case is that someone does this in a library and throws an exception at us, and maybe we miss it and have to debug it. The stderr of basically every web app is going into a logger, yes, but it is likely a completely different logger (system logs) than the exceptions, which might, e.g. go into Bugsnag, Honeycomb, etc. If random stuff is printed into stderr, it immediately loses the context of the web request and any other context, and will likely be interleaved with random other logs, so it is going to be nearly impossible to find.

I've had to debug glibc assertion failures in prod before, which print into stderr, and finding the stderr message in the haystack of other stuff on the box is really hard. Printing to stderr really does not work except in tiny systems.

This PR doesn't address that! Both before and after this PR, if you've used setEnv to put secrets in the environment, readProcess_ will dump them on process failure.

The secrets are coming from outside the process, and setEnv is not very frequently used, so not printing an unchanged env results in not logging random secrets or random 20kb of /nix/store paths. In the most common case, neither secrets nor 20 kilobytes of uninteresting stuff are printed.

@tomjaguarpaw
Collaborator

The use case is that someone does this in a library and throws an exception at us, and maybe we miss it and have to debug it

When I said "exception handling should be done in a well-scoped way" I really meant it, it "should" be done in the libraries you use! But, I acknowledge that's unrealistic and people have exception handlers in place to deal with unhandled exceptions from libraries.

I'll use the opportunity to soapbox a bit. We're in a terrible local optimum. In this MR it is proposed to improve a function that throws untracked exceptions for convenience. In fact it's convenient because people already have exception handlers in place to catch and log untracked exceptions, so if the function goes wrong the error message already goes to the right place. And the reason people have those handlers in place is because they're useful when calling code that throws untracked exceptions. So we have a horrible vicious cycle: untracked exceptions proliferate because there's already infrastructure to deal with them, and the infrastructure to deal with them proliferates because there are so many untracked exceptions.

Anyway, that's other people's problem! I don't have that problem because I use well-scoped exceptions. But I do want to come up with better designs when possible. In the case of this MR, no better design immediately presents itself.


This PR doesn't address that! Both before and after this PR, if you've used setEnv to put secrets in the environment, readProcess_ will dump them on process failure.

Actually, this is wrong. readProcess_ (or rather showing the ExitCodeException) explicitly suppresses the environment. It's only showing the ProcessConfig that incorporates the environment.

The secrets are coming from outside the process, and setEnv is not very frequently used, so not printing an unchanged env results in not logging random secrets or random 20kb of /nix/store paths. In the most common case, neither secrets nor 20 kilobytes of uninteresting stuff are printed.

But this PR doesn't change behaviour on inherit either! Both before and after this PR, if you inherit your environment you don't see it in the ProcessConfig. For example:

ghci> print (shell "noexist") -- Inheriting env
Shell command: noexist

The only difference is when setEnv of an empty environment is used. Before this MR:

ghci> print (setEnv [] $ shell "noexist")
Shell command: noexist
Modified environment:

After this MR:

ghci> print (setEnv [] $ shell "noexist")
Shell command: noexist

Personally I prefer the explicit approach of the status quo. In any case, it will have to be done as a separate change.
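
To make the distinction concrete, here is a minimal sketch of the two renderings. It does not use the real `ProcessConfig`: it models the environment as a `Maybe` (an assumption for illustration), and `EnvOverride`, `renderEnvExplicit`, and `renderEnvTerse` are hypothetical names.

```haskell
-- | Hypothetical stand-in for the environment field of a process config:
-- 'Nothing' means "inherit the parent's environment", 'Just' means an
-- explicit override (possibly empty).
type EnvOverride = Maybe [(String, String)]

-- | Status-quo style: any explicit override gets a header, even an
-- empty one, so "inherit" and "empty" stay distinguishable.
renderEnvExplicit :: EnvOverride -> String
renderEnvExplicit Nothing = ""
renderEnvExplicit (Just kvs) =
  "Modified environment:\n" ++ concatMap (\(k, v) -> k ++ "=" ++ v ++ "\n") kvs

-- | Style proposed in this MR: an empty override renders the same as an
-- inherited environment.
renderEnvTerse :: EnvOverride -> String
renderEnvTerse (Just kvs@(_ : _)) = renderEnvExplicit (Just kvs)
renderEnvTerse _ = ""
```

With the terse variant, `renderEnvTerse Nothing` and `renderEnvTerse (Just [])` both produce the empty string, so the output no longer says whether the child process ran with the parent's environment or with none at all.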

@9999years 9999years closed this Apr 8, 2025
@9999years 9999years deleted the fix-exitcodeexception-show branch April 8, 2025 19:22
9999years added a commit to 9999years/typed-process that referenced this pull request Apr 8, 2025
Split off of fpco#83.

Before, `ProcessConfig`'s `Show` output would include a trailing
newline. This has been fixed, so that derived `Show` output does not
include newlines in weird places.

Before:

    ghci> data Foo = Foo { a :: Int, b :: ProcessConfig () () (), c :: String } deriving Show
    ghci> Foo 1 (proc "echo" ["puppy"]) "doggy"
    Foo {a = 1, b = Raw command: echo puppy
    , c = "doggy"}

After:

    ghci> Foo 1 (proc "echo" ["puppy"]) "doggy"
    Foo {a = 1, b = Raw command: echo puppy, c = "doggy"}

Whitespace for the `ExitCodeException` `Show` instance has also been
adjusted, to place the output closer to the relevant headers.

Before:

    ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
    *** Exception: Received ExitFailure 1 when running
    Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"
    Standard output:

    this is stdout
    Standard error:

    this is stderr

After:

    *** Exception: Received ExitFailure 1 when running
    Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"

    Standard output:
    this is stdout

    Standard error:
    this is stderr

Note that because trailing whitespace is not accounted for, it is still
possible to get unintuitive results depending on what exactly the
subprocess prints:

    ghci> readProcess_ $ proc "sh" ["-c", "echo -n this is stdout; echo -n this is stderr >&2; false"]
    *** Exception: Received ExitFailure 1 when running
    Raw command: sh -c "echo -n this is stdout; echo -n this is stderr >&2; false"

    Standard output:
    this is stdout
    Standard error:
    this is stderr
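
One way the remaining trailing-whitespace case could be handled is sketched below. This is not what the commit does: it works on `String` rather than the lazy `ByteString` that `ExitCodeException` carries, and `normalizeTrailing`, `renderSection`, and `renderOutputs` are hypothetical names used only for illustration.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, intercalate)

-- | Hypothetical helper: drop trailing whitespace, then end the text
-- with exactly one newline. Empty (or all-whitespace) output stays empty.
normalizeTrailing :: String -> String
normalizeTrailing s =
  case dropWhileEnd isSpace s of
    "" -> ""
    t -> t ++ "\n"

-- | Render one captured stream directly under its header.
renderSection :: String -> String -> String
renderSection header body = header ++ ":\n" ++ normalizeTrailing body

-- | Render both streams, separated by a single blank line whether or
-- not the subprocess ended its output with a newline.
renderOutputs :: String -> String -> String
renderOutputs out err =
  intercalate "\n"
    [ renderSection "Standard output" out
    , renderSection "Standard error" err
    ]
```

Under these assumptions, `echo` and `echo -n` output render identically. The cost is that the exception message no longer reflects the exact bytes the subprocess wrote.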
@9999years
Contributor Author

Between my submission of this PR last August and three days ago, you made no objections to the design of these features as they currently exist, so I'm not sure why you seem to want to block the PR over these concerns now. If you felt so uncomfortable with the design of runProcess_ and ExitCodeException, I would think your first comments would indicate that, rather than diving into the minutiae of correctly declaring a text dependency!

I will be splitting up this PR in the hope that you will give me more actionable feedback.
