Skip to content

using stdio can hit unreachable on systems that are mostly POSIX #2428

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tgschultz opened this issue May 4, 2019 · 27 comments
Closed

using stdio can hit unreachable on systems that are mostly POSIX #2428

tgschultz opened this issue May 4, 2019 · 27 comments
Labels
standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@tgschultz
Copy link
Contributor

tgschultz commented May 4, 2019

As submitted by cameris in IRC:

$ sh -c 'exec 3<&1 1>&-; ls /proc/$$/fd >&3; ./main'
0  2  3
reached unreachable code
/usr/lib/zig/std/os.zig:392:28: 0x21c09d in ??? (main)
            posix.EBADF => unreachable, // always a race condition
                           ^
/usr/lib/zig/std/os/file.zig:426:30: 0x205303 in ??? (main)
            try os.posixWrite(self.handle, bytes);
                             ^
/home/cameris/rust/hello_world/zig/main.zig:8:26: 0x2259fb in ??? (main)
    try stdout_file.write("Hello, world!\n");
                         ^
/usr/lib/zig/std/special/bootstrap.zig:122:22: 0x225326 in ??? (main)
            root.main() catch |err| {
                     ^
/usr/lib/zig/std/special/bootstrap.zig:43:5: 0x225091 in ??? (main)
    @noInlineCall(posixCallMainAndExit);
    ^
sh: line 1: 12563 Aborted                 (core dumped) ./main
@tgschultz tgschultz changed the title stdin/out/err should not be assumed to be open on POSIX stdin/out/err can hit unreachable on POSIX May 4, 2019
@tgschultz tgschultz changed the title stdin/out/err can hit unreachable on POSIX writing to stdin/out/err can hit unreachable on POSIX May 4, 2019
@daurnimator
Copy link
Contributor

This is not a bug... POSIX does say that stdin/stdout/stderr are present in new processes. See https://pubs.opengroup.org/onlinepubs/9699919799/functions/stdin.html

At program start-up, three streams shall be predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). When opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

[CX] [Option Start] The following symbolic values in <unistd.h> define the file descriptors that shall be associated with the C-language stdin, stdout, and stderr when the application is started:

STDIN_FILENO
Standard input value, stdin. Its value is 0.
STDOUT_FILENO
Standard output value, stdout. Its value is 1.
STDERR_FILENO
Standard error value, stderr. Its value is 2.

The stderr stream is expected to be open for reading and writing.

What you're seeing is a place where linux (and other OSes) allow you to break POSIX assurances.

@tgschultz
Copy link
Contributor Author

I contend that it is still a bug because both getStdOut and write can throw errors, and that's the appropriate thing to do instead of panicking. Clearly this unreachable code is not, in fact, unreachable.

@daurnimator
Copy link
Contributor

Ah, I was replying to the original title:

stdin/out/err should not be assumed to be open on POSIX

@andrewrk andrewrk added this to the 0.6.0 milestone May 5, 2019
@andrewrk andrewrk added the standard library This issue involves writing Zig code for the standard library. label May 5, 2019
@andrewrk
Copy link
Member

andrewrk commented May 5, 2019

I don't think it's obvious how (or whether) to solve this.

I agree with @tgschultz that since getStdIn/getStdOut/getStdErr can return an error, if the functions were able to detect that stdin/stdout/stderr were unavailable, they should return an error.

However, EBADF is always a race condition. For example, if the application had called open("foo.txt") a few times and received file descriptors, these fds would be the numbers 0, 1, 2, and getStdIn/getStdOut/getStdErr would return these even though they are not stdin/stdout/stderr.

I also want to note that the error in the original post comes from the write syscall - not during getStdOut.

One possibility is to put code before main() that does syscalls to test stdin/stdout/stderr file descriptors, but that's not desirable for obvious reasons.

@cameris
Copy link

cameris commented May 5, 2019

My question on IRC was, why it does not fail on getStdOut() even so the comment suggests otherwise.
From the POSIX description I'm not sure who even is responsible for opening these. Could even be interpreted as it is up to the language runtime to ensure these are opened.

@andrewrk
Copy link
Member

andrewrk commented May 5, 2019

why it does not fail on getStdOut() even so the comment suggests otherwise

The current reason for the possibility of failure is Windows: https://docs.microsoft.com/en-us/windows/console/getstdhandle

@tgschultz
Copy link
Contributor Author

So getStdOut and co. are actually just to get the handle, which should be 0,1, or 2 on posix systems, therefore it can only fail on Windows currently. getStdOut could make sure the fd is open first, but that still wouldn't totally fix this because the fd could be closed after the program starts too. so I think appropriate place to error is in write, since you're trying to write to a bad file descriptor. I don't really see how EBADF being a race condition makes it acceptable to panic instead.

Also maybe the comment should be changed so it doesn't give the wrong impression?

@daurnimator
Copy link
Contributor

I think the correct fix is to simply change EBADF to a returnable error instead of a panic.

stdin is fd 0. If some other file is opened and gets fd 0, whether via dup2 or because it's the next free file descriptor, then it is stdin.

@shawnl
Copy link
Contributor

shawnl commented May 26, 2019

One possibility is to put code before main() that does syscalls to test stdin/stdout/stderr file descriptors, but that's not desirable for obvious reasons.

glibc does this. It is ugly however.

However, EBADF is always a race condition.

Not with dup2/dup3, in which you can provide the new number (and it will close the existing fd to provide it, so there is no race condition)

@pixelherodev
Copy link
Contributor

Proposal: change the title of this issue to using stdio can hit unreachable on systems that are mostly POSIX.

Attempting to write to stdin makes no sense anyways.

The real issue isn't that Zig is hitting unreachable under POSIX, it's that the POSIX backend - which would never hit that error on a fully POSIX system - can and indeed should run on systems that don't fully meet POSIX, though what to do there isn't so clearcut as "return an error."

@andrewrk andrewrk changed the title writing to stdin/out/err can hit unreachable on POSIX using stdio can hit unreachable on systems that are mostly POSIX Dec 20, 2019
@pixelherodev
Copy link
Contributor

pixelherodev commented Dec 20, 2019

On a more practical note: I'd contend that the current behavior is correct. The only situation under which this error can occur is if you deliberately make the platform not POSIX-compatible. I'm personally not okay with adding OS-specific changes to the POSIX backend, and adding a Linux-specific backend would be a tremendous waste of time and effort.

As is, the produced executable is a POSIX application just as much as a Linux one (referring just to the standard library here). The fact that you then twist the running environment to not be POSIX isn't a problem, but to then expect a POSIX executable to run makes no sense.

If a Windows executable refuses to run under Windows because it contains code that is only valid under WINE, the onus to fix the issue would not be on Zig (though I'm unsure if that's actually ever possible).

@pixelherodev
Copy link
Contributor

I was going to propose a solution, but then a thought occured: if you close a stdio stream, will a future call to open ever return it? More generally, under POSIX, is it possible for something along those lines (stdio is closed and [0..2] no longer refer to stdio) to ever occur? If so, then returning an error here is perfectly valid.

If it's possible under Linux, but not POSIX, then it's a bit trickier. Checking if they're open before calling main and returning that boolean via e.g. std.os.linux.hasStdIn() and panicking in getStdIn() is called might be a good solution, but it's a lot less certain.

@andrewrk andrewrk modified the milestones: 0.6.0, 0.7.0 Dec 20, 2019
@pixelherodev
Copy link
Contributor

More reading: https://stackoverflow.com/a/25516438

@shawnl
Copy link
Contributor

shawnl commented Dec 20, 2019

We have to live with the kernel binaries that exist, not some human-language standards.

@pixelherodev
Copy link
Contributor

Yes, but this isn't a Linux backend, it's a POSIX one. If this is unreachable under, say, OS X (AFAICT it's not) or WASI (no clue), then it makes sense to adhere to the standard and require OS-specific support to prevent you from getting there.

However, there's a more pressing reason that I think the current situation can't stand (though I'm not convinced returning an error is the correct solution):

Since it's possible to close stdio streams, a very important question is raised: if stdio is closed, should we care?

More generally, if someone either redirects it (see OP), or closes it later, and then tries writing to it, what should happen? There are definitely valid arguments to be made for returning an error, panicking, etc.

But here's a "fun" little example that I think pushes me to the side of "return an error."

// Externally defined function which closes stdout and stderr
extern fn close_stdio_streams();

pub fn main() void {
    close_stdio_streams();
    import("std").debug.warn("Hello!");
}

@pixelherodev
Copy link
Contributor

We have to live with the kernel binaries that exist, not some human-language standards.

More importantly, if you deliberately create an environment in which applications won't function correctly, and then they don't, that's on you, not the OS, and not the application.

The real crux of this issue is that it will only ever happen if you want it to. If you close stdio and then try using it, that's your choice, but I'd argue that that's inherently library-level UB, no different from e.g. addition overflow or pointer comparison in C.

@andrewrk
Copy link
Member

I think it's correct for Zig to treat the hard-coded file descriptor values 0, 1, and 2 as stdio. I'm pretty sure this issue is going to end up being closed with working-as-designed.

@pixelherodev
Copy link
Contributor

In that case, does it make sense to have os.close fail/panic if given [0..2]?

@andrewrk
Copy link
Member

Nope, it's the programmer's responsibility to deal with the fact that 0, 1, 2 are aliased with stdio. If the kernel allows applications to close stdio file descriptors and open new ones, the programmer may do it, and if they choose to do this then they have opted into dealing with it.

@pixelherodev
Copy link
Contributor

Yeah, that makes sense.

@andrewrk
Copy link
Member

I'm pretty confident that will be the best approach.

@daurnimator
Copy link
Contributor

Note: POSIX only says that new processes will be started with a std{out/in/err}. Processes are allowed to close them; open new fds on them; dup to them; etc. The main thing is that you should always make sure 0,1 and 2 are open when you exec.

@kyle-github
Copy link

FWIW, I recently had an issue in a C library I created and maintain and in it I was closing a file descriptor which happened to be 0 (zero). It turned out that MQTT (I think, need to look at the bug again) was using this. It was on some IoT platform. Long story short, 0 was not stdin so my bug of closing what should have been an unopened file descriptor closed down the MQTT library. Oops. Now my library relies on a fair amount of POSIX stuff so the platform was POSIX-ish.

Perhaps moot, but it is a data point.

@shawnl
Copy link
Contributor

shawnl commented Dec 21, 2019 via email

@cameris
Copy link

cameris commented Dec 21, 2019

This whole thing shouldn't be about POSIX, it should be about Zig.

If the kernel allows applications to close stdio file descriptors and open new ones, the programmer may do it, and if they choose to do this then they have opted into dealing with it.

Completely agree. But in the case demonstrated, the caller starting the application has closed the file descriptors. I just want to be able to handle this case in my applications.

If the argument is about POSIX, it could be argued that this should be ensured by the kernel. But as far as I know, POSIX deals with the operating system (!= kernel) interface and does not define which part of the OS ensures some capability (https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap01.html):

POSIX.1-2017 defines a standard operating system interface and environment ...

POSIX.1-2017 describes the external characteristics and facilities that are of importance to application developers, rather than the internal construction techniques employed to achieve these capabilities.

If Zig really wants to hold true to the following stated characteristics

Robust - behavior is correct even for edge cases such as out of memory.
Optimal - write programs the best way they can behave and perform.
Reusable - the same code works in many environments which have different constraints.

, there should be a way to deal with this.

Also if Zig wants to replace C, which is an integral part of POSIX specification, Zig should not rely on POSIX features/behaviors and deal with kernel specific characteristics.

@andrewrk
Copy link
Member

I just want to be able to handle this case in my applications.
, there should be a way to deal with this.

Modify your code to deal with the situation and everything works fine. If you don't want to call std.io.getStdIn then don't call it.

@pixelherodev
Copy link
Contributor

I do think a way to detect this would be useful though.

The problem here is that e.g. std.debug.warn will still panic. There is at present no good way of checking if the stdio FDs are valid.

I think that returning an error instead of unreachable might not be ideal, but the fact remains that there are multiple systems on which this is, arguably, correct. Both MacOS and Linux can hit that unreachable.

Furthermore, as has been pointed out both by daurminator and myself, it is valid under POSIX to close them.

As such, I think we should be returning an error here.

Modify your code to deal with the situation and everything works fine. If you don't want to call std.io.getStdIn then don't call it.

The problem is that this isn't a matter of opting in. As Camerin mentioned,

in the case demonstrated, the caller starting the application has closed the file descriptors

If you didn't choose this, you should still be able to deal with it.

@andrewrk andrewrk modified the milestones: 0.7.0, 0.6.0 Feb 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

No branches or pull requests

7 participants