-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
eliminate posix_spawnp from the standard library #14866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
91c2a01
to
fd1e922
Compare
Today I found out that posix_spawn is trash. It's actually implemented on top of fork/exec inside of libc (or libSystem in the case of macOS). So, anything posix_spawn can do, we can do better. In particular, what we can do better is handle spawning of child processes that are potentially foreign binaries. If you try to spawn a wasm binary, for example, posix spawn does the following: * Goes ahead and creates a child process. * The child process writes "foo.wasm: foo.wasm: cannot execute binary file" to stderr (yes, it prints the filename twice). * The child process then exits with code 126. This behavior is indistinguishable from the binary being successfully spawned, and then printing to stderr, and exiting with a failure - something that is an extremely common occurrence. Meanwhile, using the lower level fork/exec will simply return ENOEXEC code from the execve syscall (which is mapped to zig error.InvalidExe). The posix_spawn behavior means the zig build runner can't tell the difference between a failure to run a foreign binary, and a binary that did run, but failed in some other fashion. This is unacceptable, because attempting to excecve is the proper way to support things like Rosetta.
fd1e922
to
2406cf5
Compare
I was actually aware of this limitation with |
Also, somewhat tangential, should we completely remove the |
Sure, no problem. If you discover that there truly is no way to disable_aslr and start_suspended with the fork/exec model, then I'm fine with keeping posix_spawn as another option that can be selected via the ChildProcess API. However, I strongly suspect that these options are implemented in posix_spawn on top of fork/exec and that we could do the same here, providing a strictly more powerful API than posix_spawn offers. My reasoning for deleting the wrappers was to discourage using the API since it is, if my hypothesis is correct, a useless API. Worse than useless, in fact, because it serves as a red herring that leads to subtly broken behavior. |
From I gather, it looks like |
It seems Apple supports two implementations for |
I spent the entire day today trying to turn diff --git a/lib/std/c/darwin.zig b/lib/std/c/darwin.zig
index 9c5ac1e93..be36de0ad 100644
--- a/lib/std/c/darwin.zig
+++ b/lib/std/c/darwin.zig
@@ -227,7 +227,7 @@ pub extern "c" fn posix_spawn_file_actions_addinherit_np(actions: *posix_spawn_f
pub extern "c" fn posix_spawn_file_actions_addchdir_np(actions: *posix_spawn_file_actions_t, path: [*:0]const u8) c_int;
pub extern "c" fn posix_spawn_file_actions_addfchdir_np(actions: *posix_spawn_file_actions_t, filedes: fd_t) c_int;
pub extern "c" fn posix_spawn(
- pid: *pid_t,
+ pid: ?*pid_t,
path: [*:0]const u8,
actions: ?*const posix_spawn_file_actions_t,
attr: ?*const posix_spawnattr_t,
@@ -235,7 +235,7 @@ pub extern "c" fn posix_spawn(
env: [*:null]?[*:0]const u8,
) c_int;
pub extern "c" fn posix_spawnp(
- pid: *pid_t,
+ pid: ?*pid_t,
path: [*:0]const u8,
actions: ?*const posix_spawn_file_actions_t,
attr: ?*const posix_spawnattr_t,
diff --git a/lib/std/child_process.zig b/lib/std/child_process.zig
index dba92ab99..aaebe1c5f 100644
--- a/lib/std/child_process.zig
+++ b/lib/std/child_process.zig
@@ -142,11 +142,6 @@ pub const ChildProcess = struct {
if (!std.process.can_spawn) {
@compileError("the target operating system cannot spawn processes");
}
-
- if (comptime builtin.target.isDarwin()) {
- return self.spawnMacos();
- }
-
if (builtin.os.tag == .windows) {
return self.spawnWindows();
} else {
@@ -337,10 +332,11 @@ pub const ChildProcess = struct {
}
fn waitUnwrapped(self: *ChildProcess) !void {
- const res: os.WaitPidResult = if (comptime builtin.target.isDarwin())
- try os.posix_spawn.waitpid(self.id, 0)
- else
- os.waitpid(self.id, 0);
+ // const res: os.WaitPidResult = if (comptime builtin.target.isDarwin())
+ // try os.posix_spawn.waitpid(self.id, 0)
+ // else
+ // os.waitpid(self.id, 0);
+ const res = os.waitpid(self.id, 0);
const status = res.status;
self.cleanupStreams();
self.handleWaitResult(status);
@@ -416,121 +412,6 @@ pub const ChildProcess = struct {
Term{ .Unknown = status };
}
- fn spawnMacos(self: *ChildProcess) SpawnError!void {
- const pipe_flags = if (io.is_async) os.O.NONBLOCK else 0;
- const stdin_pipe = if (self.stdin_behavior == StdIo.Pipe) try os.pipe2(pipe_flags) else undefined;
- errdefer if (self.stdin_behavior == StdIo.Pipe) destroyPipe(stdin_pipe);
-
- const stdout_pipe = if (self.stdout_behavior == StdIo.Pipe) try os.pipe2(pipe_flags) else undefined;
- errdefer if (self.stdout_behavior == StdIo.Pipe) destroyPipe(stdout_pipe);
-
- const stderr_pipe = if (self.stderr_behavior == StdIo.Pipe) try os.pipe2(pipe_flags) else undefined;
- errdefer if (self.stderr_behavior == StdIo.Pipe) destroyPipe(stderr_pipe);
-
- const any_ignore = (self.stdin_behavior == StdIo.Ignore or self.stdout_behavior == StdIo.Ignore or self.stderr_behavior == StdIo.Ignore);
- const dev_null_fd = if (any_ignore)
- os.openZ("/dev/null", os.O.RDWR, 0) catch |err| switch (err) {
- error.PathAlreadyExists => unreachable,
- error.NoSpaceLeft => unreachable,
- error.FileTooBig => unreachable,
- error.DeviceBusy => unreachable,
- error.FileLocksNotSupported => unreachable,
- error.BadPathName => unreachable, // Windows-only
- error.InvalidHandle => unreachable, // WASI-only
- error.WouldBlock => unreachable,
- else => |e| return e,
- }
- else
- undefined;
- defer if (any_ignore) os.close(dev_null_fd);
-
- var attr = try os.posix_spawn.Attr.init();
- defer attr.deinit();
- var flags: u16 = os.darwin.POSIX_SPAWN_SETSIGDEF | os.darwin.POSIX_SPAWN_SETSIGMASK;
- if (self.disable_aslr) {
- flags |= os.darwin._POSIX_SPAWN_DISABLE_ASLR;
- }
- if (self.start_suspended) {
- flags |= os.darwin.POSIX_SPAWN_START_SUSPENDED;
- }
- try attr.set(flags);
-
- var actions = try os.posix_spawn.Actions.init();
- defer actions.deinit();
-
- try setUpChildIoPosixSpawn(self.stdin_behavior, &actions, stdin_pipe, os.STDIN_FILENO, dev_null_fd);
- try setUpChildIoPosixSpawn(self.stdout_behavior, &actions, stdout_pipe, os.STDOUT_FILENO, dev_null_fd);
- try setUpChildIoPosixSpawn(self.stderr_behavior, &actions, stderr_pipe, os.STDERR_FILENO, dev_null_fd);
-
- if (self.cwd_dir) |cwd| {
- try actions.fchdir(cwd.fd);
- } else if (self.cwd) |cwd| {
- try actions.chdir(cwd);
- }
-
- var arena_allocator = std.heap.ArenaAllocator.init(self.allocator);
- defer arena_allocator.deinit();
- const arena = arena_allocator.allocator();
-
- const argv_buf = try arena.allocSentinel(?[*:0]u8, self.argv.len, null);
- for (self.argv, 0..) |arg, i| argv_buf[i] = (try arena.dupeZ(u8, arg)).ptr;
-
- const envp = if (self.env_map) |env_map| m: {
- const envp_buf = try createNullDelimitedEnvMap(arena, env_map);
- break :m envp_buf.ptr;
- } else std.c.environ;
-
- const pid = try os.posix_spawn.spawnp(self.argv[0], actions, attr, argv_buf, envp);
-
- if (self.stdin_behavior == StdIo.Pipe) {
- self.stdin = File{ .handle = stdin_pipe[1] };
- } else {
- self.stdin = null;
- }
- if (self.stdout_behavior == StdIo.Pipe) {
- self.stdout = File{ .handle = stdout_pipe[0] };
- } else {
- self.stdout = null;
- }
- if (self.stderr_behavior == StdIo.Pipe) {
- self.stderr = File{ .handle = stderr_pipe[0] };
- } else {
- self.stderr = null;
- }
-
- self.id = pid;
- self.term = null;
-
- if (self.stdin_behavior == StdIo.Pipe) {
- os.close(stdin_pipe[0]);
- }
- if (self.stdout_behavior == StdIo.Pipe) {
- os.close(stdout_pipe[1]);
- }
- if (self.stderr_behavior == StdIo.Pipe) {
- os.close(stderr_pipe[1]);
- }
- }
-
- fn setUpChildIoPosixSpawn(
- stdio: StdIo,
- actions: *os.posix_spawn.Actions,
- pipe_fd: [2]i32,
- std_fileno: i32,
- dev_null_fd: i32,
- ) !void {
- switch (stdio) {
- .Pipe => {
- const idx: usize = if (std_fileno == 0) 0 else 1;
- try actions.dup2(pipe_fd[idx], std_fileno);
- try actions.close(pipe_fd[1 - idx]);
- },
- .Close => try actions.close(std_fileno),
- .Inherit => {},
- .Ignore => try actions.dup2(dev_null_fd, std_fileno),
- }
- }
-
fn spawnPosix(self: *ChildProcess) SpawnError!void {
const pipe_flags = if (io.is_async) os.O.NONBLOCK else 0;
const stdin_pipe = if (self.stdin_behavior == StdIo.Pipe) try os.pipe2(pipe_flags) else undefined;
@@ -647,11 +528,29 @@ pub const ChildProcess = struct {
os.setreuid(uid, uid) catch |err| forkChildErrReport(err_pipe[1], err);
}
- const err = switch (self.expand_arg0) {
- .expand => os.execvpeZ_expandArg0(.expand, argv_buf.ptr[0].?, argv_buf.ptr, envp),
- .no_expand => os.execvpeZ_expandArg0(.no_expand, argv_buf.ptr[0].?, argv_buf.ptr, envp),
- };
- forkChildErrReport(err_pipe[1], err);
+ if (comptime builtin.target.isDarwin()) {
+ // Use posix_spawn with SETEXEC flag which turns it into execve on steroids
+ // on macOS.
+ var attr = try os.posix_spawn.Attr.init();
+ defer attr.deinit();
+ var flags: u16 = try attr.get();
+ if (self.disable_aslr) {
+ flags |= os.darwin._POSIX_SPAWN_DISABLE_ASLR;
+ }
+ if (self.start_suspended) {
+ flags |= os.darwin.POSIX_SPAWN_START_SUSPENDED;
+ }
+ try attr.set(flags);
+
+ const err = os.posix_spawn.spawnpZ_execve(argv_buf.ptr[0].?, null, attr, argv_buf, envp);
+ forkChildErrReport(err_pipe[1], err);
+ } else {
+ const err = switch (self.expand_arg0) {
+ .expand => os.execvpeZ_expandArg0(.expand, argv_buf.ptr[0].?, argv_buf.ptr, envp),
+ .no_expand => os.execvpeZ_expandArg0(.no_expand, argv_buf.ptr[0].?, argv_buf.ptr, envp),
+ };
+ forkChildErrReport(err_pipe[1], err);
+ }
}
// we are the parent
diff --git a/lib/std/os/posix_spawn.zig b/lib/std/os/posix_spawn.zig
index 32904a942..ecb750586 100644
--- a/lib/std/os/posix_spawn.zig
+++ b/lib/std/os/posix_spawn.zig
@@ -266,6 +266,44 @@ const posix_spawn = if (builtin.target.isDarwin()) struct {
}
}
+ pub fn spawnpZ_execve(
+ file: [*:0]const u8,
+ actions: ?Actions,
+ attr: ?Attr,
+ argv: [*:null]?[*:0]const u8,
+ envp: [*:null]?[*:0]const u8,
+ ) Error {
+ var curr_attr = attr orelse try Attr.init();
+ const flags = try curr_attr.get();
+ try curr_attr.set(flags | os.darwin.POSIX_SPAWN_SETEXEC);
+ switch (errno(system.posix_spawnp(
+ null,
+ file,
+ if (actions) |a| &a.actions else null,
+ &curr_attr.attr,
+ argv,
+ envp,
+ ))) {
+ .SUCCESS => unreachable,
+ .@"2BIG" => return error.TooBig,
+ .NOMEM => return error.SystemResources,
+ .BADF => return error.InvalidFileDescriptor,
+ .ACCES => return error.PermissionDenied,
+ .IO => return error.InputOutput,
+ .LOOP => return error.FileSystem,
+ .NAMETOOLONG => return error.NameTooLong,
+ .NOENT => return error.FileNotFound,
+ .NOEXEC => return error.InvalidExe,
+ .NOTDIR => return error.NotDir,
+ .TXTBSY => return error.FileBusy,
+ .BADARCH => return error.InvalidExe,
+ .BADEXEC => return error.InvalidExe,
+ .FAULT => unreachable,
+ .INVAL => unreachable,
+ else => |err| return unexpectedErrno(err),
+ }
+ }
+
/// Use this version of the `waitpid` wrapper if you spawned your child process using `posix_spawn`
/// or `posix_spawnp` syscalls.
/// See also `std.os.waitpid` for an alternative if your child process was spawned via `fork` and And the example I have been testing it with: const std = @import("std");
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
const allocator = gpa.allocator();
pub fn main() !void {
var args = std.ArrayList([]const u8).init(allocator);
try args.append("./run.wasm");
var child = std.ChildProcess.init(args.items, allocator);
const res = try child.spawnAndWait();
std.debug.print("{}", .{res});
}
|
I also needed a more powerful version of posix_spawn, so I wrote one. If you might be interested in using it, I'd be happy to help improve or extend it. |
Thanks! The problem for us is specifically macOS and support for requesting ASLR off for a child process which currently is looking like it's only supported by |
Update: from my investigation, using lldb's |
I have great news. The actual culprit here is Take a moment to appreciate how depraved this Apple code is: case ENOEXEC:
for (cnt = 0; argv[cnt]; ++cnt)
;
memp = alloca((cnt + 2) * sizeof(char *));
if (memp == NULL) {
/* errno = ENOMEM; XXX override ENOEXEC? */
goto done;
}
memp[0] = "sh";
memp[1] = bp;
bcopy(argv + 1, memp + 2, cnt * sizeof(char *));
err = posix_spawn(pid, _PATH_BSHELL, file_actions, attrp, memp, envp);
goto done; I'm in awe. If the spawning of the child process fails, it allocates on the stack a new argv array, and blindly inserts So the great news is that this code is in libc, and in zig we can simply call This is a great example of why in the Zig standard library we prefer to use NtDll instead of Kernel32, and in general always reach for the lowest level, most powerful syscalls. If you want something done right, you gotta do it yourself. |
Here is the plan forward: I will merge this PR now, because it works, and expect to see our child process implementation switch to using |
Today I found out that posix_spawnp is trash. It's actually implemented on top of posix_spawn inside of libc (or libSystem in the case of macOS).
So, anything posix_spawnp can do, we can do better. In particular, what we can do better is handle spawning of child processes that are potentially foreign binaries. If you try to spawn a wasm binary, for example, posix_spawnp does the following:
This behavior is indistinguishable from the binary being successfully spawned, and then printing to stderr, and exiting with a failure - something that is an extremely common occurrence.
Meanwhile, using the lower level posix_spawn will simply return ENOEXEC code from the execve syscall (which is mapped to zig error.InvalidExe).
The posix_spawnp behavior means the zig build runner can't tell the difference between a failure to run a foreign binary, and a binary that did run, but failed in some other fashion. This is unacceptable, because attempting to execve is the proper way to support things like Rosetta.