Skip to content

std.process.Child: use clone() instead of fork() #22368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

ruihe774
Copy link
Contributor

@ruihe774 ruihe774 commented Dec 31, 2024

See discussions in #1620.

This PR depends on #22341. (merged)

@ruihe774 ruihe774 force-pushed the clone branch 2 times, most recently from 0b61b7d to 57ee91d Compare December 31, 2024 04:32
@ruihe774
Copy link
Contributor Author

clone() does not work in .stage2_c. I disabled it in this case.

@ruihe774 ruihe774 force-pushed the clone branch 14 times, most recently from 6f31e98 to 351ce97 Compare December 31, 2024 09:39
@ruihe774
Copy link
Contributor Author

ruihe774 commented Dec 31, 2024

I use clone3() in x86 and x86_64 so that I can use CLONE_CLEAR_SIGHAND if available. I'm not familiar with (and I cannot test) asm of other ISAs. Sorry.

@ruihe774 ruihe774 force-pushed the clone branch 2 times, most recently from 2ac3a31 to 0713379 Compare December 31, 2024 10:02
@ruihe774
Copy link
Contributor Author

@alexrp PTAL.

@ruihe774 ruihe774 force-pushed the clone branch 7 times, most recently from e88bb54 to f57a745 Compare January 4, 2025 02:55
@ruihe774
Copy link
Contributor Author

@alexrp Is #22427 a blockade of this PR?

@alexrp
Copy link
Member

alexrp commented Jan 15, 2025

Probably not. The workaround can be removed once that bug has been fixed.

By the way, I'm not particularly familiar with the actual process management logic in std.process.Child. Someone else should probably review that.

@ruihe774
Copy link
Contributor Author

By the way, I'm not particularly familiar with the actual process management logic in std.process.Child. Someone else should probably review that.

@andrewrk Would you plz have a look?

@NicoElbers
Copy link
Contributor

Hai there, I also wrote an implementation of clone3 (#22616), not knowing about your pr, and I have a question for you. The clone3 syscall itself is simply a fork with much more granularity (as per the man page), however what you seem to have implemented is closer to the clone glibc function using clone3 under the hood. Perhaps I'm misunderstanding, but this is not directly the syscall right?

Additionally, if it's for perfomance reasons I also don't fully understand. Looking at godbolt, the handrolled assembly is certainly better, but I personally don't see how losing 5 or so instructions offsets the additional complexity, but again perhaps I'm misunderstanding.

Would love your feedback :)

@ruihe774
Copy link
Contributor Author

Hai there, I also wrote an implementation of clone3 (#22616), not knowing about your pr, and I have a question for you. The clone3 syscall itself is simply a fork with much more granularity (as per the man page), however what you seem to have implemented is closer to the clone glibc function using clone3 under the hood. Perhaps I'm misunderstanding, but this is not directly the syscall right?

The point is that you must use asm to perform stack switching. Yes, it's probably okay to use clone3 with stack set to null and without CLONE_VM. However, if we want to use CLONE_VM (this is the main performance benefit we can obtain from using clone3), we must set up a stack for child and using an asm wrapper to perform stack switching. We cannot write stack switching in Zig; the compiler has assumption about call frames and will mess it up.

Similar to clone, glibc also has a wrapper of clone3 to perform stack switching, though not exposed: https://github.com/bminor/glibc/blob/master/sysdeps/unix/sysv/linux/x86_64/clone3.S

@ruihe774
Copy link
Contributor Author

Additionally, if it's for perfomance reasons I also don't fully understand. Looking at godbolt, the handrolled assembly is certainly better, but I personally don't see how losing 5 or so instructions offsets the additional complexity, but again perhaps I'm misunderstanding.

It is because you have a mistake in your code. It should be:

const rc = std.os.linux.syscall2(.clone3, @intFromPtr(args), @sizeOf(ClArgsDummy));

instead of:

const rc = std.os.linux.syscall2(.clone3, @intFromPtr(&args), @sizeOf(ClArgsDummy));

@NicoElbers
Copy link
Contributor

I see, thanks a lot for your explanation. Your implementation is much more solid, and I won't have much time in the near future so I will close my PR. Good luck getting this merged :)

@ruihe774
Copy link
Contributor Author

Any updates on this? Can we add this to some milestone? @alexrp @andrewrk

The comment syntax is not supported by both gcc and zig
@ruihe774 ruihe774 requested review from alexrp and andrewrk April 4, 2025 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants