Skip to content

std.posix: Added 'error.ProcessNotFound' where necessary #23268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Apr 14, 2025

Conversation

chrboesch
Copy link
Contributor

If you attempt to read (or other operations) from a process previously opened via openat, but it has since terminated, the Linux kernel throws the error ESRCH - No such process. This is now represented by error.ProcessNotFound.

fixes #19875

@alexrp
Copy link
Member

alexrp commented Mar 16, 2025

I'm not really clear on the conditions under which we get ENOENT vs ESRCH. Can you clarify?

@rootbeer
Copy link
Contributor

(Let me preface this comment by saying, do whatever Alex says, and feel free to ignore me.)

I think this PR correctly maps the errno code on read/write when operating on a /proc/<pid>/ path, where ESRCH means that the process that was previously backing a proc entry is no longer present. For example Linux's fs/proc/base.c:proc_pid_cmdline_read returns ESRCH in this case.

Also since POSIX defines ESRCH as "No such process" this mapping seems relatively stable (even if other kinds of handles like sockets or FUSE or whatnot besides /proc handles might return ESRCH.)

From what I can tell (in the proc/base.c code), ENOENT is used in the proc code when the race lookup is more filesystemy (i.e., during an open or readdir callback, not so much read/write). E.g., in fs/proc/base.c:proc_map_files_readdir. So Zig's conversion to ProcessNotFound may make sense when the open handle points into /proc, but ENOENT could happen in regular directories, too. It is not clear that ProcessNotFound would always be the right conversion.

Doing this mapping at read/write where you can't know what kind of handle is being used (e.g., if its a socket, a FUSE handle, or an open proc handle, etc), seems likely to cause confusion to me. (Also I don't see evidence of Linux using ENOENT in read/write paths in the main proc handlers, but I wouldn't be at all surprised if there are some out there somewhere that do ....)

I think Zig's posix layer can get away with specializing some errno from some syscalls. But I think it'll be hard to do that correctly from syscalls like read or write that can be used on so many different kinds of handles and where the semantics of a specific errno may be specific to the kind of handle.

My suggestion on this PR is that the ESRCH to ProcessNotFound works, but the exisiting mapping of ENOENT (which generally means "file or directory not found") to ProcessNotFound is not something Zig should be doing on syscalls like read and write.

@chrboesch
Copy link
Contributor Author

I generally agree with that. As suggested, "process not found" could of course return "file not found." I'm completely unenthusiastic about this and just want to avoid further confusion, because at that point, you're working with "processes" and not with files. Not from a technical point of view (everything is a file), but from a logical point of view. Personally, I'd stick with "process," because when you're working with it, you think of PID as a process and not a file.

@rootbeer
Copy link
Contributor

(Again, I'm not a Zig team member, so my opinions may be ignored and I may be off-base.) The problem I'm trying to convince you about is that the ENOENT errors are being mapped to error.ProcessNotFound in the std.posix.read (**) function and that function will be used for other things than /proc accesses. I agree "process not found" is nicer than "file not found" when manipulating proc entries, but I'm arguing that its a worse result for handles outside of proc, and thus not a good errno mapping for read to use.

Zig's std.posix.read function should map ENOENT to errors.FileNotFound because that is the direct mapping, and is used elsewhere in Zig's posix layer. Alternatively, in my understanding, the Zig standard practice would be that std.posix.read should not map ENOENT at all (and let it fall into the Unexpected case) because the read syscall should not return ENOENT. As ENOENT generally gets returned from open or getdirents, not read.

My comments here are all about the pre-existing mappings of ENOENT. This current PR just adds mappings from ESRCH to error.ProcessNotFound which seems fine. So technically I think this PR is fine. But I do think the existing errno mapping of ENOENT should get cleaned up. (I think they're relevant because they were added by a previous PR for the same issue.)

(**) The read here stands in for all of read/write/pread/pwrite/preadv/pwritev, they're all the same in terms of errno mapping and the not-found errnos.

@alexrp
Copy link
Member

alexrp commented Mar 24, 2025

I was under the impression that read & co can return ENOENT specifically for files under /proc (e.g. if the process dies between open and read), and that was the motivation for #21430?

@chrboesch
Copy link
Contributor Author

chrboesch commented Mar 24, 2025

Yes, and that hasn't changed. The error message returned by the Linux kernel varies depending on the combination of how a PID is accessed or can no longer be accessed because it no longer exists. We can now catch both possible error types and return them as 'process not found' in Zig, which logically is always the case. Or we can ignore the fact that it's accessing a process and simply write "file not found." I'm still in favor of option 1 and find option 2 clumsy because it's extremely confusing (as Linux often is, unfortunately). We have the option of outputting explanatory error messages, so why not use them?

Addendum: The Linux manual pages also say, "The proc filesystem is a pseudo-filesystem...", so I find the error message "procress not found" much more meaningful than "file not found".

@KilianHanich
Copy link

I actually have a question: What a specific situation/example where read returns ENOENT?

…tencies.

After extensive testing, the error messages for processes that are already open but no longer available for reading/writing have been adjusted to 'ProcessNotFound' if the error code ESRCH is thrown by the Linux kernel.
Incorrect ENOENT error codes that also returned 'ProcessNotFound' have been removed.
@chrboesch
Copy link
Contributor Author

To be honest, I don't remember. I had a test setup consisting of a C program and a Zig program, and I was getting corresponding error codes. Now I tried to recreate it, but to no avail. So I tested everything again from the beginning and only got ESRCH. I've now adjusted this PR accordingly.

@rootbeer
Copy link
Contributor

LGTM!

@KilianHanich
Copy link

KilianHanich commented Apr 3, 2025

commit 92d1276 has a typo in its commit message @chrboesch
same goes for dd4fa72

@chrboesch
Copy link
Contributor Author

Strange, F and P aren't even next to each other on the keyboard...

@chrboesch
Copy link
Contributor Author

@alexrp Is something missing?

@alexrp alexrp self-assigned this Apr 12, 2025
@andrewrk
Copy link
Member

I'm confused. Was ENOENT observed before, or not? Is it now mapped to error.Unexpected after this patch?

@chrboesch
Copy link
Contributor Author

In my first test scenario a few months ago, I was certain that EOENT was occurring. I can't actually reproduce that in the current, somewhat more comprehensive scenario, so it was removed. I've now run several tests in all directions, and all errors that occur are correctly returned. Sorry for confusing, but these tests are unfortunately not automatable.

@alexrp alexrp merged commit 206bd1c into ziglang:master Apr 14, 2025
9 checks passed
@alexrp alexrp removed their assignment Apr 14, 2025
@chrboesch chrboesch deleted the i19875 branch April 14, 2025 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ESRCH is not handled by file io operations
5 participants