Skip to content

Add std.os.getFdPath and std.fs.Dir.realpath #5701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 13, 2020

Conversation

kubkon
Copy link
Member

@kubkon kubkon commented Jun 24, 2020

std.os.realpathat is similar to std.os.realpath, however, it accepts
a pair fd_t, []const u8 of args and thus works out the realpath
of a relative path wrt to some opened file descriptor.

If the input pathname argument turns out to be an absolute path, this
function reverts to calling realpath on that pathname completely
ignoring the input file descriptor. This behaviour is standard in
POSIX and IMHO a good rule of thumb to follow.

If the input file descriptor was obtained using std.fs.cwd() call,
this function reverts to std.os.getcwd() to obtain the file
descriptor's path.

std.fs.Dir.realpath integrates std.os.realpathat with std.fs.Dir
but with dynamic memory allocator.


Part 1 of #5636.


EDIT:

See the comments for an update of the contents of this PR.

@kubkon kubkon changed the title Add std.os.realpathat and std.fs.Dir.realpath [libstd]: add std.os.realpathat and std.fs.Dir.realpath Jun 24, 2020
@andrewrk
Copy link
Member

The Windows failure here looks legit

@kubkon
Copy link
Member Author

kubkon commented Jun 25, 2020

The Windows failure here looks legit

Thanks for having a look! Windows still surprises me at times. Anyhow, I've fixed the tests. I was wondering though, should this be documented somewhere, or is this common knowledge (or at least expected) on Windows (that running std.os.realpath on path that we hold a handle to yields a sharing violation)?

@kubkon
Copy link
Member Author

kubkon commented Jun 26, 2020

@squeek502, since #5684 landed, I've gone ahead and fixed the TODO. :-)

lib/std/fs.zig Outdated
/// relative to this `Dir`. If `pathname` is absolute, ignores this `Dir` handle
/// and returns the canonicalized absolute pathname of `pathname` argument.
/// Caller must free the returned memory.
pub fn realpath(self: Dir, allocator: *Allocator, pathname: []const u8) ![]u8 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following conventions, I think this should be realpathAlloc and there should be a realpath, realpathZ, and realpathW that take a slice as a buffer. See std.fs.realpath and friends:

zig/lib/std/fs.zig

Lines 26 to 29 in bb55889

pub const realpath = os.realpath;
pub const realpathZ = os.realpathZ;
pub const realpathC = @compileError("deprecated: renamed to realpathZ");
pub const realpathW = os.realpathW;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent point, thanks! Will tweak and report back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@squeek502 I've now added first pass at implementing your suggestion. It'd be great if you could have a look and lemme know if there's any missing, or if something could be improved upon, etc.

Copy link
Collaborator

@squeek502 squeek502 Jun 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that the os.realpath functions use *[MAX_PATH_BYTES]u8 for their out_buffer params. Changing Dir.realpath and friends to use that instead of []u8 would allow for removing the intermediate buffers in their implementations which seems nicer to me. Definitely worth waiting for @andrewrk to chime in on that before changing it though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, that’s correct. It’d be great to change the interface from array to slice in std.os.realpath and friends but my suggestion would be doing it in a subsequent PR. Would that make sense?

@kubkon kubkon force-pushed the dir-realpath branch 2 times, most recently from d331b01 to 36aaff7 Compare June 27, 2020 13:49
@kubkon
Copy link
Member Author

kubkon commented Jul 4, 2020

Just a heads up that #5784 provides a fix for some incorrect handling of relative components such as . and .. in pathname argument to Windows-only os.realpathatW. #5784 also provides tests to that effect.

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @kubkon.

I want to be careful about how to do this. It looks like realpathat is not a widely portable abstraction that we can provide. Now, that does not mean that we cannot provide it, but it should be very clear, at least from the doc comments, that this is not universally supported API, and will cause compile errors on some operating systems.

Furthermore - this API should be avoided for the cache hash API, because for our purposes there, we do want a widely portable API that works on all OS's.

Does this make sense? I can take a look at the cache hash PR and provide some guidance there too, if that would help.

lib/std/os.zig Outdated
Comment on lines 4014 to 4093
else => {
var procfs_buf: ["/proc/self/fd/-2147483648".len:0]u8 = undefined;
const proc_path = std.fmt.bufPrint(procfs_buf[0..], "/proc/self/fd/{}\x00", .{fd}) catch unreachable;

return readlinkZ(@ptrCast([*:0]const u8, proc_path.ptr), out_buffer);
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Linux-specific logic so it should go into a .linux => prong. The one that goes into else should be one that looks like it would work for POSIX systems in general, which, in this case, would be the macos prong. However I looked into it a bit and it seems that other operating systems *do not support the concept of realpathat, which makes me want to be very careful about how the std lib goes about this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a compile error on operating systems that don't support this API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excellent point, thanks for pointing it out @andrewrk. I'm not sure why I erroneously assumed that for instance, what is true for Linux is also true for BSDs (looking at you here, FreeBSD). All in all, I agree with you here 100%, and I'll make sure to have any other host return a compile error on an attempt at using realpathat. For more context and for anyone joining in in this discussion, FreeBSD is an example of an OS which generally doesn't (officially at least, and in a non-hacky way) support querying the OS for a canonical path of a file descriptor. One excellent source describing this, can be found here. BTW, I'll make sure to add this to the docs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewrk OK, I've made sure we throw a compile error for all unsupported hosts. In other words, we currently support linux, macos and windows, and for all others, it's a compile error.

While we're here, do you think we should explicitly return a compile error in std.os.realpathat for unsupported hosts as well, or let it drop down to fdPath and the error bubble up from them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed fdPath to getFdPath and I've tweaked the docs to clearly state that both getFdPath and realpathat are not supported on all hosts.

@kubkon
Copy link
Member Author

kubkon commented Jul 8, 2020

Nice work @kubkon.

I want to be careful about how to do this. It looks like realpathat is not a widely portable abstraction that we can provide. Now, that does not mean that we cannot provide it, but it should be very clear, at least from the doc comments, that this is not universally supported API, and will cause compile errors on some operating systems.

Furthermore - this API should be avoided for the cache hash API, because for our purposes there, we do want a widely portable API that works on all OS's.

Does this make sense? I can take a look at the cache hash PR and provide some guidance there too, if that would help.

Makes sense! So #5636 does provide a couple of bug fixes to realpathat plus provides a WASI-specific implementation, but I'd be happy to merge the changes that deal with realpathat (plus any additional tests) into this PR. Or I could revert changes to the CacheHash struct, and we could move the conversation from here to #5636. Lemme know what works best in your opinion!

@andrewrk
Copy link
Member

andrewrk commented Jul 8, 2020

Lemme know what works best in your opinion!

hmm, if you don't mind, let's focus on this API layer first, and make sure we feel good about it, then move on to cache hash

@kubkon
Copy link
Member Author

kubkon commented Jul 8, 2020

Lemme know what works best in your opinion!

hmm, if you don't mind, let's focus on this API layer first, and make sure we feel good about it, then move on to cache hash

@andrewrk Right, that's what I meant, sorry. Lemme clarify a bit more. So when working on cache hash, I've also added a couple of niceties to realpathat and those landed in the other PR. And what I meant was, I could probably move those changes from the other PR to this one since they deal with realpathat only.

@andrewrk
Copy link
Member

andrewrk commented Jul 8, 2020

I could probably move those changes from the other PR to this one since they deal with realpathat only.

Yeah that sounds good 👍

@kubkon
Copy link
Member Author

kubkon commented Jul 8, 2020

I could probably move those changes from the other PR to this one since they deal with realpathat only.

Yeah that sounds good 👍

Cool, I'll get on it then! Thanks for your patience, and sorry about the confusion! :-)

@kubkon
Copy link
Member Author

kubkon commented Jul 8, 2020

@andrewrk commits migrated! So now, realpathat also work in WASI, although with this twist that the realpath is resolved upto the Dir handle (anything beyond is considered an attempt at escaping the sandbox, and hence erroneous). I've made sure this is documented appropriately. While this may sound like a weird thing, I noticed that realpathat is the closest thing to "canonicalize" in other languages, and we should definitely provide something like this for WASI.

@kubkon
Copy link
Member Author

kubkon commented Jul 9, 2020

The reason for timeout on Linux was a bug in wasi.PreopenList which I've patched up in #5829. I've rebased this PR on top that one, so hopefully the CI should be green again.

@kubkon kubkon added the standard library This issue involves writing Zig code for the standard library. label Aug 10, 2020
@kubkon kubkon marked this pull request as draft August 12, 2020 14:14
@kubkon kubkon force-pushed the dir-realpath branch 2 times, most recently from f0a7510 to fbbd01a Compare August 12, 2020 21:54
@kubkon kubkon changed the title [libstd]: add std.os.realpathat and std.fs.Dir.realpath Add std.os.getFdPath and std.fs.Dir.realpath Aug 12, 2020
@kubkon
Copy link
Member Author

kubkon commented Aug 12, 2020

I've decided to redo this PR in a major way such that:

  • I've now moved the concept of realpath relative to some file descriptor directly to std.fs.Dir structure since it seems to belong there better rather than in std.os which will be renamed to std.posix in the near future. This way, we can safely make this feature very host specific and so for instance FreeBSD is not supported due to the limitation of its kernel.
  • I've decided to forego WASI support for now until a solid use case comes by which would find a sandboxed version of realpath desirable where the returned path is always relative to the root file descriptor.
  • I've left std.os.getFdPath in place since it's a useful refactoring between std.os.realpath and std.fs.Dir.realpath; however, if anyone believes it shouldn't be there, I'll be happy to figure out a way of shuffling things around or removing it entirely.

@andrewrk would you care to take another look at this PR?

@kubkon kubkon marked this pull request as ready for review August 12, 2020 22:03
@kubkon kubkon requested a review from andrewrk August 12, 2020 22:03
`std.os.getFdPath` is very platform-specific and can be used to query
the OS for a canonical path to a file handle. Currently supported hosts
are Linux, macOS and Windows.

`std.fs.Dir.realpath` (and null-terminated, plus WTF16 versions) are
similar to `std.os.realpath`, however, they resolve a path wrt to this
`Dir` instance.

If the input pathname argument turns out to be an absolute path, this
function reverts to calling `realpath` on that pathname completely
ignoring this `Dir`.
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

@kubkon kubkon merged commit 3e2e6ba into ziglang:master Aug 13, 2020
@kubkon kubkon deleted the dir-realpath branch August 13, 2020 05:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
os-linux os-macos os-windows standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants