
symlinks and ".." directories and platforms oh my! #7751


Open
marler8997 opened this issue Jan 11, 2021 · 8 comments
Labels
standard library This issue involves writing Zig code for the standard library.

@marler8997
Contributor

marler8997 commented Jan 11, 2021

There is a divergence between Windows and other platforms on the behavior of .. when it comes to symlinks. On Windows, .. paths are resolved before resolving symlinks, but on other platforms they are resolved in the opposite order. This change in ordering yields different behavior as I'll demonstrate with an example:

Linux:

mkdir -p a/b
ln -s a/b mylink
touch a/file-in-a a/b/file-in-b
ls mylink/..

The question is which directory mylink/.. resolves to. Read purely as text, mylink/.. looks like the current directory (.). However, because mylink is a symlink to a/b, the path first resolves to a/b/.., which then becomes a. This is indeed what happens on Linux: the final command in this script, ls mylink/.., prints the contents of directory a rather than those of the current working directory.
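For readers who would rather script the check than type the commands, here is a minimal Python sketch of the same layout. It assumes a POSIX system where creating symlinks needs no special privileges; the scratch directory and the variable names are illustrative only.

```python
import os
import tempfile

# Build the layout from the shell example inside a scratch directory.
root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "a", "b"))
# The link target is relative to the directory that holds the link (root).
os.symlink(os.path.join("a", "b"), os.path.join(root, "mylink"))
for parts in (("a", "file-in-a"), ("a", "b", "file-in-b")):
    open(os.path.join(root, *parts), "w").close()

# The OS follows mylink first, so ".." names the parent of a/b, which is a.
listing = sorted(os.listdir(os.path.join(root, "mylink", "..")))
print(listing)
```

On a POSIX system the listing contains b and file-in-a, i.e. the contents of a, not the contents of the scratch directory.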

The following shows how to execute the same example on Windows:

NOTE: the mklink command requires running your command prompt as Admin

mkdir a\b
mklink /d mylink a\b
echo > a\file-in-a
echo > a\b\file-in-b
dir mylink\..

In this case, the last command, dir mylink\.., does not print the contents of directory a; it prints the contents of the current working directory (.) instead. This is because on Windows, .. components are resolved without considering whether their parent directory is a symlink, so those symlinks are removed from the path without ever being resolved.
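The Windows-style result can be modelled as a purely textual operation. The sketch below uses Python's ntpath.normpath as a stand-in for that lexical collapse; it is an illustration of the idea, not a claim about how Windows implements it internally, and the helper name is hypothetical.

```python
import ntpath  # Windows path rules, importable on any platform


def collapse_dotdot_lexically(path):
    """Collapse "." and ".." as text only; the filesystem is never consulted."""
    return ntpath.normpath(path)


# mylink is dropped from the path without anyone asking whether it is a symlink.
lexical = collapse_dotdot_lexically(r"mylink\..")
print(lexical)
```

Because nothing is looked up on disk, the answer is the same whether mylink is a symlink, a plain directory, or does not exist at all.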

So the question is, what should Zig's std.fs module do with .. directories? Here are the options I see:

  1. have std.fs conform to the platform's conventional behavior regardless of whether it is common between all platforms
  2. converge the behavior on each platform to a common design
  3. implement both a common behavior and the conventional platform behavior and provide an interface to select one or the other
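To make option 3 concrete, here is a rough Python sketch of what a selectable policy could look like. Everything in it (DotDotPolicy, resolve, the policy names) is hypothetical and is not a proposed std.fs interface; os.path.realpath and os.path.normpath merely stand in for the two orderings. It assumes a POSIX system for the demonstration at the bottom.

```python
import enum
import os
import tempfile


class DotDotPolicy(enum.Enum):
    PLATFORM = "platform"              # whatever the host OS conventionally does
    SYMLINKS_FIRST = "symlinks-first"  # POSIX convention
    LEXICAL = "lexical"                # Windows convention


def resolve(path, policy=DotDotPolicy.PLATFORM):
    """Resolve ".." in an absolute path according to the selected policy."""
    if policy is DotDotPolicy.PLATFORM:
        policy = DotDotPolicy.LEXICAL if os.name == "nt" else DotDotPolicy.SYMLINKS_FIRST
    if policy is DotDotPolicy.LEXICAL:
        return os.path.normpath(path)   # text only, symlinks untouched
    return os.path.realpath(path)       # expands symlinks, then applies ".."


# Same layout as the example above: mylink -> a/b.
root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "a", "b"))
os.symlink(os.path.join("a", "b"), os.path.join(root, "mylink"))

query = os.path.join(root, "mylink", "..")
posix_style = resolve(query, DotDotPolicy.SYMLINKS_FIRST)
windows_style = resolve(query, DotDotPolicy.LEXICAL)
```

With this layout, posix_style ends in a while windows_style is the scratch directory itself, which is exactly the divergence described above.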

The other question is: if we provide a common behavior, do we select the POSIX behavior or the Windows one? @andrewrk suggests selecting the POSIX behavior because it is the more common one, given that symlinks were added relatively recently on Windows and require Admin privileges to create (see his comment on IRC: https://freenode.irclog.whitequark.org/zig/2021-01-09#28791306).

For now @andrewrk has suggested we implement the platform's conventional behavior and be sure to document this Windows divergence from other platforms. This issue will track any updates to this decision and serves as a placeholder to revisit this decision at a later date.

@mikdusan
Member

mikdusan commented Jan 12, 2021

why is this distinction of behaviour important? Is it for some kind of pathname normalizing/sanitizing function?

@marler8997
Contributor Author

marler8997 commented Jan 12, 2021

@mikdusan, I'm not sure I understand your question exactly, but the reason we have to pick a behavior in the standard library is that we call the Nt* functions directly on Windows, and those do not resolve ".." paths. Furthermore, the function that does the ".." resolution requires allocation on the Windows process heap, and @andrewrk has stated he doesn't want to depend on heap allocation to resolve filenames. This means Zig has to implement the path resolution itself, so we have a choice of doing it the "Windows" way or the "non-Windows" way. The question is, which method should we implement?

@marler8997
Contributor Author

marler8997 commented Jan 12, 2021

Another thing to consider is that if we adopt the POSIX behavior on Windows, any function outside of ntdll that takes filenames will be "off limits", because it will resolve symlinks through ntdll using the Windows behavior. This could make the change infeasible, depending on how much of the win32 API this cuts us off from.

EDIT: actually they won't be "off limits" because I think the standard library could sanitize filepaths by resolving ".." parts before passing them on to non ntdll functions

@mikdusan
Member

I can't speak to the details but it's best to preserve paths and just pass them to the operating system untouched.

If the Nt* functions need ".." resolved beforehand, then I would think we should do it the Windows-compatible way, but only when the path is being passed to an API that requires it. What confuses me is why we would ever affect Linux/macOS/etc. with this.

Sorry if I misunderstand...

@marler8997
Contributor Author

@mikdusan, the behavior you're describing is option 1. This issue was created to consider and discuss implementing the POSIX behavior on Windows as well (i.e. options 2 and 3).

@mrakh
Contributor

mrakh commented Jan 12, 2021

I believe option 1 is best. Zig should treat file paths like any other string - nothing more than a bunch of bytes whose meaning is dependent on the application interpreting it. In this case, that means deferring file path resolution to the operating system's filesystem implementation. It's the simplest to implement, and it avoids subverting any programmer's understanding of the operating system they're working with.

But enough of me waxing philosophical. The practical problem with option 2 is that its implementation requires additional OS calls to check for and resolve symbolic links, which opens the unwitting programmer up to TOCTTOU race conditions:

D:\
    my_symlink -> foo
    foo\
        dummy1.txt
    my_other_symlink -> bar
    bar\
        dummy2.txt
    quux.txt

std.fs.openFileAbsolute("D:\\my_symlink\\..\\my_other_symlink\\..\\quux.txt", .{})
1. Check if D:\my_symlink is a symbolic link
2. If so, substitute "D:\my_symlink" with "D:\foo" to get "D:\foo\..\my_other_symlink\..\quux.txt"
3. Check if "D:\foo\..\my_other_symlink" is a symbolic link
4. If so, substitute "D:\foo\..\my_other_symlink" with "D:\bar" to get "D:\bar\..\quux.txt"
5. Open "D:\bar\..\quux.txt"

What happens when a process silently updates my_symlink to point to some other folder between steps 1 and 2? Or likewise with my_other_symlink between steps 3 and 4?
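The race in the steps above can be reproduced deterministically. The following Python sketch is a simplified user-space resolver, assuming a POSIX system and absolute paths; the race_window hook is a hypothetical stand-in for another process acting between the check and the substitution, and the directory names other than foo, my_symlink and quux.txt are made up for the demonstration.

```python
import os
import tempfile


def resolve_symlinks_first(path, race_window=None):
    """Expand symlinks before "..", one component at a time (absolute paths only).

    race_window is a hypothetical hook that runs between the "is it a
    symlink?" check and the substitution.
    """
    resolved = os.sep
    for part in path.split(os.sep):
        if part in ("", "."):
            continue
        if part == "..":
            resolved = os.path.dirname(resolved)
            continue
        candidate = os.path.join(resolved, part)
        if os.path.islink(candidate):            # check
            if race_window is not None:
                race_window(candidate)           # the world may change here
            target = os.readlink(candidate)      # substitute
            # Simplification: the link target itself is normalized lexically.
            candidate = os.path.normpath(os.path.join(resolved, target))
        resolved = candidate
    return resolved


root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "foo"))
os.makedirs(os.path.join(root, "elsewhere", "deep"))
link = os.path.join(root, "my_symlink")
os.symlink("foo", link)
path = os.path.join(root, "my_symlink", "..", "quux.txt")


def retarget(candidate):
    """Simulate another process repointing my_symlink mid-resolution."""
    if candidate == link:
        os.remove(candidate)
        os.symlink(os.path.join("elsewhere", "deep"), candidate)


undisturbed = resolve_symlinks_first(path)
raced = resolve_symlinks_first(path, race_window=retarget)
```

The undisturbed call yields quux.txt in the scratch directory, while the raced call yields elsewhere/quux.txt: the same input path names two different files depending on what happens inside the window. When the OS resolves the path itself, this user-space window does not exist.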

@squeek502
Collaborator

Somewhat of a duplicate of #4658

Relevant comment:

Right now, the status quo is:

  • On Linux, std.fs.realpath resolves symlinks before "..". This matches the system behavior as far as I can tell (cat link/../file will output the contents of linked/../file and fail if it doesn't exist), although there seem to be some edge cases (from my testing, if link is a symlink to linked, then cd link/../dir takes you to ./dir if it exists [ignoring the symlink], otherwise it will take you to linked/../dir [resolving the symlink before ".."]).

  • On Windows, std.fs.realpath resolves ".." before symlinks. This matches the system behavior as far as I can tell (cd link\..\dir will never take you to linked\..\dir; if .\dir does not exist, it will fail with "The system cannot find the path specified." The same goes for type link\..\file). Please correct me if I'm wrong on this, but I'm not even sure there exists a function in the Windows API that resolves symlinks before "..". daurnimator has mentioned that Zig might need something like RtlDosPathNameToRelativeNtPathName_U_WithStatus, but from using the test code at the bottom of this article, that function does not resolve symlinks before ".." either. If there is precedent for Linux-like symlink resolution on Windows, it would be helpful to get that information added to std.fs.realpath bugs/inconsistencies on Windows #4658

@SpexGuy SpexGuy added the standard library This issue involves writing Zig code for the standard library. label Mar 17, 2021
@Vexu Vexu added this to the 0.8.0 milestone Mar 19, 2021
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 Jun 4, 2021
@andrewrk andrewrk modified the milestones: 0.9.0, 0.11.0 Nov 24, 2021
@matu3ba
Contributor

matu3ba commented Mar 10, 2022

The use case of junctions is not covered here, and it would complicate the picture further:
https://superuser.com/questions/343074/directory-junction-vs-directory-symbolic-link

For Junction Links, we have to use the Absolute path for defining the target path. No relative path works in Junction Links.
Junction Links does not work with files. But only for directory/folder.
Junction Links only works for the local directory but not for a remote directory.
You don't need any admin rights to create Junction links between directories.
The original directory is not affected when junction link points are removed or deleted.
When the target / original directory is removed, deleted, or moved, the junction point remains but continues to point to the non-existent original directory.

source
However, if Zig wants to support generating "windows links for folders without privileged access", one can only use Junctions.

Personally, I think there is no good default as the file system is not always under programmer control and the best one can do is to provide checking and sanitizing methods to ensure stuff happens only locally in programmer specified folders (with absolute paths).
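One possible shape for such a check, sketched in Python under the assumption of a POSIX system: resolve both the allowed base directory and the candidate with symlinks expanded, then compare them. The function name stays_within and the directory names are hypothetical.

```python
import os
import tempfile


def stays_within(base, candidate):
    """Return True if candidate, resolved symlinks-first, stays under base."""
    base_real = os.path.realpath(base)
    cand_real = os.path.realpath(os.path.join(base_real, candidate))
    return os.path.commonpath([base_real, cand_real]) == base_real


root = os.path.realpath(tempfile.mkdtemp())
sandbox = os.path.join(root, "sandbox")
os.makedirs(sandbox)
os.makedirs(os.path.join(root, "outside", "deep"))
# A symlink inside the sandbox that points out of it.
os.symlink(os.path.join("..", "outside", "deep"), os.path.join(sandbox, "escape"))

suspicious = os.path.join("escape", "..", "secret.txt")
looks_harmless = os.path.normpath(suspicious)   # lexical view of the same path
allowed = stays_within(sandbox, "notes.txt")
escaped = not stays_within(sandbox, suspicious)
```

Lexically the suspicious path collapses to a file directly inside the sandbox, yet with symlinks expanded first it lands in outside, so the check rejects it. Note that a check like this is itself subject to the race described earlier in the thread if the filesystem changes between the check and the later open.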

The portable method would then simply not resolve symbolic links.

BTW: Plan9 position is that symlinks are a hack.
