Skip to content

path/filepath: add IsLocal #56219

Closed
Closed
@neild

Description

@neild

It is common for programs to accept filenames from untrusted sources. For example, an archive extractor might create files based on names in the archive, or a webserver may serve the content of local files identified by a URL path. In these cases, the program should usually sanitize the untrusted filenames before use. (See #55356 for a list of vulnerabilities caused by using unsanitized archive paths.)

We should provide a simple path sanitization function which accepts an untrusted input and returns something reasonably safe.

// Sanitize returns a representation of path under the current
// directory. If path is absolute or represents a location
// outside the current directory, Sanitize returns an error.
//
// Sanitize calls Clean on the result, but retains a trailing Separator if any.
//
// If a base path is joined to the result of Sanitize with Join,
// the resulting path will be contained within basepath.
//
// Sanitize does not consider symbolic links.
// Symbolic links can cause the sanitized path to reference a location
// outside the current directory.
func Sanitize(path string) (string, error)

Examples:

Sanitize("a") = "a", <nil>
Sanitize("a/b") = "a/b", <nil>
Sanitize("a/b/../c") = "a/c", <nil>
Sanitize("../a") = "", "../a" is outside the current directory
Sanitize("..") = "", ".." is outside the current directory
Sanitize(".") = "", "." is the current directory
Sanitize("/") = "", "/" is absolute
Sanitize("/a") = "", "/a" is absolute
Sanitize("a/") = "a/", <nil>
Sanitize("a/b/") = "a/b/", <nil>

// on Windows
Sanitize("NUL") = "", "NUL" is absolute

https://go.dev/play/p/EDzG8D15Zed contains a sample implementation.

Activity

added this to the Proposal milestone on Oct 13, 2022
self-assigned this
on Oct 13, 2022
robpike

robpike commented on Oct 13, 2022

@robpike
Contributor

I like the idea but when I saw the issue title, it's purpose wasn't at all clear to me from the name "Sanitize". It's not about cleanliness (we already have "Clean") but security. Maybe something as simple as "Safe", but with a name like that perhaps it should consider symbolic links. Perhaps it should anyway. A conundrum.

neild

neild commented on Oct 13, 2022

@neild
ContributorAuthor

I'm not sold on the name, although "Sanitize" does have the connotation of being more than just clean. ("To reduce or eliminate pathogenic agents", says Merriam-Webster.)

Symbolic links are an interesting question. If we do consider them, then we need to know the base path that the untrusted path is relative to. Not all users will want to avoid symbolic links, either; an archive extractor writing to an output file wants to avoid writing through symlinks, but a web server probably wants to follow links in the directory it is serving from. In some cases, there might be a race condition between checking for the presence of a link and writing to the file.

moved this to Incoming in Proposalson Oct 13, 2022
mvdan

mvdan commented on Oct 14, 2022

@mvdan
Member

I lean towards not handling symlinks. Besides the raciness that neild mentions, once the program opens the sanitized path, there's other edge cases that could possibly cause trouble, so it's hard to guarantee complete safety:

  • Linux bind mounts could still lead you to write a file in a parent directory
  • Some filesystems could alter or not support the sanitized filename, e.g. case insensitive filesystems

I think this API would still be safe for a fairly common use case: extracting files from an archive (like a txtar) into a new directory. In that case, you don't have to worry about symlinks.

As for the name: I understand API as "the file path is relative and a child". I agree with Rob that I'm not a big fan of Sanitize, mainly because it doesn't give me that idea. Perhaps BoundedRelative, in the sense that we want a relative path that is "bounded to" the "dot" directory.

bcmills

bcmills commented on Oct 14, 2022

@bcmills
Contributor

I'm not a fan of the implicit “under the current directory” behavior — there are lots of cases where I'd like to restrict a path to be within some other working directory (such as the temporary directory returned by t.TempDir within a particular test function).

I'm also not keen on the “retains a trailing Separator if any” caveat — it's inconsistent with filepath.Abs and filepath.Join, and I don't see a clear need for it.

Perhaps instead, we could have variants of Join and Rel that sanitize their results?

// JoinInDir joins any number of path elements to dir provided that
// the result is lexically still within dir. Empty elements are ignored.
//
// If the resulting path is not lexically within dir, JoinInDir returns a non-nil error.
func JoinInDir(dir, elem ...string) (string, error)

// RelInDir returns a path that is lexically equivalent to targpath when joined to dir,
// provided that targpath is lexically within dir.
// That is, JoinInDir(dir, RelInDir(dir, targpath)) is equivalent to targpath itself.
//
// An error is returned if targpath can't be made relative to basepath, if knowing
// the current working directory would be necessary to compute it, or if
// the resulting path would not be within dir when joined to it.
func RelInDir(dir, targpath string) (string, error)

Then, if needed, something very close to Sanitize could be written in terms of those:

func Sanitize(path string) (string, error) {
	if filepath.IsAbs(path) {
		cwd, err := os.Getwd()
		if _, err := filepath.RelInDir(cwd, path); err != nil {
			return "", fmt.Errorf("%s is absolute", path)
		}
		return filepath.Clean(path)
	}

	return filepath.JoinInDir(".", path)
}
neild

neild commented on Oct 14, 2022

@neild
ContributorAuthor

To restrict a path to some other working directory, you join the sanitized path to that directory:

sanitized, err := filepath.Sanitize(untrusted)
if err != nil { /* ... */ }

// path is guaranteed to be under /base/dir.
path := filepath.Join("/base/dir", sanitized)

One distinction between this and JoinInDir is that the above will return an error if the untrusted path is absolute, while JoinInDir--if consistent with Join--will silently convert absolute paths into relative ones. (JoinInDir("/a", "/b") is presumably "/a/b")

But I agree that it's very common to want to safely append an untrusted path to a trusted one, so perhaps the right operation here is a safer join.

I do think the function name should convey a sense that it's more secure than the alternatives. JoinInDir is descriptive, but doesn't strongly indicate that it's more suitable for use on untrusted inputs than Join. Perhaps SafeJoin?

AndrewHarrisSPU

AndrewHarrisSPU commented on Oct 15, 2022

@AndrewHarrisSPU

For relative paths, maybe there's a tree-reflexive property, where joining a relative path with a prefix only results in targeting the subset of nodes that are the child nodes of the prefix. The mental hiccup is assuming tree-reflexivity when that's not a general property of relative paths.

I think coercing the tree-reflexive property (or an error) from a relative path can be seen as a unary operation on a relative path, or a binary operation on a relative path with nothing.

RefSafe(prefix string, rel string) with the tweaked notion of reflexivity for e.g. RefSafe("", "../") I think would be easy enough to grok. It could handle an empty, relative, or absolute prefix.

Still, ChildSafe() would be fun.

earthboundkid

earthboundkid commented on Oct 20, 2022

@earthboundkid
Contributor

ISTM, most (all?) cases where you would want to use this involve os paths (since an fs.FS is rooted anyway) and joining to a base. I'm not sure when you would use path.Sanitize and not immediately turn around and do filepath.Join. So maybe it should just be filepath.SafeJoin(base string, followSymlinks bool, paths ...string) string.

neild

neild commented on Oct 20, 2022

@neild
ContributorAuthor

SafeJoin seems easier to reason about, and I agree that most cases where you want this are going to join the path to a base anyway.

I'm not sold on the followSymlinks parameter. At worst, this provides a false sense of security and space for TOCTOU bugs; the fact that no symlink exists along that path now says nothing about the future.

AndrewHarrisSPU

AndrewHarrisSPU commented on Oct 20, 2022

@AndrewHarrisSPU

I'm not sure when you would use path.Sanitize and not immediately turn around and do filepath.Join.

I think there are use cases where it's "eventually" or even "never", rather than "immediately". For example, a CLI tool taking a path as an argument might not Join until a non-trivial amount of work is done, but if it's going to fail when Sanitize fails it would be better to reject when parsing arguments.

neild

neild commented on Oct 20, 2022

@neild
ContributorAuthor

Thinking more about symlinks, filename sanitization is the wrong time to check for links. Links need to be checked for atomically at file open time, to avoid TOCTOU bugs. Possibly this indicates that we should have a function or functions in the os package to securely open files without traversing symlinks, but I don't think filepath is the right place for this.

One issue with SafeJoin is that Join treats absolute paths in components after the first as relative paths. I don't think that's the right behavior for handling untrusted paths; an absolute path should be an error, not silently converted to a relative path.

Would it be reasonable for SafeJoin to be inconsistent with Join here?

Join("/a", "/b")     // /a/b
SafeJoin("/a", "/b") // error, /b is absolute

60 remaining items

added a commit that references this issue on Nov 16, 2022
gopherbot

gopherbot commented on Nov 17, 2022

@gopherbot
Contributor

Change https://go.dev/cl/451657 mentions this issue: path/filepath: detect Windows CONIN$ and CONOUT$ paths in IsLocal

neild

neild commented on Nov 18, 2022

@neild
ContributorAuthor

IsLocal will be in 1.20.

modified the milestones: Backlog, Go1.20 on Nov 21, 2022
locked and limited conversation to collaborators on Nov 21, 2023
removed this from Proposalson Dec 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Relationships

None yet

    Development

    No branches or pull requests

      Participants

      @neild@rsc@magical@earthboundkid@jimmyfrasche

      Issue actions

        path/filepath: add IsLocal · Issue #56219 · golang/go