Closed
Description
It is common for programs to accept filenames from untrusted sources. For example, an archive extractor might create files based on names in the archive, or a webserver may serve the content of local files identified by a URL path. In these cases, the program should usually sanitize the untrusted filenames before use. (See #55356 for a list of vulnerabilities caused by using unsanitized archive paths.)
We should provide a simple path sanitization function which accepts an untrusted input and returns something reasonably safe.
// Sanitize returns a representation of path under the current
// directory. If path is absolute or represents a location
// outside the current directory, Sanitize returns an error.
//
// Sanitize calls Clean on the result, but retains a trailing Separator if any.
//
// If a base path is joined to the result of Sanitize with Join,
// the resulting path will be contained within basepath.
//
// Sanitize does not consider symbolic links.
// Symbolic links can cause the sanitized path to reference a location
// outside the current directory.
func Sanitize(path string) (string, error)
Examples:
Sanitize("a") = "a", <nil>
Sanitize("a/b") = "a/b", <nil>
Sanitize("a/b/../c") = "a/c", <nil>
Sanitize("../a") = "", "../a" is outside the current directory
Sanitize("..") = "", ".." is outside the current directory
Sanitize(".") = "", "." is the current directory
Sanitize("/") = "", "/" is absolute
Sanitize("/a") = "", "/a" is absolute
Sanitize("a/") = "a/", <nil>
Sanitize("a/b/") = "a/b/", <nil>
// on Windows
Sanitize("NUL") = "", "NUL" is absolute
https://go.dev/play/p/EDzG8D15Zed contains a sample implementation.
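The behavior listed in the examples can be approximated in a few lines. This is a minimal sketch and not the playground implementation linked above: `sanitize` is a hypothetical helper, and it assumes Go 1.20 or later for `filepath.IsLocal`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// sanitize is a hypothetical helper approximating the proposed Sanitize.
// Assumption: Go 1.20+ (filepath.IsLocal).
func sanitize(path string) (string, error) {
	// IsLocal rejects empty paths, absolute paths, paths that climb out of
	// the current directory, and (on Windows) reserved names such as NUL.
	if !filepath.IsLocal(path) {
		return "", fmt.Errorf("%q is absolute or outside the current directory", path)
	}
	clean := filepath.Clean(path)
	if clean == "." {
		return "", fmt.Errorf("%q is the current directory", path)
	}
	// Clean drops a trailing separator; restore it if the input had one.
	if os.IsPathSeparator(path[len(path)-1]) {
		clean += string(filepath.Separator)
	}
	return clean, nil
}

func main() {
	for _, p := range []string{"a", "a/b/../c", "../a", ".", "/a", "a/b/"} {
		got, err := sanitize(p)
		fmt.Printf("sanitize(%q) = %q, %v\n", p, got, err)
	}
}
```

The error messages here are coarser than the ones in the examples, since `IsLocal` only reports true or false.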
robpike commented on Oct 13, 2022
I like the idea but when I saw the issue title, its purpose wasn't at all clear to me from the name "Sanitize". It's not about cleanliness (we already have "Clean") but security. Maybe something as simple as "Safe", but with a name like that perhaps it should consider symbolic links. Perhaps it should anyway. A conundrum.
neild commented on Oct 13, 2022
I'm not sold on the name, although "Sanitize" does have the connotation of being more than just clean. ("To reduce or eliminate pathogenic agents", says Merriam-Webster.)
Symbolic links are an interesting question. If we do consider them, then we need to know the base path that the untrusted path is relative to. Not all users will want to avoid symbolic links, either; an archive extractor writing to an output file wants to avoid writing through symlinks, but a web server probably wants to follow links in the directory it is serving from. In some cases, there might be a race condition between checking for the presence of a link and writing to the file.
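The symlink concern can be made concrete with a small sketch. It builds a throwaway directory tree, so every directory and file name in it is hypothetical, and it assumes Go 1.20+ for `filepath.IsLocal` and a platform where `os.Symlink` is permitted (on Windows that may require extra privileges).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// escapesViaSymlink reports whether a lexically local path ends up outside
// its base directory once symbolic links are followed.
func escapesViaSymlink() (bool, error) {
	tmp, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(tmp)

	// Resolve the temp dir itself so the prefix check below is not confused
	// by symlinks in its own path.
	root, err := filepath.EvalSymlinks(tmp)
	if err != nil {
		return false, err
	}
	base := filepath.Join(root, "base")
	outside := filepath.Join(root, "outside")
	for _, dir := range []string{base, outside} {
		if err := os.Mkdir(dir, 0o755); err != nil {
			return false, err
		}
	}
	// base/link points at a sibling directory outside base.
	if err := os.Symlink(outside, filepath.Join(base, "link")); err != nil {
		return false, err
	}

	untrusted := "link/file.txt"
	if !filepath.IsLocal(untrusted) {
		return false, fmt.Errorf("%q unexpectedly rejected", untrusted)
	}
	// Follow links in the directory part of the joined path.
	resolved, err := filepath.EvalSymlinks(filepath.Join(base, filepath.Dir(untrusted)))
	if err != nil {
		return false, err
	}
	inside := resolved == base || strings.HasPrefix(resolved, base+string(filepath.Separator))
	return !inside, nil
}

func main() {
	escaped, err := escapesViaSymlink()
	fmt.Println("escaped base:", escaped, "err:", err)
}
```

A check like this is itself subject to the race described above: the link can be created after the check and before the file is opened.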
mvdan commented on Oct 14, 2022
I lean towards not handling symlinks. Besides the raciness that neild mentions, once the program opens the sanitized path, there are other edge cases that could possibly cause trouble, so it's hard to guarantee complete safety:
I think this API would still be safe for a fairly common use case: extracting files from an archive (like a txtar) into a new directory. In that case, you don't have to worry about symlinks.
As for the name: I understand the API as "the file path is relative and a child". I agree with Rob that I'm not a big fan of Sanitize, mainly because it doesn't give me that idea. Perhaps BoundedRelative, in the sense that we want a relative path that is "bounded to" the "dot" directory.
bcmills commented on Oct 14, 2022
I'm not a fan of the implicit “under the current directory” behavior: there are lots of cases where I'd like to restrict a path to be within some other working directory (such as the temporary directory returned by t.TempDir within a particular test function).
I'm also not keen on the “retains a trailing Separator if any” caveat: it's inconsistent with filepath.Abs and filepath.Join, and I don't see a clear need for it.
Perhaps instead, we could have variants of Join and Rel that sanitize their results?
Then, if needed, something very close to Sanitize could be written in terms of those:
neild commented on Oct 14, 2022
To restrict a path to some other working directory, you join the sanitized path to that directory:
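A minimal sketch of that pattern, assuming Go 1.20+ and using `filepath.IsLocal` in place of the proposed Sanitize. The helper name `joinUntrusted` and the `/srv/data` base directory are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// joinUntrusted is a hypothetical helper: it validates the untrusted path
// first, then joins it to a trusted base directory.
func joinUntrusted(base, untrusted string) (string, error) {
	// Absolute paths and paths that climb out with ".." are errors here,
	// rather than being silently reinterpreted.
	if !filepath.IsLocal(untrusted) {
		return "", fmt.Errorf("%q is not a local path", untrusted)
	}
	return filepath.Join(base, untrusted), nil
}

func main() {
	for _, p := range []string{"a/b/../c", "/etc/passwd", "../secret"} {
		got, err := joinUntrusted("/srv/data", p)
		fmt.Printf("joinUntrusted(%q) = %q, %v\n", p, got, err)
	}
}
```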
One distinction between this and JoinInDir is that the above will return an error if the untrusted path is absolute, while JoinInDir--if consistent with Join--will silently convert absolute paths into relative ones. (JoinInDir("/a", "/b") is presumably "/a/b".)
But I agree that it's very common to want to safely append an untrusted path to a trusted one, so perhaps the right operation here is a safer join.
I do think the function name should convey a sense that it's more secure than the alternatives. JoinInDir is descriptive, but doesn't strongly indicate that it's more suitable for use on untrusted inputs than Join. Perhaps SafeJoin?
AndrewHarrisSPU commented on Oct 15, 2022
For relative paths, maybe there's a tree-reflexive property, where joining a relative path with a prefix only results in targeting the subset of nodes that are the child nodes of the prefix. The mental hiccup is assuming tree-reflexivity when that's not a general property of relative paths.
I think coercing the tree-reflexive property (or an error) from a relative path can be seen as a unary operation on a relative path, or a binary operation on a relative path with nothing.
RefSafe(prefix string, rel string) with the tweaked notion of reflexivity for e.g. RefSafe("", "../") I think would be easy enough to grok. It could handle an empty, relative, or absolute prefix.
Still, ChildSafe() would be fun.
earthboundkid commented on Oct 20, 2022
ISTM, most (all?) cases where you would want to use this involve os paths (since an fs.FS is rooted anyway) and joining to a base. I'm not sure when you would use path.Sanitize and not immediately turn around and do filepath.Join. So maybe it should just be filepath.SafeJoin(base string, followSymlinks bool, paths ...string) string.
neild commented on Oct 20, 2022
SafeJoin seems easier to reason about, and I agree that most cases where you want this are going to join the path to a base anyway.
I'm not sold on the followSymlinks parameter. At worst, this provides a false sense of security and space for TOCTOU bugs; the fact that no symlink exists along that path now says nothing about the future.
AndrewHarrisSPU commented on Oct 20, 2022
I think there are use cases where it's "eventually" or even "never", rather than "immediately". For example, a CLI tool taking a path as an argument might not Join until a non-trivial amount of work is done, but if it's going to fail when Sanitize fails it would be better to reject when parsing arguments.
neild commented on Oct 20, 2022
Thinking more about symlinks, filename sanitization is the wrong time to check for links. Links need to be checked for atomically at file open time, to avoid TOCTOU bugs. Possibly this indicates that we should have a function or functions in the os package to securely open files without traversing symlinks, but I don't think filepath is the right place for this.
One issue with SafeJoin is that Join treats absolute paths in components after the first as relative paths. I don't think that's the right behavior for handling untrusted paths; an absolute path should be an error, not silently converted to a relative path.
Would it be reasonable for SafeJoin to be inconsistent with Join here?
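The Join behavior in question is easy to observe with the existing standard library. A short sketch, with hypothetical paths and a hypothetical `joinAll` helper, run on a Unix-like system:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// joinAll returns filepath.Join(base, e) for each element, to show how Join
// treats rooted and ".."-laden elements.
func joinAll(base string, elems []string) []string {
	out := make([]string, 0, len(elems))
	for _, e := range elems {
		out = append(out, filepath.Join(base, e))
	}
	return out
}

func main() {
	untrusted := []string{
		"/b",               // rooted element is treated as relative
		"../../etc/passwd", // ".." is resolved lexically and can leave the base
	}
	for i, joined := range joinAll("/a", untrusted) {
		fmt.Printf("Join(%q, %q) = %q\n", "/a", untrusted[i], joined)
	}
}
```

Neither call reports an error, which is why Join alone is a poor fit for untrusted input.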
path/filepath: add IsLocal
gopherbot commented on Nov 17, 2022
Change https://go.dev/cl/451657 mentions this issue:
path/filepath: detect Windows CONIN$ and CONOUT$ paths in IsLocal
neild commented on Nov 18, 2022
IsLocal will be in 1.20.
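A brief usage sketch for `filepath.IsLocal` as shipped in Go 1.20, in the archive-extraction setting from the issue description. The helper `localNames` and the entry names are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// localNames splits archive entry names into those IsLocal accepts and
// those it rejects. It is purely lexical and does not touch the filesystem.
func localNames(names []string) (ok, rejected []string) {
	for _, name := range names {
		if filepath.IsLocal(name) {
			ok = append(ok, name)
		} else {
			rejected = append(rejected, name)
		}
	}
	return ok, rejected
}

func main() {
	ok, rejected := localNames([]string{"a", "a/b/../c", "../a", "/a", ""})
	fmt.Println("accepted:", ok)
	fmt.Println("rejected:", rejected)
}
```

Unlike the originally proposed Sanitize, IsLocal returns only a boolean, so cleaning and joining the accepted names remain the caller's job.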