Inconsistent .gitignore handling of symlinks in cargo package #10032

Closed
@jonhoo

Problem

cargo package does not handle .gitignore rules for symlinks the same way git does. Specifically, a /symlink rule in .gitignore matches a symlink even if it points to a directory, whereas Cargo does not match the symlink in that case, so it fails to ignore the symlink and its target.

Similarly, a /symlink/ rule in .gitignore makes git ignore the files under symlink but not the symlink itself, whereas Cargo just dereferences the symlink and includes the contents of the referent directory, effectively disregarding the /symlink/ rule.
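
To make the two rules concrete, here is a minimal model of the git-side behavior described above. It is only a sketch: `EntryKind` and `rule_ignores` are hypothetical names, it handles only root-anchored literal rules like `/name` and `/name/`, and it is not how git or Cargo actually implement matching. The key assumption is that the entry type comes from the path itself, without following symlinks, so a symlink to a directory is never a directory for matching purposes.

```rust
/// How an entry looks when the path itself is inspected, i.e. without
/// following symlinks. (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum EntryKind {
    File,
    Dir,
    Symlink,
}

/// Simplified model of a root-anchored .gitignore rule such as `/name`
/// or `/name/`. No globbing, negation, or nested paths are handled.
fn rule_ignores(rule: &str, name: &str, kind: EntryKind) -> bool {
    // Drop the leading `/` that anchors the rule to the repository root.
    let rule = rule.strip_prefix('/').unwrap_or(rule);
    match rule.strip_suffix('/') {
        // A trailing `/` restricts the rule to real directories, so a
        // symlink (even one pointing at a directory) does not match.
        Some(dir_rule) => dir_rule == name && kind == EntryKind::Dir,
        // Without a trailing `/`, the rule matches any kind of entry.
        None => rule == name,
    }
}
```

Under this model, `rule_ignores("/src1", "src1", EntryKind::Symlink)` is true, while `rule_ignores("/src2/", "src2", EntryKind::Symlink)` is false, which lines up with src1 being ignored and src2 being kept as a symlink by git.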

Furthermore, Cargo's behavior changes if --allow-dirty is passed. See below.

Steps

$ cargo new cargo-symlink-ignore
$ cd cargo-symlink-ignore
$ ln -s src/ src1
$ echo '/src1' >> .gitignore
$ ln -s src/ src2
$ echo '/src2/' >> .gitignore

Now, run cargo package with --allow-dirty:

$ cargo package -l --allow-dirty | grep -v Cargo
src/main.rs
src1/main.rs
src2/main.rs

Notice that the files under both symlinks are included, and no actual symlinks are included.

Now, commit and try again without --allow-dirty:

$ git add .
$ git commit -m "x"
$ cargo package -l | grep -v Cargo
.cargo_vcs_info.json
.gitignore
src/main.rs
src2/main.rs

Ignoring the .cargo_vcs_info.json file, and the fact that .gitignore is now included(?), notice that this now (correctly) does not include src1 or the contents of its referent directory. But it still includes src2 as a directory even though /src2/ was specified as an ignore.

Contrast that with git ls-files:

$ git ls-files
.gitignore
Cargo.toml
src/main.rs
src2

Which represents the "true" behavior: src1 is not included (it matches /src1), and src2 is included, but only as a symlink.

Possible Solution(s)

Assuming we want to continue to not include symlinks in .crate files, the fix is twofold: make the walk implementation used for --allow-dirty match /symlink against a symlink regardless of whether it points to a directory, and make the git-based walk correctly handle /symlink/ exclusions. My first instinct was that, for the former,

let is_dir = path.is_dir();

should be

let is_dir = path.symlink_metadata()?.is_dir();

but unfortunately I don't think that'll work since we should continue to treat the symlink as a directory for recursion purposes. I'm not sure how best to augment the git walking.
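
One way to picture the tension above is to compute two separate flags per path: one for ignore matching (the path itself) and one for recursion (the link target). The following is a sketch using only the standard library. `DirFlags` and `classify` are hypothetical names, and whether Cargo's walk could actually be restructured this way is an assumption, not something verified against its source.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Two notions of "is this a directory?" for one path.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DirFlags {
    /// Directory-ness of the path itself; symlinks are NOT followed.
    /// This is the flag one would feed to .gitignore matching.
    matches_as_dir: bool,
    /// Directory-ness after following symlinks.
    /// This is the flag one would use to decide whether to recurse.
    recurse_as_dir: bool,
}

fn classify(path: &Path) -> io::Result<DirFlags> {
    // symlink_metadata inspects the link itself (like lstat).
    let matches_as_dir = fs::symlink_metadata(path)?.is_dir();
    // metadata follows symlinks (like stat); a dangling symlink makes
    // it fail, which is treated here as "not a directory".
    let recurse_as_dir = fs::metadata(path).map(|m| m.is_dir()).unwrap_or(false);
    Ok(DirFlags {
        matches_as_dir,
        recurse_as_dir,
    })
}
```

For a symlink that points at a directory, `classify` reports `matches_as_dir == false` and `recurse_as_dir == true`. That would let a /symlink rule match the link while the walk can still descend into it when no rule matches. It does not by itself settle what to do for /symlink/ rules.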

Notes

Here is a failing test for --allow-dirty:

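#[cargo_test]
fn gitignore_symlink_dir() {
    // Skip on platforms where symlinks cannot be created.
    if !symlink_supported() {
        return;
    }

    project()
        .file("src/main.rs", r#"fn main() { println!("hello"); }"#)
        // Two symlinks to the same directory, one per ignore-rule form.
        .symlink_dir("src", "src1")
        .symlink_dir("src", "src2")
        .file(".gitignore", "/src1\n/src2/")
        .build()
        .cargo("package -l")
        .with_stderr("")
        // Expected listing: nothing under src1/ or src2/ should appear.
        .with_stdout(
            "\
            Cargo.lock\n\
            Cargo.toml\n\
            Cargo.toml.orig\n\
            src/main.rs\n\
         ",
        )
        .run();
}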

I'm not sure how to best trigger the git-based walk inside the test suite.

Version

cargo 1.56.0 (4ed5d137b 2021-10-04)
release: 1.56.0
commit-hash: 4ed5d137baff5eccf1bae5a7b2ae4b57efad4a7d
commit-date: 2021-10-04

This also occurs on master @ 94ca096.

Labels: A-git (Area: anything dealing with git), C-bug (Category: bug)
