**Problem**
`cargo publish` fails with:

```
memory allocation of 1099511627782 bytes failed
zsh: abort (core dumped)  cargo publish --dry-run
```

...if you have large files in your build directory, even if they are excluded.
I expect this to complete in all cases, i.e. a crate should be publishable regardless of how much memory I have (within reason!). But I especially expect it to pass in this case, where the file causing the error is ignored and will not be published.
**Steps**
- Have a test data directory which is ignored:

  ```sh
  mkdir -p tests/generated
  echo '*' > tests/generated/.gitignore
  git add -f tests/generated/.gitignore
  ```
- Have a `build.rs` which creates a large file in this test data directory:

  ```rust
  use std::fs;
  use std::io::{Seek, SeekFrom, Write};

  fn main() {
      // Seek ~1 TiB into the file and write a few bytes, producing a sparse
      // file: tiny on disk, but with a huge apparent size.
      let mut f = fs::File::create("tests/generated/large.txt").unwrap();
      f.seek(SeekFrom::Start(1024 * 1024 * 1024 * 1024)).unwrap();
      f.write_all(b"hello").unwrap();
  }
  ```
  ```
  % du -h tests/generated/large.txt
  4.0K    tests/generated/large.txt
  % du --apparent-size -h tests/generated/large.txt
  1.1T    tests/generated/large.txt
  ```
- Run `cargo package` or `cargo publish`:

  ```
  % cargo +stable publish --dry-run --allow-dirty
      Updating crates.io index
  warning: manifest has no description, license, license-file, documentation, homepage or repository.
  See https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata for more info.
     Packaging dozen v0.1.0 (foo/dozen)
     Verifying dozen v0.1.0 (foo/dozen)
     Compiling dozen v0.1.0 (foo/dozen/target/package/dozen-0.1.0)
      Finished dev [unoptimized + debuginfo] target(s) in 0.53s
  memory allocation of 1099511627782 bytes failed
  zsh: abort (core dumped)  cargo +stable publish --dry-run --allow-dirty
  ```
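For scale: the seek offset in `build.rs` is 1024 * 1024 * 1024 * 1024 = 1,099,511,627,776 bytes (1 TiB), so the failed allocation of 1,099,511,627,782 bytes is essentially the file's full apparent size being pulled into memory in one go.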
**Possible Solution(s)**
I believe this was introduced by 78a60bc, which intentionally hashes all of `build.rs`'s output in order to detect changes it has made. That change is arguably correct even in this case: if `build.rs` changed the test data, then maybe the tests would behave differently, so it's effectively a different test?
The allocation is failing here, which you can see with gdb but not with logging or backtraces: `src/cargo/ops/cargo_package.rs`, lines 693 to 694 (at 759431f).
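The two lines in question (they also appear as context in the diff below):

```rust
let contents = fs::read(entry.path())?;
let hash = util::hex::hash_u64(&contents);
```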
i.e. read the entire file into memory, then hash the contents. Chunked hashing should not require unbounded memory?
Maybe, to at least make the offending file visible in logging:
```diff
diff --git a/src/cargo/ops/cargo_package.rs b/src/cargo/ops/cargo_package.rs
index a1b9a5f6..804ddb26 100644
--- a/src/cargo/ops/cargo_package.rs
+++ b/src/cargo/ops/cargo_package.rs
@@ -690,6 +690,7 @@ fn hash_all(path: &Path) -> CargoResult<HashMap<PathBuf, u64>> {
         let entry = entry?;
         let file_type = entry.file_type();
         if file_type.is_file() {
+            debug!("hashing {:?}", entry.path());
             let contents = fs::read(entry.path())?;
             let hash = util::hex::hash_u64(&contents);
             result.insert(entry.path().to_path_buf(), hash);
```
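For illustration only, a minimal sketch of what chunked hashing could look like. `hash_file_chunked` is a hypothetical helper, and it uses std's `DefaultHasher` rather than Cargo's `util::hex::hash_u64` (which takes a complete byte slice), so the values would differ from today's; but as far as I can tell they only ever need to be compared against each other within a single run.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::Path;

// Hash a file in fixed-size chunks, so peak memory stays bounded no matter
// how large (or how sparse) the file is. Hypothetical sketch, not Cargo's API.
fn hash_file_chunked(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hasher = DefaultHasher::new();
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.write(&buf[..n]);
    }
    Ok(hasher.finish())
}
```

Note that this still reads the file's full apparent size (mostly zeros for a sparse file), so it bounds memory but not time.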
I saw this in a real project, https://github.com/FauxFaux/ext4-rs, which extracts some sparse (i.e. tiny but apparently large) disc images from a tiny tar file during its tests. It publishes fine on e.g. 1.34.2, but not on stable.
**Notes**
Output of `cargo version`:

```
% cargo +stable --version
cargo 1.38.0 (23ef9a4ef 2019-08-20)
```
I'm on amd64 Ubuntu with ~30 GB of memory available. If you have over 1 TB of virtual memory available, then the above test case might pass (lucky you).