Description
Problem
Currently, if we have multiple test files (each containing multiple test functions) under the tests folder in a crate directory, cargo test generates a separate binary for each test file. For example, given the layout below:
crate_directory
|_ src/
|_ tests/
|_ test_file1.rs
|_ test_file2.rs
cargo test will generate one binary for test_file1.rs and another one for test_file2.rs.
This article reports some of the issues with this approach, especially for larger projects/projects with some slow tests:
1. Increased linking time: rustc needs to link the library into each of these binaries, potentially increasing compilation times.
2. Critical path execution bottleneck: The binaries are run sequentially (even if the tests inside each one are multithreaded). If a binary contains many tests but also an abnormally slow one, then even after all the quicker tests have finished we have to wait for the slow one before the tests in the next binary can start. If this repeats across several binaries, it can add up to a lot of wasted time in which most of the CPU sits idle, just waiting for the slow test to finish.
3. This can also have some other complications:
3.1) Tests cannot share expensive initialization steps: If many tests, even ones split across different files, require a common expensive initialization step, this is easy to handle when everything is in the same binary/process: we can execute the computationally hard step, store the result with something like lazy_static, and then load it in the tests. This way the expensive step is calculated only once and then used many times. Of course, this is only possible if all the tests are in the same binary. If they are split across different binaries, the expensive step has to run once per binary, wasting time (considering that binaries are run sequentially).
3.2) System resources have to be shared among parallel tests. This is not an issue in cargo test but is, for example, in cargo nextest, since it uses processes rather than the threading model. If some tests require a step that is not computationally expensive but uses a large fraction of the RAM (loading a very big matrix, for example), this will limit parallelization: the large amount of data has to be loaded in every process, which limits the number of binaries that can be run concurrently.
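To illustrate complication 3.1, here is a minimal sketch of sharing one expensive setup step among tests that live in the same binary. It uses std::sync::OnceLock from the standard library in place of lazy_static (a swap made only to keep the sketch dependency-free). The names expensive_setup, shared_fixture and INIT_CALLS are hypothetical and exist only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the expensive step actually runs (illustration only).
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a slow initialization step.
fn expensive_setup() -> Vec<u64> {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    (0..1_000u64).map(|x| x * x).collect()
}

// Every test in this binary goes through this accessor.
// OnceLock guarantees the closure runs at most once per process,
// even when tests call it from several threads at the same time.
fn shared_fixture() -> &'static Vec<u64> {
    static FIXTURE: OnceLock<Vec<u64>> = OnceLock::new();
    FIXTURE.get_or_init(expensive_setup)
}

#[test]
fn first_test() {
    assert_eq!(shared_fixture()[3], 9);
}

#[test]
fn second_test() {
    assert_eq!(shared_fixture().len(), 1_000);
}
```

The static lives inside one process, so if first_test and second_test end up in two separate test binaries, each binary runs expensive_setup again.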
Proposed Solution
One way to currently solve this is to structure the tests according to the following:
crate_directory
|_ src/
|_ tests/
|_ test_files/
|_ main.rs (with as many lines as files, "mod test_file1;", "mod test_file2;", etc)
|_ test_file1.rs
|_ test_file2.rs
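As a rough sketch of what tests/test_files/main.rs could look like in this layout: in the real project the file would only contain the lines "mod test_file1;" and "mod test_file2;", each pulling in the sibling file. Inline modules are used here so the example is self-contained, and the common module with its add helper is a hypothetical placeholder for shared test code.

```rust
// tests/test_files/main.rs
// Cargo treats a directory under tests/ that contains a main.rs as a
// single integration test target, so everything below ends up in one binary.

// Hypothetical shared helper, visible to every test module in this binary.
mod common {
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }
}

// In the real layout: `mod test_file1;` (contents live in test_file1.rs).
mod test_file1 {
    #[test]
    fn adds_small_numbers() {
        assert_eq!(super::common::add(1, 2), 3);
    }
}

// In the real layout: `mod test_file2;` (contents live in test_file2.rs).
mod test_file2 {
    #[test]
    fn adds_negative_numbers() {
        assert_eq!(super::common::add(-1, -2), -3);
    }
}
```

With this structure, cargo test builds one test binary for the whole test_files directory instead of one per file.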
However, I suggest that cargo test could allow compiling all the tests into the same binary even if we use the original organization:
crate_directory
|_ src/
|_ tests/
|_ test_file1.rs
|_ test_file2.rs
Hypothetical solutions:
a) Change the behaviour of cargo test so that everything is compiled into and run from the same binary by default. I would imagine, however, that this would cause backward compatibility issues, especially with multithreaded apps using unsafe code?
b) Simply add a flag that would enable this.
Note: cargo nextest, with its parallelization model using jobs, is able to solve issue (2). However, using processes instead of threads has the disadvantage of potentially making issues (3.1) and (3.2) worse.
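To make the critical path bottleneck concrete, here is a deliberately simplified scheduling model, not a description of how cargo or nextest are actually implemented. It assumes that each test is handed, in order, to whichever worker becomes free first, and that durations are known in arbitrary time units. The function names are hypothetical.

```rust
// Wall-clock time for running `durations` on `threads` workers, where each
// test is given, in order, to the worker that becomes free first.
fn makespan(durations: &[u64], threads: usize) -> u64 {
    let mut loads = vec![0u64; threads.max(1)];
    for &d in durations {
        // Index of the least-loaded worker (first one wins on ties).
        let idx = (0..loads.len()).min_by_key(|&i| loads[i]).unwrap();
        loads[idx] += d;
    }
    loads.into_iter().max().unwrap_or(0)
}

// Binaries run one after another, each with its own pool of workers.
fn separate_binaries(binaries: &[Vec<u64>], threads: usize) -> u64 {
    binaries.iter().map(|b| makespan(b, threads)).sum()
}

// All tests share a single pool, as if compiled into one binary.
fn single_binary(binaries: &[Vec<u64>], threads: usize) -> u64 {
    let all: Vec<u64> = binaries.iter().flatten().copied().collect();
    makespan(&all, threads)
}
```

For two binaries that each contain one slow test and three fast ones (durations 10, 1, 1, 1) on 4 workers, separate_binaries gives 20 time units while single_binary gives 11 under this model, because the second slow test can start while the first is still running.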
Notes
No response