-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
Take the following example:
fn main() {
let mock_array = &["foo", "BAR", "baz"];
let mock_lowercase = mock_array.iter().map(|i| i.to_lowercase()).collect::<Vec<_>>();
// This is some stuff,
// to pretend that we have some more
// code present here
// like we did
// in the real
// deal.
for _ in 0..2 {
println!("Is bar present? {:?}", mock_lowercase.contains(&"bar".to_owned()));
}
}
(Note that in a real-world scenario, all of mock_array
, the number of iterations of the for
loop, and the input to .contains()
might all be dynamic. Don't read too much into that things in the example are constant.)
Running cargo clippy
on this gives the following output:
warning: avoid using `collect()` when not needed
--> src/main.rs:3:3
|
3 | / let mock_lowercase = mock_array.iter().map(|i| i.to_lowercase()).collect::<Vec<_>>();
4 | | // This is some stuff,
5 | | // to pretend that we have some more
6 | | // code present here
... |
10 | | for _ in 0..2 {
11 | | println!("Is bar present? {:?}", mock_lowercase.contains(&"bar".to_owned()));
| |_____________________________________^
|
= note: `#[warn(clippy::needless_collect)]` on by default
help: Check if the original Iterator contains an element instead of collecting then checking
|
3 |
4 | // This is some stuff,
5 | // to pretend that we have some more
6 | // code present here
7 | // like we did
8 | // in the real
...
warning: 1 warning emitted
There are two issues with this:
The suggestion is elided
The bottom of the diagnostic from clippy
quotes lines 3–8; that leaves the reader wondering … why? If I remove those lines (here, a large comment, but in our actual code that we hit this with, it was a mix of code, comments, and blank lines), we see then that clippy
is attempting to output a suggestion:
help: Check if the original Iterator contains an element instead of collecting then checking
|
3 |
4 | for _ in 0..2 {
5 | println!("Is bar present? {:?}", mock_array.iter().map(|i| i.to_lowercase()).any(|x| x == &"bar".to_owned()));
Unfortunately, in both our real world case & in this test case, the actual suggestion gets elided.
The suggestion is questionable
Here, the suggestion to not use collect
in combination with the for
loop means that we are now repeating the .map()
call multiple times, for each item. In the original code, the work of the map
is done once, outside the for loop, and the collected results then allow us to amortize the cost of that over the many searches done by the for
loop. Whether it is worth it to repeat the .map()
in each search or to .collect
the result into a Vec
that the for
loop can re-use depends heavily on what the map
is doing.
In our test case, we're lowercasing a bunch of strings. Each of the to_lowercase
calls will require a heap allocation to store the result. It's enough that I think the original author's use of a Vec
isn't wrong.
Meta
cargo clippy -V
: clippy 0.0.212 (18bf6b4f0 2020-10-07)rustc -Vv
:rustc 1.47.0 (18bf6b4f0 2020-10-07) binary: rustc commit-hash: 18bf6b4f01a6feaf7259ba7cdae58031af1b7b39 commit-date: 2020-10-07 host: x86_64-apple-darwin release: 1.47.0 LLVM version: 11.0
More meta
Worse, in our real world case, since clippy's suggestion was truncated, this resulted in dropping the .collect
call, making the iterator mut
for any
, and replacing contains
with any
: this transformation is subtly different from the suggestion clippy failed to make: we don't move the creation of the iterator into the for loop; on subsequent passes through the loop, we re-use the partially or fully exhausted iterator. We caught this in code-review, and the subsequent discussion led to this bug report.
The original PR for this lint seemed to just detect cases of ….collect().contains()
, which I think is good, always. A subsequent PR added a check for indirection, that is, to still lint that pattern even if we break up the .collect()
and .contains()
calls, like,
let x = <maps, filters, etc.>.collect();
// …
x.contains(…);
Which still mostly makes sense; I think the trouble is when that indirection crosses into a loop of some sort. Then, you're not just doing whatever iterator work you were doing once, you're doing it once & amortizing it across the loop. But, the lint doesn't detect that, I think.
I think the some of the relevant code is this? https://github.com/rust-lang/rust/blob/085e4170873f3e411c87ee009572f7d2b5130856/src/tools/clippy/clippy_lints/src/loops.rs#L2598-L2640