
needless_collect: confusing suggestion & questionable transformation #6164

@roy-work

Take the following example:

fn main() {
  let mock_array = &["foo", "BAR", "baz"];
  let mock_lowercase = mock_array.iter().map(|i| i.to_lowercase()).collect::<Vec<_>>();
  // This is some stuff,
  // to pretend that we have some more
  // code present here
  // like we did
  // in the real
  // deal.
  for _ in 0..2 {
    println!("Is bar present? {:?}", mock_lowercase.contains(&"bar".to_owned()));
  }
}

(Note that in a real-world scenario, mock_array, the number of iterations of the for loop, and the input to .contains() might all be dynamic. Don't read too much into the fact that everything in the example is constant.)

Running cargo clippy on this gives the following output:

warning: avoid using `collect()` when not needed
  --> src/main.rs:3:3
   |
3  | /   let mock_lowercase = mock_array.iter().map(|i| i.to_lowercase()).collect::<Vec<_>>();
4  | |   // This is some stuff,
5  | |   // to pretend that we have some more
6  | |   // code present here
...  |
10 | |   for _ in 0..2 {
11 | |     println!("Is bar present? {:?}", mock_lowercase.contains(&"bar".to_owned()));
   | |_____________________________________^
   |
   = note: `#[warn(clippy::needless_collect)]` on by default
help: Check if the original Iterator contains an element instead of collecting then checking
   |
3  |
4  |   // This is some stuff,
5  |   // to pretend that we have some more
6  |   // code present here
7  |   // like we did
8  |   // in the real
 ...

warning: 1 warning emitted

There are two issues with this:

The suggestion is elided

The bottom of the diagnostic from clippy quotes lines 3–8, which leaves the reader wondering … why? If I remove those lines (here, a large comment; in the actual code where we hit this, it was a mix of code, comments, and blank lines), we then see that clippy is attempting to output a suggestion:

help: Check if the original Iterator contains an element instead of collecting then checking
  |
3 |
4 |   for _ in 0..2 {
5 |     println!("Is bar present? {:?}", mock_array.iter().map(|i| i.to_lowercase()).any(|x| x == &"bar".to_owned()));

Unfortunately, in both our real world case & in this test case, the actual suggestion gets elided.

The suggestion is questionable

Here, dropping the collect while the use sits inside a for loop means that the .map() work is repeated, for each item visited, on every pass through the loop. In the original code, the work of the map is done once, outside the for loop, and the collected results amortize that cost over the many searches done by the loop. Whether it is better to repeat the .map() in each search or to .collect the result into a Vec that the for loop can re-use depends heavily on what the map is doing.

In our test case, we're lowercasing a bunch of strings. Each of the to_lowercase calls will require a heap allocation to store the result. It's enough that I think the original author's use of a Vec isn't wrong.
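To make the trade-off concrete, here is a minimal sketch that counts how many times to_lowercase runs in each shape. The function names collect_then_search and search_each_pass and the Cell-based counter are made up for this illustration; they are not part of clippy or of our real code.

```rust
use std::cell::Cell;

/// Collect once, then search the Vec on every pass (the shape of the original code).
/// Returns (number of passes that found the needle, number of `to_lowercase` calls).
fn collect_then_search(words: &[&str], needle: &str, passes: usize) -> (usize, usize) {
    let calls = Cell::new(0);
    let lowered: Vec<String> = words
        .iter()
        .map(|w| {
            calls.set(calls.get() + 1);
            w.to_lowercase()
        })
        .collect();
    let needle = needle.to_owned();
    let mut hits = 0;
    for _ in 0..passes {
        if lowered.contains(&needle) {
            hits += 1;
        }
    }
    (hits, calls.get())
}

/// Rebuild the iterator on every pass (the shape clippy suggests).
/// Returns the same pair as `collect_then_search`.
fn search_each_pass(words: &[&str], needle: &str, passes: usize) -> (usize, usize) {
    let calls = Cell::new(0);
    let mut hits = 0;
    for _ in 0..passes {
        let found = words
            .iter()
            .map(|w| {
                calls.set(calls.get() + 1);
                w.to_lowercase()
            })
            .any(|w| w == needle);
        if found {
            hits += 1;
        }
    }
    (hits, calls.get())
}

fn main() {
    let words = ["foo", "BAR", "baz"];
    println!("{:?}", collect_then_search(&words, "bar", 2)); // (2, 3)
    println!("{:?}", search_each_pass(&words, "bar", 2)); // (2, 4)
    println!("{:?}", search_each_pass(&words, "qux", 2)); // (0, 6)
}
```

With the collect, the three allocations happen once regardless of the number of passes. Without it, any can short-circuit on a hit, but a miss pays for every element again on every pass, so the balance shifts toward the Vec as the loop count grows.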

Meta

  • cargo clippy -V: clippy 0.0.212 (18bf6b4f0 2020-10-07)
  • rustc -Vv:
    rustc 1.47.0 (18bf6b4f0 2020-10-07)
    binary: rustc
    commit-hash: 18bf6b4f01a6feaf7259ba7cdae58031af1b7b39
    commit-date: 2020-10-07
    host: x86_64-apple-darwin
    release: 1.47.0
    LLVM version: 11.0
    

More meta

Worse, in our real-world case, because clippy's suggestion was truncated, the fix was applied by hand: the .collect call was dropped, the iterator was made mut so that any could be called on it, and contains was replaced with any. This transformation is subtly different from the suggestion clippy failed to show: the creation of the iterator was not moved into the for loop, so on subsequent passes through the loop we re-use the partially or fully exhausted iterator. We caught this in code review, and the subsequent discussion led to this bug report.
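A reduced sketch of that mistake, using the mock data from the example above rather than our real code (the function names reuse_iterator and fresh_iterator are hypothetical):

```rust
/// Hand-applied edit: the iterator is created once, outside the loop,
/// so later passes only see whatever elements are left over.
fn reuse_iterator(words: &[&str], needle: &str, passes: usize) -> Vec<bool> {
    let mut lowered = words.iter().map(|w| w.to_lowercase());
    let mut results = Vec::new();
    for _ in 0..passes {
        results.push(lowered.any(|w| w == needle));
    }
    results
}

/// What the elided suggestion shows: a fresh iterator on every pass.
fn fresh_iterator(words: &[&str], needle: &str, passes: usize) -> Vec<bool> {
    let mut results = Vec::new();
    for _ in 0..passes {
        results.push(words.iter().map(|w| w.to_lowercase()).any(|w| w == needle));
    }
    results
}

fn main() {
    let words = ["foo", "BAR", "baz"];
    println!("{:?}", reuse_iterator(&words, "bar", 2)); // [true, false]
    println!("{:?}", fresh_iterator(&words, "bar", 2)); // [true, true]
}
```

The first pass of reuse_iterator consumes "foo" and "BAR" and stops at the match; the second pass sees only "baz" and reports false, even though the data has not changed.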

The original PR for this lint seemed to just detect cases of ….collect().contains(), which I think is good, always. A subsequent PR added a check for indirection, that is, to still lint that pattern even if we break up the .collect() and .contains() calls, like,

let x = <maps, filters, etc.>.collect();
// …
x.contains();

Which still mostly makes sense; I think the trouble is when that indirection crosses into a loop of some sort. Then the collect is no longer a throwaway intermediate: the iterator work is done once and amortized across every pass through the loop, and removing the collect repeats that work on each pass. But the lint doesn't detect that, I think.
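As a sketch of the distinction, the directly chained form can be replaced by any without loss, while the loop form arguably wants to keep its Vec. The function names below are hypothetical, and allowing the lint on the function is just one possible workaround while the lint does not account for loops, not something the lint documentation prescribes.

```rust
/// Directly chained form: the Vec would be built and discarded immediately,
/// so `.any()` is a straightforward replacement for `.collect().contains()`.
fn chained(words: &[&str], needle: &str) -> bool {
    words.iter().map(|w| w.to_lowercase()).any(|w| w == needle)
}

/// Indirect form whose use sits inside a loop: the Vec is reused on every
/// pass, so the collect is kept and the lint is allowed for this function.
#[allow(clippy::needless_collect)]
fn count_present(words: &[&str], needles: &[&str]) -> usize {
    let lowered = words.iter().map(|w| w.to_lowercase()).collect::<Vec<_>>();
    let mut present = 0;
    for needle in needles {
        if lowered.contains(&needle.to_lowercase()) {
            present += 1;
        }
    }
    present
}

fn main() {
    let words = ["foo", "BAR", "baz"];
    println!("{}", chained(&words, "bar")); // true
    println!("{}", count_present(&words, &["bar", "qux", "FOO"])); // 2
}
```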

I think some of the relevant code is this? https://github.com/rust-lang/rust/blob/085e4170873f3e411c87ee009572f7d2b5130856/src/tools/clippy/clippy_lints/src/loops.rs#L2598-L2640

Metadata

Labels

  • C-bug (Category: Clippy is not doing the correct thing)
  • C-enhancement (Category: Enhancement of lints, like adding more cases or adding help messages)
  • E-medium (Call for participation: Medium difficulty level problem and requires some initial experience.)
  • I-false-positive (Issue: The lint was triggered on code it shouldn't have)
  • L-nursery (Lint: Currently in the nursery group)
  • L-suggestion (Lint: Improving, adding or fixing lint suggestions)
  • good first issue (These issues are a good way to get started with Clippy)
