petrosagg
Contributor

@petrosagg petrosagg commented May 15, 2025

The implementation of the Vec::extract_if iterator violates the safety contract advertised by slice::from_raw_parts by always constructing a mutable slice for the entire length of the vector, even though that span of memory can contain holes from items already drained. The safety contract of slice::from_raw_parts requires all elements to be properly initialized.

As an example we can look at the following code:

let mut v = vec![Box::new(0u64), Box::new(1u64)];
for item in v.extract_if(.., |x| **x == 0) {
    drop(item);
}

In the second iteration a &mut [Box<u64>] slice of length 2 will be constructed. The first slot of the slice contains the bit pattern of an already deallocated box, which is invalid.

This fixes the issue by only creating references to valid items and using pointer manipulation for the rest. I have also taken the liberty to replace the big unsafe blocks with targeted ones, each with a SAFETY comment. The approach closely mirrors the implementation of Vec::retain_mut.
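The approach described above can be illustrated with a minimal sketch. This is not the code from the PR or from the standard library: `extract_matching` is a hypothetical free function that collects the removed items eagerly instead of yielding them lazily, and it only assumes the general shape of a `retain_mut`-style loop. It creates a reference only to the slot currently being inspected and uses raw pointer reads and copies everywhere else.

```rust
use std::ptr;

/// Hypothetical helper (not the std implementation): removes every element
/// for which `pred` returns true and returns the removed elements in order.
fn extract_matching<T>(vec: &mut Vec<T>, mut pred: impl FnMut(&mut T) -> bool) -> Vec<T> {
    let old_len = vec.len();
    let mut extracted = Vec::new();

    // If `pred` panics, the vector looks empty instead of exposing moved-out slots.
    // SAFETY: 0 is always <= capacity, and an empty vector needs no initialized elements.
    unsafe { vec.set_len(0) };

    let base = vec.as_mut_ptr();
    let mut kept = 0;
    for i in 0..old_len {
        // SAFETY: `i < old_len`, and slot `i` has not been visited yet, so it
        // still holds a valid `T`. Only this single slot is borrowed.
        let item = unsafe { &mut *base.add(i) };
        if pred(item) {
            // SAFETY: slot `i` is valid, and it is never read as a `T` again
            // after this bitwise move.
            extracted.push(unsafe { ptr::read(base.add(i)) });
        } else {
            if kept != i {
                // SAFETY: `kept < i < old_len`; slot `kept` is a hole and slot `i`
                // is valid. The two slots are distinct, so they cannot overlap.
                unsafe { ptr::copy_nonoverlapping(base.add(i), base.add(kept), 1) };
            }
            kept += 1;
        }
    }

    // SAFETY: the first `kept` slots hold valid, retained elements.
    unsafe { vec.set_len(kept) };
    extracted
}
```

Calling `extract_matching(&mut v, |x| **x == 0)` on a vector of boxed integers never forms a slice that covers an already moved-out slot, which is the property the PR is after.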

Note to reviewers: The diff is easier to follow with whitespace hidden.

@rustbot
Collaborator

rustbot commented May 15, 2025

r? @joboet

rustbot has assigned @joboet.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 15, 2025
@petrosagg
Contributor Author

I was unable to trigger a miri failure with this. It seems to have something to do with how deeply value validity is checked, because miri correctly flags this code:

let ptr = std::ptr::null_mut();
let b = unsafe { std::mem::transmute::<*mut u64, Box<u64>>(ptr) };

but doesn't detect UB in this code:

let mut ptr = std::ptr::null_mut();
let b = unsafe { std::mem::transmute::<&mut *mut u64, &mut Box<u64>>(&mut ptr) };

@petrosagg
Contributor Author

@RalfJung if you have any pointers for how to write a miri test for this PR I'm happy to have another go at it.

@pitust

pitust commented May 16, 2025

@petrosagg miri reports the error if you run with MIRIFLAGS=-Zmiri-recursive-validation:

error: Undefined Behavior: constructing invalid value at .<deref>[0]: encountered a dangling box (use-after-free)
   --> /home/pitust/.rustup/toolchains/nightly-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:192:9
    |
192 |         &mut *ptr::slice_from_raw_parts_mut(data, len)
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value at .<deref>[0]: encountered a dangling box (use-after-free)
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE:
    = note: inside `std::slice::from_raw_parts_mut::<'_, std::boxed::Box<u64>>` at /home/pitust/.rustup/toolchains/nightly-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:192:9: 192:55
    = note: inside `<std::vec::ExtractIf<'_, std::boxed::Box<u64>, {closure@src/main.rs:100:34: 100:37}> as std::iter::Iterator>::next` at /home/pitust/.rustup/toolchains/nightly-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/extract_if.rs:71:25: 71:87
note: inside `main`
   --> src/main.rs:100:17
    |
100 |     for item in v.extract_if(.., |x| **x == 0) {
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@est31 est31 added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label May 16, 2025
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label May 16, 2025
@saethlin saethlin removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label May 16, 2025
@petrosagg
Contributor Author

oh, amazing! I will add a testcase

@RalfJung
Member

RalfJung commented May 16, 2025 via email

@petrosagg
Contributor Author

I see. This particular test exercises very little code but I don't have a strong opinion either. If we want to avoid using this experimental feature until the validity UB rules become more established I can leave the PR as-is.

Any opinions as to whether a testcase for this would be useful to have?

@tgross35
Contributor

tgross35 commented May 16, 2025

Cc @the8472 who I believe authored this.

> Any opinions as to whether a testcase for this would be useful to have?

Something that exercises the UB with the current implementation would be great, even if it doesn't currently get flagged by default miri.

@rustbot
Collaborator

rustbot commented May 16, 2025

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

The Miri subtree was changed

cc @rust-lang/miri

@petrosagg
Contributor Author

@tgross35 great! Just added a miri testcase for this. I did pass the experimental flag, meaning that the test fails without the fix.

@the8472
Member

the8472 commented May 16, 2025

I only did some later modifications, this was originally introduced as drain_filter in #43245

> In the second iteration a &mut [Box] slice of length 2 will be constructed. The first slot of the slice contains the bitpattern of an already deallocated box, causing UB.

The bytes are still initialized, we don't de-init the bytes when moving the value out. It happens to be the bytes of a pointer that points to a deallocated object.
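The point about the bytes staying initialized can be checked with a small sketch. It assumes `Box<u64>` is a thin pointer with the same size and alignment as `usize`, and `slot_bits_around_move` is a hypothetical name chosen for illustration. The slot is wrapped in `MaybeUninit` so that the code never claims it still holds a valid `Box` after the move.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical helper: returns the raw bits stored in a slot before and
/// after the `Box` it held has been moved out and deallocated.
fn slot_bits_around_move() -> (usize, usize) {
    // Assumption stated in the lead-in: a `Box<u64>` is exactly pointer-sized.
    assert_eq!(size_of::<Box<u64>>(), size_of::<usize>());

    let slot = MaybeUninit::new(Box::new(7u64));

    // SAFETY: the slot is initialized and pointer-sized; the bits are read as
    // a plain integer, not as a `Box`.
    let before = unsafe { slot.as_ptr().cast::<usize>().read() };

    // SAFETY: the slot holds an initialized `Box`, and it is never treated as
    // a `Box` again after this bitwise move.
    let moved_out: Box<u64> = unsafe { slot.assume_init_read() };
    drop(moved_out); // the allocation is freed here

    // SAFETY: moving out did not de-initialize the bytes; they are again read
    // as an integer only.
    let after = unsafe { slot.as_ptr().cast::<usize>().read() };

    (before, after)
}
```

Both values are the same address: the slot still contains initialized bytes, they just no longer point to a live allocation. Whether a `&mut [Box<u64>]` covering such a slot is acceptable is exactly the open validity question discussed in this thread.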

AIUI validity requirements are still under discussion, but leaning no in this case: rust-lang/unsafe-code-guidelines#412

That said, the slice construction is superfluous and if we can replace it with something that's easier to reason about that's fine.

@pitust

pitust commented May 16, 2025

> Note that that flag enables UB checks that are very strict, and goes beyond what we are sure will be UB.

This is true. However, I would argue that the standard to which the standard library should be held is "definitely not UB" and not "maybe UB or maybe not", especially if the fix is not particularly difficult and doesn't have a meaningful runtime impact.

@saethlin
Member

I do not think it makes sense to hold this one part of the library to a standard that the rest of it, let alone the compiler's output, does not come close to satisfying: rust-lang/unsafe-code-guidelines#412 (comment)

@petrosagg petrosagg changed the title from "fix unsoundness in Vec::extract_if" to "avoid violating slice::from_raw_parts safety contract in Vec::extract_if" May 16, 2025
@petrosagg
Contributor Author

In light of the discussion from the unsafe code guidelines repo it's clear that this PR is in a gray area as to whether it fixes UB or not. The current safety contract for slice::from_raw_parts clearly says it's UB. At the same time many people in the project think it shouldn't be. Let's not decide this in this PR.

I'd argue this PR is valuable on its own even if this behavior is not deemed UB in the future for two reasons:

  1. It avoids violating Rust's own documented rules, which makes the code less confusing for people reading it. This is how I ended up here in the first place.
  2. It makes targeted and justified use of unsafe instead of big unsafe blocks.

I have removed the miri testcase that uses the experimental flag and have rephrased the commit and description of the PR to reflect this new framing.

@RalfJung
Member

There is no language UB here at the moment. There is library UB but this code is inside the same library, so that is not necessarily a problem -- it boils down to a matter of preference of the libs team:

  • either add a comment saying that yes we are violating the library precondition here but we can rely on implementation details of from_raw_parts and hence it's okay (but it would not be okay to do the same outside the standard library)
  • or change the code to avoid the library UB

Miri does not test for library UB, so you can't add a Miri test. I don't think it's worth having a test with -Zmiri-recursive-validation. However it'd still be worth adding the code that reproduces the issue to miri/tests/pass/vec.rs so that we ensure that there isn't language UB here in the future.

@est31 est31 removed the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label May 17, 2025
ptr::copy_nonoverlapping(src, dst, 1);
while self.idx < self.end {
let i = self.idx;
// SAFETY: Unchecked element must be valid.
Member

This is just stating a safety precondition of the mutable reference creation, but it doesn't explain why the operation is sound. The important thing here is that the index must be within the length of the original vector since it is smaller than self.end, which itself is smaller than the vector's original length.

Also, it would be helpful to add a comment that explains why we can't just use self.vec.get_unchecked_mut here (we can't since self.vec's length was set to zero, so indexing by i would violate its safety precondition and trigger the UB-check).
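A minimal sketch of the situation this comment describes, assuming only that the vector's length has been set to zero while its buffer still holds initialized elements. `read_past_len` is a hypothetical helper, and it uses `u32` so that nothing is moved out or dropped twice.

```rust
/// Hypothetical helper: sets the length to zero, then compares what the
/// length-checked API sees with what a raw pointer into the buffer sees.
fn read_past_len(mut v: Vec<u32>, i: usize) -> (Option<u32>, u32) {
    let old_len = v.len();
    assert!(i < old_len, "index must be within the original length");

    // SAFETY: 0 is always <= capacity, and an empty vector needs no
    // initialized elements.
    unsafe { v.set_len(0) };

    // The slice view is now empty, so length-based access cannot reach
    // element `i`; `get_unchecked_mut(i)` would violate its precondition.
    let via_slice = v.get(i).copied();

    // SAFETY: `i < old_len <= capacity`, so the pointer stays inside the
    // allocation, and slot `i` still holds an initialized `u32`.
    let via_ptr = unsafe { v.as_mut_ptr().add(i).read() };

    // SAFETY: the first `old_len` elements are still initialized.
    unsafe { v.set_len(old_len) };

    (via_slice, via_ptr)
}
```

For `read_past_len(vec![10, 20, 30], 1)` the length-checked access returns `None` while the pointer access returns `20`, which is why the iterator has to justify each access against the original length rather than the current one.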

Contributor Author

> The important thing here is that the index must be within the length of the original vector since it is smaller than self.end, which itself is smaller than the vector's original length.

I will alter this to also mention that self.idx < self.end if that's important, but the main thing here is not that, it's that some elements inside the vector have been dropped and only the yet-unchecked elements are definitely still valid. This is why the safety comment refers to the fact that we're accessing an unchecked element as the reasoning why it's not UB. fwiw, the same safety argument is made in Vec::retain here https://github.com/rust-lang/rust/blob/1.87.0/library/alloc/src/vec/mod.rs#L2209

Contributor Author

I have updated the text! Sorry for the delayed response.

@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 5, 2025
@rustbot rustbot added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 5, 2025
@alex-semenyuk
Member

@petrosagg
Thanks for your contribution.
From wg-triage: could you please address the suggestion in the comment above?

@Dylan-DPC
Member

@petrosagg any updates on this? thanks

@petrosagg
Contributor Author

@Dylan-DPC I just addressed the review and rebased on top of current master. Let me know if you would like me to squash the review changes in the original commit

@RalfJung
Member

As I said before:

> However it'd still be worth adding the code that reproduces the issue to miri/tests/pass/vec.rs so that we ensure that there isn't language UB here in the future.

Otherwise, once you think this is ready for review, please write @rustbot ready.

@RalfJung
Member

RalfJung commented Sep 20, 2025

I am not sure how you misinterpreted what I said, but let me repeat the suggestion, with some parts in bold:

> However it'd still be worth adding the code that reproduces the issue to miri/tests/pass/vec.rs so that we ensure that there isn't language UB here in the future.

Please read suggestions carefully. Review takes a lot more time and patience when we have to repeat ourselves.

@petrosagg
Contributor Author

@RalfJung I apologize, I reverted the miri test commit from the previous version of the PR and didn't notice the request for the different path. I have now moved it to its appropriate place.

input.replace_range(0..0, "0");
}

fn extract_if() {
Member

Suggested change
fn extract_if() {
/// This was skirting the edge of UB, let's make sure it remains on the sound side.
/// Context: <https://github.com/rust-lang/rust/pull/141032>.
fn extract_if() {

Contributor Author

Thanks for the review, added the clarifying comment

@petrosagg
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 21, 2025
@tgross35
Contributor

(please be sure to squash before merge)

…act_if`

The implementation of the `Vec::extract_if` iterator violates the safety
contract advertised by `slice::from_raw_parts` by always constructing a
mutable slice for the entire length of the vector even though that span
of memory can contain holes from items already drained. The safety
contract of `slice::from_raw_parts` requires that all elements must be
properly initialized.

As an example we can look at the following code:

```rust
let mut v = vec![Box::new(0u64), Box::new(1u64)];
for item in v.extract_if(.., |x| **x == 0) {
    drop(item);
}
```

In the second iteration a `&mut [Box<u64>]` slice of length 2 will be
constructed. The first slot of the slice contains the bit pattern of an
already deallocated box, which is invalid.

This fixes the issue by only creating references to valid items and
using pointer manipulation for the rest. I have also taken the liberty
to replace the big `unsafe` blocks with targeted ones, each with a
SAFETY comment. The approach closely mirrors the implementation of
`Vec::retain_mut`.

Signed-off-by: Petros Angelatos <[email protected]>
@petrosagg
Contributor Author

@tgross35 commits are now squashed

@joboet
Member

joboet commented Sep 25, 2025

While the old implementation wasn't unsound per-se, I much prefer the new one (it is also much better documented). Thank you (and everyone who participated) for your efforts!

@bors r+

@bors
Collaborator

bors commented Sep 25, 2025

📌 Commit e9b2c4f has been approved by joboet

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 25, 2025
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Sep 25, 2025
avoid violating `slice::from_raw_parts` safety contract in `Vec::extract_if`

bors added a commit that referenced this pull request Sep 25, 2025
Rollup of 8 pull requests

Successful merges:

 - #116882 (rustdoc: hide `#[repr]` if it isn't part of the public ABI)
 - #135771 ([rustdoc] Add support for associated items in "jump to def" feature)
 - #141032 (avoid violating `slice::from_raw_parts` safety contract in `Vec::extract_if`)
 - #142401 (Add proper name mangling for pattern types)
 - #146293 (feat: non-panicking `Vec::try_remove`)
 - #146859 (BTreeMap: Don't leak allocators when initializing nodes)
 - #146924 (Add doc for `NonZero*` const creation)
 - #146933 (Make `render_example_with_highlighting` return an `impl fmt::Display`)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit e3f7626 into rust-lang:master Sep 25, 2025
10 checks passed
@rustbot rustbot added this to the 1.92.0 milestone Sep 25, 2025
rust-timer added a commit that referenced this pull request Sep 25, 2025
Rollup merge of #141032 - petrosagg:extract-if-ub, r=joboet

avoid violating `slice::from_raw_parts` safety contract in `Vec::extract_if`

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Sep 26, 2025
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.