-
Notifications
You must be signed in to change notification settings - Fork 13.4k
slice.get(i)
should use a slice projection in MIR, like slice[i]
does
#139118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
rustbot has assigned @workingjubilee. Use |
bb2: { | ||
StorageDead(_3); | ||
StorageLive(_7); | ||
StorageLive(_5); | ||
_5 = &raw mut (*_1); | ||
StorageLive(_6); | ||
_6 = copy _5 as *mut u32 (PtrToPtr); | ||
_7 = Offset(copy _6, copy _2); | ||
StorageDead(_6); | ||
StorageDead(_5); | ||
_8 = &mut (*_7); | ||
_0 = Option::<&mut u32>::Some(copy _8); | ||
StorageDead(_7); | ||
_5 = &mut (*_1)[_2]; | ||
_0 = Option::<&mut u32>::Some(copy _5); | ||
goto -> bb3; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
annot: this pre-codegen test is the clearest demonstration of what the PR does.
library/core/src/intrinsics/mod.rs
Outdated
index: usize, | ||
) -> ItemPtr; | ||
|
||
pub trait SliceGetUnchecked<T> {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A brief doc comment would be good, it is a public trait after all. (I am surprised CI does not complain.)
Reminder, once the PR becomes ready for a review, use |
4cf8aa1
to
bb5dd9b
Compare
Ok, updated:
@rustbot ready |
This comment has been minimized.
This comment has been minimized.
bb5dd9b
to
5d9cc7a
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
5d9cc7a
to
938fe84
Compare
This comment has been minimized.
This comment has been minimized.
uhh, sorry for not noticing this earlier, can you clarify its status? @rustbot author |
Looks like we got really unlucky in the linux-futex test... and/or the probabilities shifted since this affects a lot of MIR. Chances are if you rebase and try again, it'll work this time. I can play around with the numbers in that test later. |
938fe84
to
8e85afd
Compare
Thanks, Ralf! Looks like it worked out with just the rebase, as you predicted. @rustbot ready |
Sorry, I noticed while scanning the queue that this is not rollup=never... it was not run through perf, it seems? My fault. The odds this does not affect perf significantly seem... small. Will reapprove assuming it's perf-neutral. @bors r- @bors2 try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
`slice.get(i)` should use a slice projection in MIR, like `slice[i]` does `slice[i]` is built-in magic, so ends up being quite different from `slice.get(i)` in MIR, even though they're both doing nearly identical operations -- checking the length of the slice then getting a ref/ptr to the element if it's in-bounds. This PR adds a `slice_get_unchecked` intrinsic for `impl SliceIndex for usize` to use to fix that, so it no longer needs to do a bunch of lines of pointer math and instead just gets the obvious single statement. (This is *not* used for the range versions, since `slice[i..]` and `slice[..k]` can't use the mir Slice projection as they're using fenceposts, not indices.) I originally tried to do this with some kind of GVN pattern, but realized that I'm pretty sure it's not legal to optimize `BinOp::Offset` to `PlaceElem::Index` without an extremely complicated condition. Basically, the problem is that the `Index` projection on a dereferenced slice pointer *cares about the metadata*, since it's UB to `PlaceElem::Index` outside the range described by the metadata. But then you cast the fat pointer to a thin pointer then offset it, that *ignores* the slice length metadata, so it's possible to write things that are legal with `Offset` but would be UB if translated in the obvious way to `Index`. Checking (or even determining) the necessary conditions for that would be complicated and error-prone, whereas this intrinsic-based approach is quite straight-forward. Zero backend changes, because it just lowers to MIR, so it's already supported naturally by CTFE/Miri/cg_llvm/cg_clif.
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (462bbcc): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary -0.3%, secondary -0.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary -0.8%, secondary -0.4%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary -0.2%, secondary -0.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 778.613s -> 778.03s (-0.07%) |
icount changes are neutral (slightly green overall, but too few to really say that matters), bootstrap is slightly faster, and there's a small but pervasive binary size improvement (not just for debug either, with cargo & image improving in opt-full). Seems good to go to me. @bors r=workingjubilee |
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 4d08223 (parent) -> f0999ff (this PR) Test differencesShow 104 test diffs104 doctest diffs were found. These are ignored, as they are noisy. Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard f0999ffdc4818e498949d3b1f2a0ce6be02a0436 --output-dir test-dashboard And then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
Finished benchmarking commit (f0999ff): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary -0.0%, secondary -0.8%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary -0.5%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary -0.2%, secondary -0.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 776.152s -> 774.682s (-0.19%) |
Within noise threshold for (new) regressed benchmark, otherwise see #139118 (comment) |
…ic, r=workingjubilee `slice.get(i)` should use a slice projection in MIR, like `slice[i]` does `slice[i]` is built-in magic, so ends up being quite different from `slice.get(i)` in MIR, even though they're both doing nearly identical operations -- checking the length of the slice then getting a ref/ptr to the element if it's in-bounds. This PR adds a `slice_get_unchecked` intrinsic for `impl SliceIndex for usize` to use to fix that, so it no longer needs to do a bunch of lines of pointer math and instead just gets the obvious single statement. (This is *not* used for the range versions, since `slice[i..]` and `slice[..k]` can't use the mir Slice projection as they're using fenceposts, not indices.) I originally tried to do this with some kind of GVN pattern, but realized that I'm pretty sure it's not legal to optimize `BinOp::Offset` to `PlaceElem::Index` without an extremely complicated condition. Basically, the problem is that the `Index` projection on a dereferenced slice pointer *cares about the metadata*, since it's UB to `PlaceElem::Index` outside the range described by the metadata. But then you cast the fat pointer to a thin pointer then offset it, that *ignores* the slice length metadata, so it's possible to write things that are legal with `Offset` but would be UB if translated in the obvious way to `Index`. Checking (or even determining) the necessary conditions for that would be complicated and error-prone, whereas this intrinsic-based approach is quite straight-forward. Zero backend changes, because it just lowers to MIR, so it's already supported naturally by CTFE/Miri/cg_llvm/cg_clif.
slice[i]
is built-in magic, so ends up being quite different fromslice.get(i)
in MIR, even though they're both doing nearly identical operations -- checking the length of the slice then getting a ref/ptr to the element if it's in-bounds.This PR adds a
slice_get_unchecked
intrinsic forimpl SliceIndex for usize
to use to fix that, so it no longer needs to do a bunch of lines of pointer math and instead just gets the obvious single statement. (This is not used for the range versions, sinceslice[i..]
andslice[..k]
can't use the mir Slice projection as they're using fenceposts, not indices.)I originally tried to do this with some kind of GVN pattern, but realized that I'm pretty sure it's not legal to optimize
BinOp::Offset
toPlaceElem::Index
without an extremely complicated condition. Basically, the problem is that theIndex
projection on a dereferenced slice pointer cares about the metadata, since it's UB toPlaceElem::Index
outside the range described by the metadata. But then you cast the fat pointer to a thin pointer then offset it, that ignores the slice length metadata, so it's possible to write things that are legal withOffset
but would be UB if translated in the obvious way toIndex
. Checking (or even determining) the necessary conditions for that would be complicated and error-prone, whereas this intrinsic-based approach is quite straight-forward.Zero backend changes, because it just lowers to MIR, so it's already supported naturally by CTFE/Miri/cg_llvm/cg_clif.