Dramatic slowdown in rust performance from the serialization benchmarks #19281
Comments
It looks like this is coming from cf3b2e4, where |
This is definitely getting out of my area of expertise, but apparently switching this:

    pub fn reserve(&mut self, additional: uint) {
        if self.cap - self.len < additional {
            match self.len.checked_add(additional) {
                None => panic!("Vec::reserve: `uint` overflow"),
                // if the checked_add
                Some(new_cap) => {
                    let amort_cap = new_cap.next_power_of_two();
                    // next_power_of_two will overflow to exactly 0 for really big capacities
                    if amort_cap == 0 {
                        self.grow_capacity(new_cap);
                    } else {
                        self.grow_capacity(amort_cap);
                    }
                }
            }
        }
    }

To this:

    pub fn reserve(&mut self, additional: uint) {
        if self.cap - self.len < additional {
            match self.len.checked_add(additional) {
                None => panic!("Vec::reserve: `uint` overflow"),
                Some(new_cap) => {
                    let amort_cap = new_cap.next_power_of_two();
                    // next_power_of_two will overflow to exactly 0 for really big capacities
                    let capacity = if amort_cap == 0 { new_cap } else { amort_cap };
                    self.grow_capacity(capacity)
                }
            }
        }
    }

is sufficient to get my micro-benchmark's |
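The capacity computation in the two versions above is the same; only the branching around `grow_capacity` differs. As a rough sketch of that computation in present-day Rust (assuming `usize` in place of `uint`, and using a hypothetical free function `amortized_capacity` that is not part of `Vec`), it might look like this:

```rust
/// Hypothetical helper, not the standard library's implementation.
/// Returns the capacity to grow to, or `None` if `len + additional`
/// overflows `usize` (the case where the original code panics).
fn amortized_capacity(len: usize, additional: usize) -> Option<usize> {
    // Minimum capacity needed to hold the extra elements.
    let required = len.checked_add(additional)?;
    // Round up to a power of two for amortized growth. If rounding up
    // would overflow, fall back to the exact requirement, mirroring the
    // `amort_cap == 0` check in the code above.
    Some(required.checked_next_power_of_two().unwrap_or(required))
}
```

For example, `amortized_capacity(5, 3)` yields `Some(8)`, and a request that cannot be rounded up, such as `amortized_capacity(usize::MAX - 1, 0)`, falls back to the exact requirement. The sketch uses `checked_next_power_of_two` because in current Rust `next_power_of_two` panics on overflow in debug builds instead of reliably returning 0.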
Somehow LLVM is able to optimize this version of Vec::reserve into something dramatically faster than the old version. In micro-benchmarks this was 2-10 times faster. It also shaved 14 minutes off of Rust's compile times. Closes rust-lang#19281.
(I don't understand why this works, so I don't quite trust it yet. I'm pushing it up to see if anyone else can replicate this performance increase.) Somehow LLVM is able to optimize this version of Vec::reserve into something dramatically faster than the old version. In micro-benchmarks this was 2-10 times faster. It also reduced my Rust compile time from 41 minutes to 27 minutes. Closes #19281.
I've been doing a lot of benchmarking recently (1, 2, 3), and I've seen a pretty dramatic drop in performance over the past couple of weeks. While some of it might be explained by upgrading from OS X Mavericks to Yosemite, I still saw a 40% drop in performance between 2014-11-13 and 2014-11-23. I haven't been able to dig into what's going on yet, but I did see that our current implementation of Writer for &mut [u8]:

Does not appear to be inlining well for some reason:

Rewriting to this makes it 10 times faster (and yes, I realize I'm not updating the length of the Vec<u8>. Could that be a problem?):

with this performance:

Furthermore, commenting out the self.dst.reserve(src_len) made it just as fast as BufWriter and directly using the unsafe ptr::copy_nonoverlapping_memory.
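The code from the original report is not preserved here, so the following is only a sketch of the general technique it describes: reserve space in a Vec<u8>, copy the source bytes in with a raw pointer copy, then update the length. It is written against present-day Rust, where `ptr::copy_nonoverlapping` plays the role of `ptr::copy_nonoverlapping_memory`; the function name `push_bytes` is hypothetical and is not the author's code.

```rust
use std::ptr;

/// Hypothetical helper: appends `src` to `dst` with a raw memory copy.
fn push_bytes(dst: &mut Vec<u8>, src: &[u8]) {
    // Make sure there is room for `src.len()` more bytes.
    dst.reserve(src.len());
    let old_len = dst.len();
    unsafe {
        // `src` and `dst` cannot overlap: `dst` is borrowed mutably and
        // `src` is a separate shared borrow.
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr().add(old_len), src.len());
        // Without this, the copied bytes sit in spare capacity and are
        // not visible through `dst.len()` or slicing.
        dst.set_len(old_len + src.len());
    }
}
```

On the question of not updating the length: in this sketch, skipping `set_len` would leave the vector reporting its old length, so later reads would not see the copied bytes. The safe equivalent in current Rust is `dst.extend_from_slice(src)`.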