~1.15x faster perf w/ CodeFrontier insertion sort #104
Perf difference
Before: 0.230749 0.005489 0.236238 ( 0.237043)
After: 0.197075 0.005009 0.202084 ( 0.202950)
Profile code
To generate profiler output, run:
See the readme for more details. Run it against the commit before this one and against this one to see the difference.
Before sha: 948ee5c
After sha: b9ff7bd
How I found the issue
Using qcachegrind on Mac, with output generated by RubyProf::CallTreePrinter, I saw that a lot of time was being spent in CodeFrontier#<<, specifically in the sort! function and CodeBlock<=>.
The fix
Because we control insertion into the array, we know that it is always sorted. We can use this to insert each new element at its correct position instead of appending it and then re-sorting the whole array just to place one element.
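As a rough illustration of the idea, here is a minimal sketch of inserting into an already-sorted array without re-sorting. The `SortedList` class is a hypothetical stand-in, not the actual `CodeFrontier` code from this PR, and it assumes elements only need to respond to `<=>`.

```ruby
# Hypothetical stand-in for a frontier-like container (not the PR's code).
class SortedList
  attr_reader :items

  def initialize
    @items = []
  end

  # Keep @items in ascending order by finding the first element that is
  # larger than the new item and inserting in front of it. If no element
  # is larger, the item goes at the end. No call to sort! is needed.
  def <<(item)
    index = @items.index { |existing| (existing <=> item) > 0 } || @items.length
    @items.insert(index, item)
    self
  end
end

list = SortedList.new
[5, 1, 4, 2, 3].each { |n| list << n }
p list.items
```

Each insert costs at most one pass over the array, while re-sorting after every append repeats comparison work for elements that were already in order.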
I experimented with iterating from front to back, and in reverse. I found that in my test case reverse took 7,373 comparisons while forwards took 3,130. Both represent large savings.
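One way to run this kind of experiment is to count `<=>` calls for each scan direction. The sketch below is an assumption about how such a count could be set up, not the measurement code used for the numbers above; the helper names and the input values are made up for illustration.

```ruby
# Scan from the front: stop at the first element greater than the item.
# Returns the number of comparisons performed.
def insert_forward(sorted, item)
  comparisons = 0
  index = sorted.length
  sorted.each_with_index do |existing, i|
    comparisons += 1
    if (existing <=> item) > 0
      index = i
      break
    end
  end
  sorted.insert(index, item)
  comparisons
end

# Scan from the back: stop at the first element less than or equal to the item.
# Returns the number of comparisons performed.
def insert_reverse(sorted, item)
  comparisons = 0
  index = 0
  (sorted.length - 1).downto(0) do |i|
    comparisons += 1
    if (sorted[i] <=> item) <= 0
      index = i + 1
      break
    end
  end
  sorted.insert(index, item)
  comparisons
end

values = [5, 1, 4, 2, 3] # made-up input, not the PR's test case
forward = []
reverse = []
forward_total = values.sum { |v| insert_forward(forward, v) }
reverse_total = values.sum { |v| insert_reverse(reverse, v) }
puts "forward: #{forward_total} comparisons, reverse: #{reverse_total} comparisons"
```

Which direction needs fewer comparisons depends on where new elements tend to land in the sorted order, so the result is specific to the input being measured.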
After changing this logic, you can see sort! no longer shows up. Instead, we're seeing reject! as a hotspot (though it takes up only 8.4% of the time instead of the 54% seen in sort!).
Strangely, such a large change in the profile only yielded a ~1.15x faster overall result. It's still worth it, but not in line with what I expected from the tools.