~1.15x faster perf w/ CodeFrontier insertion sort #104

schneems · 2021-11-04T02:14:29Z

Perf difference

Before: 0.230749 0.005489 0.236238 ( 0.237043)
After: 0.197075 0.005009 0.202084 ( 0.202950)
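As a quick sanity check on the headline number, the speedup can be recomputed from the two lines above. This sketch assumes the columns follow Ruby's `Benchmark` layout (user, system, total, then real time in parentheses), which the source does not state explicitly. The variable names are illustrative only.

```ruby
# Assumption: the last (parenthesized) column is wall-clock "real" time,
# as printed by Ruby's Benchmark module.
before_real = 0.237043
after_real = 0.202950

# Ratio of old time to new time; values above 1.0 mean the change is faster.
speedup = before_real / after_real

puts format("%.2fx faster", speedup)
```

The ratio comes out to roughly 1.17x on wall-clock time, in the same ballpark as the ~1.15x in the title.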

Profile code

To generate profiler output, run:

$ DEBUG_PERF=1 bundle exec rspec spec/integration/dead_end_spec.rb

See the readme for more details. To see the difference, run it against the commit before this one and against this one.

Before sha: 948ee5c
After sha: b9ff7bd

How I found the issue

Using qcachegrind on Mac, with output generated by RubyProf::CallTreePrinter, I saw that a lot of time was being spent in CodeFrontier#<<, specifically in the sort! call and in CodeBlock#<=>.

The fix

Because we control insertion into the array, we know that it is always sorted. We can use this to put each new element directly at the right location, instead of appending it and then re-sorting the whole array just to place one element.

I experimented with iterating from front to back and in reverse. In my test case, reverse took 7,373 comparisons while forwards took 3,130. Both are large savings over re-sorting on every insert.
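The idea can be sketched in a few lines of Ruby. This is a minimal illustration, not the code from this PR: `SortedList` is a hypothetical stand-in for `CodeFrontier`, it stores plain comparable values rather than `CodeBlock` objects, and the comparison counter exists only to make the cost of each strategy visible.

```ruby
# Hypothetical stand-in for CodeFrontier: an array that stays sorted.
class SortedList
  attr_reader :items, :comparisons

  def initialize
    @items = []
    @comparisons = 0
  end

  # Baseline strategy: append, then re-sort the whole array.
  def push_and_sort(value)
    @items << value
    @items.sort! { |a, b| compare(a, b) }
    self
  end

  # Scan from the front and insert before the first element greater than value.
  def insert_from_front(value)
    index = 0
    while index < @items.length && compare(@items[index], value) <= 0
      index += 1
    end
    @items.insert(index, value)
    self
  end

  # Scan from the back and insert after the last element not greater than value.
  def insert_from_back(value)
    index = @items.length
    while index > 0 && compare(@items[index - 1], value) > 0
      index -= 1
    end
    @items.insert(index, value)
    self
  end

  private

  # Counts every comparison so the strategies can be compared.
  def compare(a, b)
    @comparisons += 1
    a <=> b
  end
end

values = [5, 1, 4, 2, 3]

front = SortedList.new
values.each { |v| front.insert_from_front(v) }

back = SortedList.new
values.each { |v| back.insert_from_back(v) }

resort = SortedList.new
values.each { |v| resort.push_and_sort(v) }

puts "front: #{front.comparisons}, back: #{back.comparisons}, re-sort: #{resort.comparisons}"
```

All three strategies produce the same sorted array. Which scan direction needs fewer comparisons depends on where new values tend to land, which is why it is worth measuring against a real workload as described above.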

After changing this logic, sort! no longer shows up in the profile. Instead, reject! is now the hotspot, though it takes up only 8.4% of the time, compared with the 54% previously spent in sort!.

Strangely, such a large change in the profile yielded only a ~1.15x overall speedup. It's still worth it, but not in line with what I expected from the tools.

schneems added the skip changelog label Nov 4, 2021

schneems merged commit 050a6be into main Nov 4, 2021

schneems deleted the schneems/insertion-sort branch November 4, 2021 02:17

schneems mentioned this pull request Nov 4, 2021

~1.32x Faster checking with CleanDocument regex #105

Merged
