Speed up some use cases of ArrayPartition #163

Merged

mateuszbaran
Contributor

This is the first batch of changes related to #160. Most of them follow from the fact that map is much friendlier to the Julia compiler than broadcasting, so I've replaced a bunch of unnecessary broadcasts with map. The other issue is that the loop in the assignment broadcast is very problematic: ideally the compiler should be able to constant-propagate N and unroll the loop, but it seems that doing so breaks some other optimizations. This change helps a bit, but it's still not enough.
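To illustrate the kind of replacement described above, here is a minimal sketch. It does not reproduce the actual RecursiveArrayTools code: `TuplePartition`, `scale_broadcast`, `scale_map` and `copy_parts!` are hypothetical names made up for this example, standing in for a type that, like ArrayPartition, wraps a tuple of arrays.

```julia
# Hypothetical stand-in for ArrayPartition: a thin wrapper around a tuple of arrays.
struct TuplePartition{T<:Tuple}
    x::T
end

# Variant 1: broadcast a function over the tuple of parts.
# Broadcasting over a tuple returns a tuple, so the result can be re-wrapped.
scale_broadcast(p::TuplePartition, s) = TuplePartition((part -> part .* s).(p.x))

# Variant 2: the same operation with map over the tuple.
# Only the outer traversal of the tuple changes; each part is still scaled
# with an ordinary array broadcast.
scale_map(p::TuplePartition, s) = TuplePartition(map(part -> part .* s, p.x))

# In-place copy written as a map over both tuples instead of an index loop
# `for i in 1:N`. The tuple returned by map is discarded.
function copy_parts!(dest::TuplePartition, src::TuplePartition)
    map(copyto!, dest.x, src.x)
    return dest
end

a = TuplePartition(([1.0, 2.0], [3.0 4.0; 5.0 6.0]))
b = scale_map(a, 2.0)        # parts scaled via map
c = scale_broadcast(a, 2.0)  # parts scaled via broadcast over the tuple
d = copy_parts!(TuplePartition((zeros(2), zeros(2, 2))), a)
```

Both variants compute the same result; whether map actually compiles to better code than the broadcast in a given case is something to check with a benchmark rather than assume from this sketch.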

@mateuszbaran
Contributor Author

I've improved that copyto! method and now my example from #160 is about 100x faster than before 🎉 :

julia> @benchmark f!($c, $a, $b)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min … max):  7.885 ns … 34.613 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     7.972 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.990 ns ±  0.406 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                  ▂▂█▁▄▄                      
  ▂▂▂▂▁▁▂▁▂▂▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▅▄▅▄▄▇▆██████▇▇▄▅▅▄▇▆▇▄▄▃▃▂▂▃▂▃▃▃ ▃
  7.88 ns        Histogram: frequency by time         830 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Member

@ChrisRackauckas ChrisRackauckas left a comment

oh yes, just mapping over the tuple is a lot better, great idea.
