
[Question]What is the parallelization threshold value for Caxpy benchmark? #2090


Closed
tianyizhangcs opened this issue Apr 16, 2019 · 5 comments

Comments

@tianyizhangcs

tianyizhangcs commented Apr 16, 2019

Hello, I am writing a paper that uses the caxpy benchmark. Could someone tell me the parallelization threshold for caxpy, or point me to where I can find it?
I feel that OpenMP is really called when the size of the vector is really small.

@brada4
Contributor

brada4 commented Apr 16, 2019

Any well-considered improvements are more than welcome, especially if they help your research...
I measured only once, and the result pointed to a threshold somewhat larger than 10000.
interface/axpy.c

#define MULTI_THREAD_MINIMAL  10000
  //Temporarily work-around the low performance issue with small input size &
  //multithreads.
  if (incx == 0 || incy == 0 || n <= MULTI_THREAD_MINIMAL)
          nthreads = 1;
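
The check above can be isolated as a small function to make the boundary behaviour explicit. This is only a sketch: `axpy_thread_count` and its `max_threads` parameter are hypothetical names introduced for illustration, not OpenBLAS functions, and `max_threads` stands in for whatever thread count the library would otherwise use.

```c
#include <assert.h>

/* Same value as the macro quoted above from interface/axpy.c. */
#define MULTI_THREAD_MINIMAL 10000

/*
 * Hypothetical helper (not part of OpenBLAS): returns the number of
 * threads the quoted check would leave in place for a given problem.
 * max_threads is an assumed stand-in for the library's available threads.
 */
static int axpy_thread_count(long n, long incx, long incy, int max_threads)
{
    assert(max_threads >= 1);

    /* Fall back to a single thread when either stride is zero or the
     * vector has at most MULTI_THREAD_MINIMAL elements. */
    if (incx == 0 || incy == 0 || n <= MULTI_THREAD_MINIMAL)
        return 1;

    return max_threads;
}
```

Note that the comparison is `<=`, so under this sketch a vector of exactly 10000 elements still runs single-threaded; multithreading starts at n = 10001.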

@brada4
Contributor

brada4 commented Apr 16, 2019

Also see #1886 for other areas where help is needed.

@tianyizhangcs
Author

Very helpful! Thank you so much

@brada4
Contributor

brada4 commented Apr 16, 2019

Please add your research results as a pull request later, if possible.

@tianyizhangcs
Author

Sure. My research mainly compares hpxMP (https://github.com/STEllAR-GROUP/hpxMP) with llvm-OpenMP and GOMP.

Since the benchmark uses OpenMP, I used it for the performance comparison.
