openblas runs on a single thread with OMP_PROC_BIND=TRUE on fedora #3435
Comments
Could you try different scripts from |
Without OMP overrides there is no issue; it only shows up when OMP_* variables are set. The scripts you pointed to behave the same way (fine without OMP_*, single-threaded with OMP_PROC_BIND). |
Probably another data point in the saga of how to count the number of cpus - get_num_procs() in driver/others/memory.c |
The tentative fix has now been refined to use omp_get_num_places(), which should be better than just going with SC_NPROCESSORS_CONF. |
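To make the CPU-counting question concrete, here is a minimal Python sketch that compares the system-wide CPU count with the number of CPUs in the calling process's affinity mask. It is a diagnostic only and does not reproduce get_num_procs(); the function name cpu_counts is hypothetical. The underlying assumption, not confirmed from the OpenBLAS source, is that a count derived from the affinity mask of a thread already bound to a single place would come out as 1.

```python
import os


def cpu_counts():
    """Return (configured, usable) CPU counts for the current process.

    configured: CPUs the OS reports system-wide.
    usable:     CPUs in this process's affinity mask (Linux only).
    """
    configured = os.cpu_count() or 1
    # Prefer SC_NPROCESSORS_CONF where the platform exposes it.
    names = getattr(os, "sysconf_names", {})
    if "SC_NPROCESSORS_CONF" in names:
        configured = os.sysconf("SC_NPROCESSORS_CONF")

    if hasattr(os, "sched_getaffinity"):
        # Pid 0 means "the calling process".
        usable = len(os.sched_getaffinity(0))
    else:
        # No affinity API on this platform; fall back to the system count.
        usable = configured
    return configured, usable


if __name__ == "__main__":
    configured, usable = cpu_counts()
    print(f"configured CPUs:       {configured}")
    print(f"CPUs in affinity mask: {usable}")
```

Saved under a hypothetical name such as cpu_counts.py and run as taskset -c 0 python3 cpu_counts.py, the second number drops to 1 while the first stays at the system-wide value, which is the kind of mismatch discussed here.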
I ran into the same issue on Ubuntu 20.04 using the
sudo apt update
sudo apt install build-essential libopenblas-openmp-dev numactl wget
cd ~/Downloads
wget https://www.lanl.gov/projects/crossroads/_assets/docs/micro/mtdgemm-crossroads-v1.0.0.tgz
tar xzf mtdgemm-crossroads-v1.0.0.tgz
cd mt-dgemm/src
gcc -o mt-dgemm-openblas mt-dgemm.c -mtune=znver2 -march=znver2 -mavx2 -lm -fopenmp -Ofast -ffp-contract=fast -funroll-loops -I/usr/include/x86_64-linux-gnu/openblas-openmp /usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblas.a -lpthread -lm -DUSE_CBLAS
OMP_NUM_THREADS=32 OMP_PROC_BIND=close OMP_PLACES=cores ./mt-dgemm-openblas 8192 4
GFLOP/s rate: 1381.483761 GF/s
OMP_NUM_THREADS=32 numactl -C 0-31 ./mt-dgemm-openblas 8192 4 # performed similarly to
OMP_NUM_THREADS=32 numactl --physcpubind=0-31 ./mt-dgemm-openblas 8192 4 # ditto
OMP_NUM_THREADS=32 ./mt-dgemm-openblas 8192 4
GFLOP/s rate: 1257.241689 GF/s
Using OpenBLAS, the dgemm example benefits from setting
For reference, the following utilizes one core.
OMP_NUM_THREADS=32 OMP_PROC_BIND=true OMP_PLACES="$( seq -s },{ 0 1 31 | sed -e 's/\(.*\)/\{\1\}/' )" ./mt-dgemm-openblas 8192 4
OMP_NUM_THREADS=32 GOMP_CPU_AFFINITY=0-31:1 ./mt-dgemm-openblas 8192 4 |
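The seq and sed pipeline above builds an explicit OMP_PLACES list with one single-CPU place per core, i.e. {0},{1},...,{31}. The same string can be generated more readably as sketched below; build_omp_places is a hypothetical helper, and it only reproduces the string. It does not change the single-core behaviour reported for that setting.

```python
def build_omp_places(first, last, stride=1):
    """Build an explicit OMP_PLACES list with one CPU per place.

    build_omp_places(0, 3) gives "{0},{1},{2},{3}".
    """
    # Doubled braces in an f-string emit literal "{" and "}".
    return ",".join(f"{{{cpu}}}" for cpu in range(first, last + 1, stride))


if __name__ == "__main__":
    # Same value as the shell pipeline: seq -s },{ 0 1 31 | sed ...
    print(f'OMP_PLACES="{build_omp_places(0, 31)}"')
```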
OS: fedora 35
CPU: epyc 7502p
With numpy linked to openblas openmp 0.3.18, openblas runs on a single thread when OMP_PROC_BIND=TRUE. This is similar to #2238.
gdb shows that blas_cpu_number is indeed set to 32 (the physical core count). However, taskset shows that the affinity mask of the openblas process is set to "1". Manually overriding it with taskset -pc 0-63 changes the affinity mask, but openblas still runs on a single thread.
A hello world program (https://www.geeksforgeeks.org/openmp-hello-world-program/) run with OMP_PROC_BIND=TRUE has the expected affinity mask (0-63) on this system.
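For readers comparing the two masks mentioned here, the following sketch decodes a hexadecimal affinity mask, as printed by taskset -p, into the list of CPU indices it allows. The helper name mask_to_cpus is hypothetical; bit N set in the mask means CPU N is allowed.

```python
def mask_to_cpus(mask_hex):
    """Decode a hex affinity mask (e.g. "1" or "ffffffffffffffff")
    into a sorted list of allowed CPU indices."""
    # Tolerate the comma-separated groups used in /proc/<pid>/status.
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(mask_to_cpus("1"))                       # only CPU 0
    print(len(mask_to_cpus("ffffffffffffffff")))   # 64 CPUs, i.e. 0-63
```

A mask of "1" therefore corresponds to CPU 0 alone, while the 0-63 list seen for the hello world program corresponds to a mask of sixteen f digits.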