Closed
Hi!
I'm using MUMPS with OpenBLAS 0.2.18, compiled from source with these options:
- gcc 5.3.0
- USE_THREAD = 1
- NUM_THREADS = 128
- NO_WARMUP = 1
- #NO_AFFINITY = 1
- #BIGNUMA = 1
- MAX_STACK_ALLOC = 8128
I'm running my code on multi-socket machines (2 or 4 sockets) with 24 to 128 cores.
The problem is that most of the CPU time is spent in the system (kernel) rather than in my code. To be more precise, it is not doing I/O; it is doing context switches. You can see this in the pictures below.
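To put a number on the context switches, something like the following could be run against the process while it is working. It is only a minimal sketch, assuming a Linux machine where `/proc/<pid>/task/*/status` exposes the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` counters; the function names are hypothetical helpers, not part of OpenBLAS or MUMPS.

```python
import glob


def parse_ctxt_switches(status_text):
    """Extract the two context-switch counters from a /proc status file."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counters = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            counters[key] = int(value.strip())
    return counters


def sum_ctxt_switches(pid="self"):
    """Sum the counters over every thread of the process (Linux only)."""
    totals = {
        "voluntary_ctxt_switches": 0,
        "nonvoluntary_ctxt_switches": 0,
    }
    for path in glob.glob(f"/proc/{pid}/task/*/status"):
        try:
            with open(path) as handle:
                counts = parse_ctxt_switches(handle.read())
        except OSError:
            continue  # the thread exited while we were iterating
        for key, value in counts.items():
            totals[key] += value
    return totals


if __name__ == "__main__":
    # Pass the PID of the running application instead of "self".
    print(sum_ctxt_switches("self"))
```

Sampling these totals twice, a few seconds apart, gives a rate that can be compared between the single-socket and multi-socket runs.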
This doesn't happen on a single-socket machine, but if I restrict the application to the cores of just one CPU (taskset -c 0-31 MyAPP), the performance is also poor.
What can I do to give you more information and help track this down?
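One thing that might be worth ruling out in the pinned run is that more BLAS threads are started than there are cores in the `taskset` list. The sketch below is an assumption about a possible experiment, not a confirmed cause: it derives a thread count from a CPU list such as `0-31` and sets the `OPENBLAS_NUM_THREADS` environment variable to match. The helper names are hypothetical, and the parser only handles plain values and ranges, not the stride syntax that `taskset` also accepts.

```python
import os


def parse_cpu_list(spec):
    """Turn a CPU list such as "0-31" or "0,2,4-7" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def build_env_for_pinned_run(base_env, cpu_spec):
    """Copy the environment and cap OpenBLAS threads to the pinned cores."""
    env = dict(base_env)
    env["OPENBLAS_NUM_THREADS"] = str(len(parse_cpu_list(cpu_spec)))
    return env


if __name__ == "__main__":
    env = build_env_for_pinned_run(os.environ, "0-31")
    print("OPENBLAS_NUM_THREADS =", env["OPENBLAS_NUM_THREADS"])
    # The pinned run could then be launched with this environment, e.g.:
    #   subprocess.run(["taskset", "-c", "0-31", "./MyAPP"], env=env)
```

If the time spent in the kernel changes noticeably with the capped thread count, that would be useful information to include with the report.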