
OpenBLAS/lapack-netlib 'make' numerical errors in ARMv7 multi-threaded only #597


Closed
Jagmn opened this issue Jun 15, 2015 · 10 comments

@Jagmn

Jagmn commented Jun 15, 2015

When running a 'make' in the 'lapack-netlib' directory of OpenBLAS on an ARMv7 target, I get different numerical failures when using multi-threading.

--- ARMv7 (Raspberry Pi 2) x 4 (no OpenMP), built with 'make TARGET=ARMV7'
            -->   LAPACK TESTING SUMMARY  <--
        Processing LAPACK Testing output found in the TESTING direcory
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                528599      11  (0.002%)    0   (0.000%)    
DOUBLE PRECISION    605768      1   (0.000%)    0   (0.000%)    
COMPLEX             434073      114 (0.026%)    0   (0.000%)    
COMPLEX16           522687      70  (0.013%)    0   (0.000%)    

--> ALL PRECISIONS  2091127     196 (0.009%)    0   (0.000%)    


--- ARMv7 (Raspberry Pi 2) x 4 (no OpenMP), OPENBLAS_NUM_THREADS=1 
            -->   LAPACK TESTING SUMMARY  <--
        Processing LAPACK Testing output found in the TESTING direcory
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                1109189     0   (0.000%)    0   (0.000%)    
DOUBLE PRECISION    1110527     0   (0.000%)    0   (0.000%)    
COMPLEX             581782      0   (0.000%)    0   (0.000%)    
COMPLEX16           582594      0   (0.000%)    0   (0.000%)    

--> ALL PRECISIONS  3384092     0   (0.000%)    0   (0.000%)    
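
The two runs above appear to differ only in the thread count that OpenBLAS sees. A minimal sketch of how such a comparison could be scripted, assuming Python 3.7+ is available on the target. `run_with_threads` is a hypothetical helper, not part of OpenBLAS; the only OpenBLAS-specific name it relies on is the `OPENBLAS_NUM_THREADS` environment variable used for the second run.

```python
import os
import subprocess
import sys


def run_with_threads(cmd, num_threads, cwd=None):
    """Run `cmd` with OPENBLAS_NUM_THREADS set to `num_threads`.

    Hypothetical helper for illustration. Returns the CompletedProcess so
    the caller can inspect stdout, stderr and the return code.
    """
    # Copy the current environment and override only the thread count.
    env = dict(os.environ, OPENBLAS_NUM_THREADS=str(num_threads))
    return subprocess.run(
        cmd, cwd=cwd, env=env, capture_output=True, text=True
    )


def show_thread_setting(num_threads):
    """Echo the variable from a child process to confirm it is passed on."""
    probe = [sys.executable, "-c",
             "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"]
    return run_with_threads(probe, num_threads).stdout.strip()
```

For instance, `run_with_threads(["make"], 1, cwd="lapack-netlib")` would correspond to the single-threaded run, assuming the current directory is an OpenBLAS checkout that has already been built with 'make TARGET=ARMV7'.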
@Jagmn
Author

Jagmn commented Jun 15, 2015

I should have mentioned: there are no numerical errors (or any other kind of error) on any of my x86_64 targets, regardless of the number of threads.

@martin-frbg
Collaborator

Is this with the 0.2.14 release or the current git "develop" branch? (If the former, could you retest with the latter, just in case?)

@Jagmn
Author

Jagmn commented Jun 15, 2015

Both, I'm afraid. Though the 0.2.14 release deadlocks occasionally during the test (or appears to).

@xianyi
Collaborator

xianyi commented Jun 23, 2015

@wernsaar,

I just ran the LAPACK tests on our lab's ARM Cortex-A15 board.

The result of 4 threads (without USE_OPENMP)

            -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                441944      194 (0.044%)    2   (0.000%)    
DOUBLE PRECISION    559990      155 (0.028%)    1   (0.000%)    
COMPLEX             487187      80  (0.016%)    0   (0.000%)    
COMPLEX16           491663      56  (0.011%)    0   (0.000%)    

--> ALL PRECISIONS  1980784     485 (0.024%)    3   (0.000%)

The result of 1 thread

            -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                1109189     0   (0.000%)    0   (0.000%)    
DOUBLE PRECISION    1110527     0   (0.000%)    0   (0.000%)    
COMPLEX             581782      0   (0.000%)    0   (0.000%)    
COMPLEX16           582594      0   (0.000%)    0   (0.000%)    

--> ALL PRECISIONS  3384092     0   (0.000%)    0   (0.000%)

The result of 4 OpenMP threads (USE_OPENMP=1)

            -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                1109189     0   (0.000%)    0   (0.000%)    
DOUBLE PRECISION    1110527     0   (0.000%)    0   (0.000%)    
COMPLEX             561202      2   (0.000%)    2   (0.000%)    
COMPLEX16           582594      0   (0.000%)    0   (0.000%)    

--> ALL PRECISIONS  3363512     2   (0.000%)    2   (0.000%)
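
Besides the error counts, the three summaries also differ in the "nb test run" totals (1980784, 3384092 and 3363512), which possibly means that not every test program ran to completion in the multi-threaded runs. A minimal sketch for tabulating such summaries, assuming the column layout shown above; `parse_summary` and `numerical_error_rate` are hypothetical helpers and not part of lapack_testing.py.

```python
import re

# One data row: label, tests run, numerical errors (pct), other errors (pct).
# The optional "-->" prefix covers the "ALL PRECISIONS" total line.
ROW = re.compile(
    r"^(?:-->\s*)?(?P<name>.+?)\s+(?P<run>\d+)"
    r"\s+(?P<num>\d+)\s+\([\d.]+%\)"
    r"\s+(?P<other>\d+)\s+\([\d.]+%\)$"
)


def parse_summary(text):
    """Map each row label to (tests run, numerical errors, other errors)."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and separator lines do not match and are skipped
            rows[match["name"]] = (
                int(match["run"]), int(match["num"]), int(match["other"])
            )
    return rows


def numerical_error_rate(run, errors):
    """Numerical errors as a percentage of the tests that were run."""
    return 100.0 * errors / run if run else 0.0


# The 4-thread (no USE_OPENMP) summary quoted above.
SAMPLE = """\
REAL                441944      194 (0.044%)    2   (0.000%)
DOUBLE PRECISION    559990      155 (0.028%)    1   (0.000%)
COMPLEX             487187      80  (0.016%)    0   (0.000%)
COMPLEX16           491663      56  (0.011%)    0   (0.000%)

--> ALL PRECISIONS  1980784     485 (0.024%)    3   (0.000%)
"""

if __name__ == "__main__":
    for name, (run, num, other) in parse_summary(SAMPLE).items():
        print(f"{name:18s} {run:8d} {numerical_error_rate(run, num):.3f}%")
```

Parsing each of the three summaries this way makes it easy to compare the per-precision test counts side by side rather than only the error columns.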

@wernsaar
Contributor

Yesterday I also ran some tests on the Cortex-A15 machine with the latest source from GitHub and got 0 errors, with and without OpenMP. Before you run the LAPACK tests, you have to increase the stack size limit (ulimit -s 16384).
After you build OpenBLAS, wait a few minutes before you run the tests. If you see errors, repeat a test after a few minutes in the lapack-netlib directory, for example by running
./lapack_testing.py -p s -r
for single precision.
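
The procedure above (raise the stack limit, then rerun one precision) could be scripted roughly as follows. This is a sketch under assumptions: it uses Python's standard `resource` module on a Unix-like system, it raises only the soft limit (whereas `ulimit -s 16384` in a shell typically sets both limits), and `raise_stack_limit`, `rerun_command` and `rerun` are hypothetical helper names. The command line itself is the one quoted above.

```python
import resource
import subprocess


def raise_stack_limit(kib=16384):
    """Raise the soft stack limit to at least `kib` KiB; never lower it.

    Returns the soft limit in bytes that is in effect afterwards
    (resource.RLIM_INFINITY if it was already unlimited).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    wanted = kib * 1024
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)  # the soft limit cannot exceed the hard one
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already large enough
    resource.setrlimit(resource.RLIMIT_STACK, (wanted, hard))
    return wanted


def rerun_command(precision):
    """Build the rerun command quoted above for one precision, e.g. 's'."""
    return ["./lapack_testing.py", "-p", precision, "-r"]


def rerun(precision, cwd="lapack-netlib"):
    """Raise the limit, then run the test script; children inherit the limit."""
    raise_stack_limit()
    return subprocess.run(rerun_command(precision), cwd=cwd)
```

Calling `rerun("s")` from the top of a built OpenBLAS tree would then repeat the single-precision tests with the larger stack; the default `cwd` is an assumption about the directory layout.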

@xianyi
Collaborator

xianyi commented Jun 27, 2015

@wernsaar, thank you for the investigation.

@Jagmn
Author

Jagmn commented Jul 2, 2015

Hi @wernsaar,

I've rebuilt and re-run with the latest commit (3f1b576) with ulimit -s unlimited set, and I'm getting the same results as before. This is on a 4x Cortex-A7 (the A15's little brother) on a Raspberry Pi.

I've also run a test on an ARMv8-A target (ARM's Juno Development Platform) and can report that OpenBLAS now compiles and runs. Only the OpenMP build runs correctly in multi-threaded mode; the pthreads build still produces numerical errors (this is with ulimit -s unlimited).

notaz added a commit to notaz/OpenBLAS that referenced this issue Aug 16, 2015
@notaz notaz mentioned this issue Aug 16, 2015
@buffer51
Contributor

Hi,

Is there an update on this? Did anyone successfully test lapack-netlib on ARMv7?

@xianyi
Collaborator

xianyi commented Nov 11, 2015

@buffer51, I just compiled the develop branch on ARM Cortex-A15 Linux.

            -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error     other error  
================    =========== =================   ================  
REAL                1109189     0   (0.000%)    0   (0.000%)    
DOUBLE PRECISION    1110527     0   (0.000%)    0   (0.000%)    
COMPLEX             581782      0   (0.000%)    0   (0.000%)    
COMPLEX16           582594      0   (0.000%)    0   (0.000%)    

--> ALL PRECISIONS  3384092     0   (0.000%)    0   (0.000%)

@buffer51
Contributor

Hm.. So this is only an issue on Android (or maybe just my device..)
