dgesv produces incorrect answer when compiling the library using CMake with USE_OPENMP=1 #2396

heilokchow · 2020-02-09T08:08:25Z

When I use OpenBLAS to do solve some high dimensional linear equations using LAPACKE_dgesv, I use CMake to compile the OpenBLAS library libopenblas.a with -DUSE_OPENMP=1 since my program itself uses OpenMP for parallelism. My CMake build command is (unix system):

cmake -DUSE_OPENMP=1 <openblas directory with CMakeLists.txt>
cmake --build <current binary directory>

The computing process doesn't generate any warning or bugs. First, I get vector b by b = A*x, then, I solve x by using dgesv function with parameter A and b. If I compare the original x and solved x, when dimension is low, it works fine with no error. However, when dimension is high, lets say n = 100, the solved x is inconsistent with original x and the error is very large. If I use make USE_OPENMP=1 to recompile libopenblas.a using original (not CMake generated) Makefile and link my program with this library, the result of dgesv function is always correct. My testing code is shown below:

#include <stdio.h>
#include <cblas.h>
#include <lapacke.h>
#include <random>
#include <omp.h>

int main ()
{
    int n = 1000;
    int nrep = 20;
    double sum_error = 0.0;

    for (int k = 0; k < nrep; k++){
        double* A = new double[n*n];
        double* x = new double[n];
        double* b = new double[n];
        int* ipiv = new int[n];
        double sum = 0;

        std::random_device device;
        std::mt19937 generator(device());
        std::normal_distribution<double> normal(0.0, 1.0);

        for (int i = 0; i < n*n; i++)
            A[i] = normal(generator);

        for (int i = 0; i < n; i++)
            x[i] = normal(generator);

        cblas_dgemv(CblasRowMajor, CblasNoTrans, n, n, 1.0, A, n, x, 1, 0.0, b, 1);
        LAPACKE_dgesv(LAPACK_ROW_MAJOR, n, 1, A, n, ipiv, b, 1);

        for (int i = 0; i < n; i++)
            sum += abs(b[i] - x[i]);
        
        sum_error += sum;

        delete[] A;
        delete[] x;
        delete[] b;
        delete[] ipiv;
    }

    printf("Error: %f\n", sum_error);

    return 0;
}

The text was updated successfully, but these errors were encountered:

brada4 · 2020-02-09T20:37:47Z

What processor? Which OpenBLAS version?
Test case does not break with cores up to Sandybridge, omp or pthread, 0.3.7
EDIT: compiled with make, not cmake

heilokchow · 2020-02-10T03:43:50Z

Both my zen and kabylake processors will have the same issue. I use OpenBLAS version 0.3.7 for testing. Yep, directly compile with make USE_OPENMP=1 won't generate such problem. So is it caused by the experimental support of cmake?

brada4 · 2020-02-10T05:26:01Z

Probably this report will keep experimental label for another release or two.
Could you post at least cmake and make (or other backend used) versions, compiler versions etc.
It seems similar if not same as #2395 , maybe compare findings over there.
At least you have workaround with make

heilokchow · 2020-02-10T10:05:33Z

my cmake version is 3.16, gnu make version is 4.1 and gcc version is 7.3.0. I test the code on Ubuntu 18.04 and 19.04. I will try cmake on other platforms to see whether similar problem will occur.

brada4 · 2020-02-10T11:19:34Z

If you say make works, it could be worth testing versions with radically outdated/updated cmake makefile generator. The "old way" sort of works very consistently for decade or so.

heilokchow · 2020-02-11T04:17:44Z

Update: I test that if I perform a normal compiling with cmake on unix system using pthread instead of omp, The dgesv function's result is also incorrect (when n is large). I use cmake 2.8.12 and cmake 3.16. Both of them will produce same error. Here is the building logs for cmake and standard make.

for cmake (under a build directory):

cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_VERBOSE_MAKEFILE=ON ../../OpenBLAS-0.3.7 >> cmake_log.txt 2>&1
make >> cmake_log.txt 2>&1

cmake_log.txt

for make (under /OpenBLAS-0.3.7 directory)

make >> make_log.txt 2>&1

make_log.txt

brada4 · 2020-02-11T04:36:14Z

How big are exact parameters? http://www.netlib.org/lapack/lapack-3.1.1/html/dgesv.f.html
The issue could be insufficiently robust integer arithmetic leading to pointer problems.

heilokchow · 2020-02-11T04:55:16Z

On my device, when n=100, the sum_error in the test code above is quite large (around 1000).

brada4 · 2020-02-11T06:54:03Z

Does it correct itself id you make N divisible by 16? i.e if some asm kernel misfires down the hood?

martin-frbg · 2020-02-11T07:02:59Z

@brada4 stop please - we know already that it does work correctly when built with gmake, so the error must be somewhere in the cmake rules rather than the actual code.
(As OpenBLAS reimplements gesv with getrf/getrs, it is probably in the generator rules for the TRSM kernels. Yesterday's fix for TRMM does not appear to have solved it)

marxin · 2020-02-14T11:25:16Z

@heilokchow
Can you please describe how you compile and link the test from top-level OpenBLAS folder?
The exact command line would be preferred.

martin-frbg · 2020-02-14T11:53:04Z

@marxin the problem is completely reproducible (with or without USE_OPENMP) with just plain cmake .. (which is why I said "stop" above). So far I have traced it to the OpenBLAS reimplementation of getf2 in lapack/getf2/getf2_k.c (which uses iamax, gemv_n, swap and scal kernels) but have not yet narrowed it down further.

marxin · 2020-02-14T15:30:39Z

@marxin the problem is completely reproducible (with or without USE_OPENMP) with just plain cmake .. (which is why I said "stop" above). So far I have traced it to the OpenBLAS reimplementation of getf2 in lapack/getf2/getf2_k.c (which uses iamax, gemv_n, swap and scal kernels) but have not yet narrowed it down further.

All right! What we can do is to compare assemblies and see where's a difference.
Can you please provide steps on how to link to the code snippet?

marxin · 2020-02-14T16:46:06Z

Can you please provide steps on how to link to the code snippet?

I was missing some system header files. Now I can link it against a statically build library.
The binary has only about 100K, so one should be able to see a difference in the assembly.

martin-frbg · 2020-02-15T17:34:17Z

At least part of the problem seems to be getrf_single getting built without the -DUNIT. Trying a fix for lapack/CMakeLists.txt now.

martin-frbg added this to the 0.3.9 milestone Feb 10, 2020

martin-frbg mentioned this issue Feb 14, 2020

Fix iamax sse implementation and add utests #2414

Merged

martin-frbg mentioned this issue Feb 15, 2020

Correct generation of GETRF files by the CMAKE build #2420

Merged

martin-frbg closed this as completed in #2420 Feb 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dgesv produces incorrect answer when compiling the library using CMake with USE_OPENMP=1 #2396

dgesv produces incorrect answer when compiling the library using CMake with USE_OPENMP=1 #2396

heilokchow commented Feb 9, 2020

brada4 commented Feb 9, 2020 •

edited

Loading

heilokchow commented Feb 10, 2020

brada4 commented Feb 10, 2020

heilokchow commented Feb 10, 2020

brada4 commented Feb 10, 2020

heilokchow commented Feb 11, 2020

brada4 commented Feb 11, 2020 •

edited

Loading

heilokchow commented Feb 11, 2020

brada4 commented Feb 11, 2020

martin-frbg commented Feb 11, 2020

marxin commented Feb 14, 2020

martin-frbg commented Feb 14, 2020

marxin commented Feb 14, 2020

marxin commented Feb 14, 2020

martin-frbg commented Feb 15, 2020

dgesv produces incorrect answer when compiling the library using CMake with USE_OPENMP=1 #2396

dgesv produces incorrect answer when compiling the library using CMake with USE_OPENMP=1 #2396

Comments

heilokchow commented Feb 9, 2020

brada4 commented Feb 9, 2020 • edited Loading

heilokchow commented Feb 10, 2020

brada4 commented Feb 10, 2020

heilokchow commented Feb 10, 2020

brada4 commented Feb 10, 2020

heilokchow commented Feb 11, 2020

brada4 commented Feb 11, 2020 • edited Loading

heilokchow commented Feb 11, 2020

brada4 commented Feb 11, 2020

martin-frbg commented Feb 11, 2020

marxin commented Feb 14, 2020

martin-frbg commented Feb 14, 2020

marxin commented Feb 14, 2020

marxin commented Feb 14, 2020

martin-frbg commented Feb 15, 2020

brada4 commented Feb 9, 2020 •

edited

Loading

brada4 commented Feb 11, 2020 •

edited

Loading