There is a bug in all laswp functions I found in the following files:
lapack/laswp/generic/laswp*.c
lapack/laswp/generic/zlaswp*.c
kernel/generic/laswp_ncopy*.c
kernel/generic/zlaswp_ncopy*.c
The bug is related to the ipiv array. The functions read ahead in
this array, which results in reads beyond the array bounds.
This is incorrect code, but it causes problems only in the rare cases
where reading the two positions after the end of the ipiv array is not
permitted. The following example (Linux implementation) forces this
situation by aligning the array to the page size and protecting the page
after the array. The example causes a segfault.
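#include <sys/mman.h>
#include <malloc.h>
#include <unistd.h>

void dgetrf_(int *m, int *n, double *a, int *lda, int *ipiv, int *info);

/* Allocate `size` bytes so that the buffer ends exactly at the start of
   an inaccessible page. Any read past the end of the buffer faults. */
void* my_malloc(size_t size)
{
    const size_t page_size = 4096;
    const size_t num_pages = (size-1)/page_size + 1;
    /* One spare page in front of the data and one guard page behind it. */
    const size_t total_size = (num_pages + 2) * page_size;
    char *p = (char *)memalign(page_size, total_size);
    if (p == NULL) return NULL;
    /* Make the last page inaccessible. */
    mprotect(p + (num_pages+1)*page_size, page_size, PROT_NONE);
    /* Place the buffer so that its last byte is just before the guard page. */
    return p + page_size + num_pages*page_size - size;
}

int main(void)
{
    int i, j, info;
    int n = 20;
    double *A = (double *)my_malloc(n*n * sizeof(double));
    int *ipiv = (int *)my_malloc(n * sizeof(int));
    if (A == NULL || ipiv == NULL) return 1;

    /* Fill A with the identity matrix (column-major). */
    for (j = 0; j < n; j++) {
        for (i = 0; i < n; i++) A[i + n*j] = (i == j) ? 1 : 0;
    }

    /* The LU factorization calls laswp, which reads past the end of ipiv. */
    dgetrf_(&n, &n, A, &n, ipiv, &info);
    return 0;
}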
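The read-ahead issue described above can be illustrated on a single vector. The following is only a minimal sketch, not the OpenBLAS kernel: the function name swap_with_guarded_readahead is hypothetical, and it assumes 1-based pivot indices as in LAPACK. It loads the next pivot one step ahead, but only while that entry is still inside the array. Dropping the bounds check would make the last iteration read ipiv[n], which is the kind of out-of-bounds access the guard page exposes.

```c
/* Hypothetical sketch, not the OpenBLAS implementation.
   Applies the row interchanges ipiv[0..n-1] (1-based, as in LAPACK)
   to a vector x, loading the next pivot ahead of time. */
void swap_with_guarded_readahead(double *x, const int *ipiv, int n)
{
    if (n <= 0) return;

    int next = ipiv[0];              /* first pivot is always in bounds */
    for (int i = 0; i < n; i++) {
        int p = next - 1;            /* pivot for this step, 0-based */

        /* Read ahead only if the next entry exists. Without this check
           the final iteration would read one element past the array. */
        if (i + 1 < n) next = ipiv[i + 1];

        double t = x[i];
        x[i] = x[p];
        x[p] = t;
    }
}
```

A loop that reads two entries ahead would need the analogous check on i + 2.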
I fixed it in my fork of OpenBLAS as follows:
- the lapack/laswp/generic/laswp__.c functions are replaced;
  the code is essentially taken from the LAPACK reference implementation.
- the lapack/getrf/getrf__.c functions use the alternative
  formulation, which does not use the LASWP_NCOPY functions.
Note that the kernel/generic/laswp_ncopy*.c related functions are still wrong
but no longer used.
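For orientation, a row-interchange routine in the straightforward style of the LAPACK reference implementation can be sketched as follows. This is an assumption-laden illustration, not the code from the fork: the name laswp_simple is hypothetical, only the forward case with unit pivot stride is covered, and the matrix is assumed to be column-major with 1-based pivots. It never reads ipiv outside the range k1..k2.

```c
/* Hypothetical sketch in the style of reference LAPACK's xLASWP,
   restricted to forward interchanges with unit pivot stride.
   a    : column-major matrix with leading dimension lda
   n    : number of columns
   k1,k2: first and last row (1-based) for which to apply interchanges
   ipiv : 1-based pivot indices; only ipiv[k1-1 .. k2-1] is read */
void laswp_simple(int n, double *a, int lda, int k1, int k2, const int *ipiv)
{
    for (int i = k1; i <= k2; i++) {
        int ip = ipiv[i - 1];        /* row to exchange with row i */
        if (ip == i) continue;       /* nothing to do */

        /* Swap rows i and ip across all n columns. */
        for (int j = 0; j < n; j++) {
            double *col = a + (long)j * lda;
            double t = col[i - 1];
            col[i - 1] = col[ip - 1];
            col[ip - 1] = t;
        }
    }
}
```

This form trades the performance of an unrolled, read-ahead kernel for the guarantee that every pivot access stays in bounds.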
However, I don't want to replace the code with the LAPACK reference implementation. I think Goto's laswp achieves better performance. I am going to fix this bug next week. Could you help me test it?
Hi Xianyi,
I ran a sequence of getrf+getrs tests using the current OpenBLAS library.
All results are fine in my tests and no segfault occurred, so
the implementation seems to be correct now.