Skip to content

Bug in laswp and laswp_ncopy functions #130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dennis12 opened this issue Jul 26, 2012 · 3 comments
Closed

Bug in laswp and laswp_ncopy functions #130

dennis12 opened this issue Jul 26, 2012 · 3 comments
Assignees
Labels
Milestone

Comments

@dennis12
Copy link

There is a bug in all laswp functions I found in the files

  • lapack/laswp/generic/laswp*.c
  • lapack/laswp/generic/zlaswp*.c
  • kernel/generic/laswp_ncopy*.c
  • kernel/generic/zlaswp_ncopy*.c

The bug is related to the ipiv array. The functions do a read ahead
to this array which results in a read beyond the array bounds.
This is in fact wrong code but leads to problems only in rare cases
when it is not allowed to read the two positions after the end of the
ipiv array. The following example (Linux implementation) enforces this
behavior by aligning the array to page size and protect the page after
the array. The example causes a segfault.

#include <sys/mman.h>
#include <malloc.h>
#include <unistd.h>


void dgetrf_(int *m, int *n, double *a, int *lda, int *ipiv, int *info);


void* my_malloc(size_t size)
{
  const size_t page_size = 4096;
  const size_t num_pages = (size-1)/page_size + 1;
  const size_t total_size = (num_pages + 2) * page_size;
  char *p = (char *)memalign(page_size, total_size);
  if (p == NULL) return 0; 
  mprotect(p + (num_pages+1)*page_size, page_size, PROT_NONE);  
  return p + page_size + num_pages*page_size-size;
}


int main()
{
  int i, j, info;
  int n = 20;

  double *A = (double *)my_malloc(n*n * sizeof(double));
  int *ipiv = (int *)my_malloc(n * sizeof(int));

  for(j=0; j<n; j++){
     for(i=0; i<n; i++) A[i + n*j] = (i==j)? 1 : 0;
  }

  dgetrf_(&n, &n, A, &n, ipiv, &info);

  return 0;
}

I fixed it in my fork of OpenBLAS by

  1. the lapack/laswp/generic/laswp__.c functions are replaced.
    Code is essentially taken from the LAPACK reference implementation.
  2. the lapack/getrf/getrf__.c functions use the alternative
    formulation which does not use the LASWP_NCOPY functions

note that the kernel/generic/laswp_ncopy*.c related functions are still wrong
but not longer used.

@ghost ghost assigned xianyi Jul 28, 2012
@xianyi
Copy link
Collaborator

xianyi commented Jul 28, 2012

Hi @dennis12 ,

Thank you for your report.

However, I don't want to replace the code with LAPACK reference implementation. I think Goto's laswp achieve the better performance. I am going to fix this bug in next week. Could you help me testing it?

Thanks

Xianyi

xianyi added a commit that referenced this issue Aug 9, 2012
@xianyi
Copy link
Collaborator

xianyi commented Aug 9, 2012

Hi @dennis12 ,

I think I fixed this bug. Could you verify it?

Thanks

Xianyi

@dennis12
Copy link
Author

dennis12 commented Aug 9, 2012

On 08/09/2012 02:08 PM, Xianyi Zhang wrote:

Hi @dennis12 https://github.com/dennis12 ,

I think I fixed this bug. Could you verify it?

Thanks

Xianyi


Reply to this email directly or view it on GitHub
#130 (comment).

Hi Xianyi,

I did a sequence of getrf+getrs tests using the current OpenBLAS-lib.
All results are fine in my tests and no segfault occured, so
the implementation seems to be correct now.

Best regards,
Dennis

@xianyi xianyi closed this as completed Aug 10, 2012
kseniyazaytseva pushed a commit to kseniyazaytseva/OpenBLAS that referenced this issue Dec 18, 2023
…y, iamin

Merge in PL/openblas from dev/k.zaytseva/LM-656 to dev-riscv
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants