Program is Terminated. Because you tried to allocate too many memory regions. #539

Closed
danpovey opened this issue Apr 12, 2015 · 22 comments

@danpovey

Hi,
I would appreciate it if you could explain why OpenBLAS crashes if you build it with OPENBLAS_NUM_THREADS=2 and then try to call it from multi-threaded code. Is this by design? Or maybe due to the limitations of the machine we're running it on?
Dan

/home/ubuntu/workspace/kaldi/src/online2bin/online2-wav-nnet2-latgen-faster --online=true --do-endpointing=false --config=exp/nnet1/conf/online_nnet2_decoding.conf --max-active=3000 --beam=8.0 --lattice-beam=4.0 --acoustic-scale=0.07 --word-symbol-table=exp/nnet1/graph/words.txt exp/nnet1/final.mdl exp/nnet1/graph/HCLG.fst ark:data/test/split1/1/spk2utt 'ark,s,cs:wav-copy scp,p:data/test/split1/1/wav.scp ark:- |' ark:/dev/null
LOG (online2-wav-nnet2-latgen-faster:ComputeDerivedVars():ivector-extractor.cc:180) Computing derived variables for iVector extractor
[New Thread 0x7fffedfa1700 (LWP 26871)]
[New Thread 0x7fffed7a0700 (LWP 26872)]
[New Thread 0x7fffecf9f700 (LWP 26873)]
[Thread 0x7fffedfa1700 (LWP 26871) exited]
[New Thread 0x7fffdffff700 (LWP 26874)]
[Thread 0x7fffed7a0700 (LWP 26872) exited]
[New Thread 0x7fffdf7fe700 (LWP 26875)]
[Thread 0x7fffecf9f700 (LWP 26873) exited]
[New Thread 0x7fffed7a0700 (LWP 26876)]
[New Thread 0x7fffedfa1700 (LWP 26877)]
[New Thread 0x7fffecf9f700 (LWP 26878)]
[New Thread 0x7fffdeffd700 (LWP 26879)]
[Thread 0x7fffdffff700 (LWP 26874) exited]
[Thread 0x7fffdf7fe700 (LWP 26875) exited]
[New Thread 0x7fffde7fc700 (LWP 26880)]
[New Thread 0x7fffdffff700 (LWP 26881)]
[Thread 0x7fffed7a0700 (LWP 26876) exited]
[New Thread 0x7fffddffb700 (LWP 26882)]
[New Thread 0x7fffdf7fe700 (LWP 26883)]
[Thread 0x7fffedfa1700 (LWP 26877) exited]
[New Thread 0x7fffdd7fa700 (LWP 26884)]
[New Thread 0x7fffed7a0700 (LWP 26885)]
BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffed7a0700 (LWP 26885)]
0x00007ffff259f024 in dcopy_k () from /home/ubuntu/workspace/kaldi/tools/OpenBLAS/install/lib/libopenblas.so.0
(gdb) bt
#0 0x00007ffff259f024 in dcopy_k () from /home/ubuntu/workspace/kaldi/tools/OpenBLAS/install/lib/libopenblas.so.0
#1 0x00007ffff238cd1e in dspmv_U () from /home/ubuntu/workspace/kaldi/tools/OpenBLAS/install/lib/libopenblas.so.0
#2 0x00007ffff2355653 in cblas_dspmv () from /home/ubuntu/workspace/kaldi/tools/OpenBLAS/install/lib/libopenblas.so.0
#3 0x00007ffff53e15f9 in kaldi::cblas_Xspmv (dim=40, alpha=1, Mdata=0x16fb440, ydata=0x7ae370, ystride=100, beta=0, xdata=0x7fffc400a6a0, xstride=1)
    at ../matrix/cblas-wrappers.h:133
#4 0x00007ffff53e972e in kaldi::SpMatrix<double>::AddMat2Sp (this=0x7fffed79fca0, alpha=1, M=..., transM=kaldi::kTrans, A=..., beta=0) at sp-matrix.cc:1042
#5 0x00007ffff7895e20 in kaldi::IvectorExtractor::ComputeDerivedVars (this=0x7fffffffdcf8, i=14) at ivector-extractor.cc:208

@jeromerobert
Contributor

Did you build OpenBLAS with -frecursive? dspmv is part of LAPACK and, with gcc, LAPACK is thread-safe only if built with -frecursive or -fopenmp. See https://github.com/xianyi/OpenBLAS/blob/develop/Makefile.rule#L152.

@danpovey
Author

No, we didn't.
Is there a command-line argument to "make" that I can add in order to use this?
Dan

@jeromerobert
Contributor

Yes.

FCOMMON_OPT=-frecursive

@danpovey
Author

That doesn't seem to work right: with FCOMMON_OPT set on the command line, the options -Wall -m64 -fPIC no longer seem to get added to it, and I later get a linking error saying to add -fPIC.
Uncommenting the line in Makefile.rule seems to be the best way to do this.

Dan

@xianyi
Collaborator

xianyi commented Apr 13, 2015

@danpovey

In OpenBLAS, we manage a pool of memory buffers, and the number of buffers is defined as follows:
#define NUM_BUFFERS (MAX_CPU_NUMBER * 2)
In your case, the program exceeded that number of buffers.

Could you build OpenBLAS with a larger NUM_THREADS? For example, make NUM_THREADS=32.
In Makefile.system, we set MAX_CPU_NUMBER=NUM_THREADS.
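
The sizing rule above can be illustrated with a toy model. This is only a sketch, not OpenBLAS code: the class `BufferPool` and the function `max_concurrent_callers` are hypothetical names, and the model assumes each concurrent BLAS call holds exactly one buffer for its duration, which may not match the real allocator.

```python
import threading


class BufferPool:
    """Toy stand-in for a fixed-size pool of scratch buffers (not OpenBLAS code)."""

    def __init__(self, num_threads):
        # Mirrors the sizing rule quoted above: NUM_BUFFERS = MAX_CPU_NUMBER * 2,
        # with MAX_CPU_NUMBER taken from the build-time NUM_THREADS.
        self.num_buffers = num_threads * 2
        self._in_use = [False] * self.num_buffers
        self._lock = threading.Lock()

    def acquire(self):
        """Claim the first free slot, or fail if every slot is already taken."""
        with self._lock:
            for slot, busy in enumerate(self._in_use):
                if not busy:
                    self._in_use[slot] = True
                    return slot
        raise RuntimeError("too many memory regions")

    def release(self, slot):
        """Return a slot to the pool so another caller can claim it."""
        with self._lock:
            self._in_use[slot] = False


def max_concurrent_callers(num_threads):
    """Count how many callers can hold a buffer at once before the pool runs out."""
    pool = BufferPool(num_threads)
    held = 0
    try:
        while True:
            pool.acquire()
            held += 1
    except RuntimeError:
        return held
```

Under this assumption, a build with NUM_THREADS=2 leaves room for four simultaneous holders, so an application that issues BLAS calls from more threads than that at once would exhaust the pool, while NUM_THREADS=32 raises the limit to 64.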

@danpovey
Author

OK, thanks.
Dan

@xianyi
Collaborator

xianyi commented Apr 17, 2015

@jakirkham
Contributor

Could we maybe just have -frecursive added to the default Fortran options? That seems to work fine for me.

@jakirkham
Contributor

I should add that I see the same behavior @danpovey notes: setting FCOMMON_OPT results in dropped arguments. However, if I use FFLAGS, the option gets set without overriding or dropping the other options.

@VictorRodriguez

Hi

I am facing the same problem, even though these are my make flags for OpenBLAS:

make TARGET=HASWELL F_COMPILER=GFORTRAN SHARED=1 DYNAMIC_THREADS=1 USE_OPENMP=1 NUM_THREADS=128 %{?_smp_mflags}

I am using this spec file:
https://github.com/clearlinux-pkgs/openblas/blob/master/openblas.spec

The test that I'm running is:
python scipy/interpolate/tests/test_interpnd.py
BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
BLAS : Program is Terminated. Because you tried to allocate too many memory regions
Segmentation fault (core dumped)

I have also tried USE_OPENMP=0 USE_THREAD=1 NUM_THREADS=128, with the same result.

Thanks a lot

@martin-frbg
Collaborator

What kind of hardware are you trying to run this on? You could try changing the calculation of NUM_BUFFERS in common.h (currently NUM_THREADS * 2) to see whether it is as simple as that, or whether the 128/256 is some magic number imposed by a bitmap somewhere deep down in the code.

@martin-frbg
Collaborator

@VictorRodriguez Also, which version are you using: the current "develop" branch from git, or some older release? Some thread-safety fixes for the traversal of the buffers list went in around the new year.

@martin-frbg
Collaborator

Also see if adding "-frecursive" to the Fortran compiler options (by uncommenting the FCOMMON_OPT line in Makefile.rule) helps, as suggested above.

@VictorRodriguez

@martin-frbg thanks for the feedback. I'm using version 0.2.19. Let me try what you suggest, thanks a lot.

@jakirkham
Contributor

FYI @VictorRodriguez, I had better luck with FFLAGS; otherwise other arguments seem to get dropped. Your experience might not be the same, though.

@martin-frbg
Collaborator

With the current code, I think a user-defined FCOMMON_OPT should only gain additional settings in Makefile.system. (In the past, you might have run into a situation where the default "-O2" optimization level was applied to the Fortran part only if FCOMMON_OPT was previously undefined, but this was fixed in early January as well, i.e. post 0.2.19.)

@jakirkham
Contributor

What about -fPIC or -m64?

@martin-frbg
Collaborator

I think these are only appended to whatever FCOMMON_OPT is seen by Makefile.system (FCOMMON_OPT += -m64 etc) - may need revisiting to make sure that it actually does what it appears to be doing though.

@VictorRodriguez

VictorRodriguez commented Apr 7, 2017

Hi team

Thanks a lot for the help; however, I haven't been able to fix this issue. I have made these changes to my spec file:

diff --git a/openblas.spec b/openblas.spec
index 93850f8..e6cbae7 100644
--- a/openblas.spec
+++ b/openblas.spec
@@ -45,7 +45,7 @@ OpenBLAS is an optimized linear algebra library.
 export AR=gcc-ar
 export RANLIB=gcc-ranlib
 export CFLAGS="$CFLAGS -flto -fno-semantic-interposition -O3 -fPIC"
-export FFLAGS="$CFLAGS -flto -fno-semantic-interposition -O3 -fno-f2c -frecursive -fPIC"
+export FFLAGS="-O3 -frecursive -fPIC"
 export CXXFLAGS="$CXXFLAGS -flto -fno-semantic-interposition -O3 -fPIC"

 sed -i -e "s/-O2/-O3/g" Makefile*
@@ -58,7 +58,7 @@ pushd ..
 make TARGET=SANDYBRIDGE F_COMPILER=GFORTRAN SHARED=1 DYNAMIC_THREADS=1 NUM_THREADS=256 %{?_smp_mflags}
 popd
 export CFLAGS="$CFLAGS -march=haswell "
-export FFLAGS="$FFLAGS -march=haswell -O3 "
+export FFLAGS="-O3 -frecursive -fPIC"
 pushd openblas-avx2
 make TARGET=HASWELL F_COMPILER=GFORTRAN SHARED=1 DYNAMIC_THREADS=1 USE_OPENMP=1 NUM_THREADS=128 %{?_smp_mflags}
 popd

Then I re-ran this code from scipy (I am thinking of rebuilding scipy):

https://github.com/scipy/scipy/blob/master/scipy/interpolate/tests/test_interpnd.py

Is there a test for this case in the OpenBLAS source code (in C, for example)?

Thanks a lot for the help

@martin-frbg
Collaborator

I am not aware of any specific test case in the source. My low-tech approach would be:

  1. check that your specially built OpenBLAS is actually what your scipy code is loading (it might access some older version somewhere in your library search path, perhaps the spec file you are using even assigns some different name to the final library file than what scipy expects...)
  2. try to find out how many threads the scipy code spawns
  3. increase the NUM_THREADS for your build beyond the 128/256 you have now

(By the way, a silly comment just in case: you seem to be editing an rpm-style spec file, but never mentioned using rpmbuild or similar to actually build the library with these changes?)
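
Steps 1 and 2 above could be checked from inside the Python process along the following lines. This is a minimal sketch that assumes a Linux system with /proc mounted; the helper names `openblas_paths`, `thread_count`, and `report` are hypothetical, and the parsing relies on the usual field layout of /proc/<pid>/maps and /proc/<pid>/status.

```python
import os
import re


def openblas_paths(maps_text):
    """Return the distinct mapped file paths whose file name mentions 'openblas'."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if "openblas" in os.path.basename(path).lower():
                paths.add(path)
    return sorted(paths)


def thread_count(status_text):
    """Extract the 'Threads:' value from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^Threads:\s*(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def report(pid="self"):
    """Read both /proc files for a process (Linux only) and summarize them."""
    with open(f"/proc/{pid}/maps") as f:
        libs = openblas_paths(f.read())
    with open(f"/proc/{pid}/status") as f:
        threads = thread_count(f.read())
    return libs, threads
```

Calling `report()` only makes sense after the scipy code has been imported and has started its work, since the library is not mapped before it is loaded. If the returned path is not the freshly built library, the NUM_THREADS change never took effect for that process.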

@jakirkham
Contributor

I think these are only appended to whatever FCOMMON_OPT is seen by Makefile.system (FCOMMON_OPT += -m64 etc) - may need revisiting to make sure that it actually does what it appears to be doing though.

Part of the reason I started using FFLAGS instead of FCOMMON_OPT is that using FCOMMON_OPT was resulting in other flags getting dropped in 0.2.19. Not sure what the current situation is in master though.

@icarus-sparry

It turns out that the issue @VictorRodriguez was having was due to someone cherry-picking 84b8170 into the source tree. This commit was effectively reverted in dd6212e - sorry for the noise.
