ARM64v8-TSV110: dgemm segfaults on large square matrices single thread #2538
Comments
Thanks - ultimately this is probably "just" another manifestation of the fundamental problem from #1698 (BUFFER_SIZE should be more easily configurable at compile time, if it cannot go away completely). The current values are mostly from the GotoBLAS of ten years ago, and must have been quite arbitrary even then.
Well, that escalated quickly... it seems the same problem affects at least the SkylakeX and ARMV7 targets.
Hmm. Either the formula is wrong or there is something fundamentally wrong with the buffer allocation on POWER8. I just do not see how BUFFER_SIZE can ever be large enough to accommodate the value derived in the absence of a predefined xGEMM_DEFAULT_R...
One thing that seems clear is that the calculation of the size requirement in the error message is going to overflow when xGEMM_DEFAULT_R is itself a nontrivial formula rather than a constant. (This appears to have been the case specifically for POWERx platforms ever since GotoBLAS2, but I have no idea why.)
Description:
Crash in dgemm_oncopy(). The sb parameter points outside the allocated buffer. The pointer is calculated in driver/level3/level3.c:350 as (sb + min_l * (jjs - js) * COMPSIZE * l1stride), where jjs is close to 4000 and min_l is 512. jjs is limited by GEMM_R, and here GEMM_R == DGEMM_DEFAULT_R == 4096.
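To make the size of that offset concrete, here is a minimal sketch of the arithmetic. The helper name sb_offset_bytes is hypothetical, and the values js = 0, COMPSIZE = 1, l1stride = 1 and an 8-byte element are assumptions for a real double-precision case; they are not taken from the OpenBLAS source.

```c
#include <stddef.h>

/* Hypothetical helper: byte offset added to sb by the expression
 * (sb + min_l * (jjs - js) * COMPSIZE * l1stride).
 * elem_size converts the element count into bytes. */
static size_t sb_offset_bytes(size_t min_l, size_t jjs, size_t js,
                              size_t compsize, size_t l1stride,
                              size_t elem_size) {
    return min_l * (jjs - js) * compsize * l1stride * elem_size;
}
```

With min_l = 512 and jjs = 4000 under the assumptions above, sb_offset_bytes(512, 4000, 0, 1, 1, 8) gives 16,384,000 bytes, roughly 15.6 MiB past the start of sb, so the buffer has to be at least that large beyond whatever is reserved for sa.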
To reproduce:
OS: CentOS Linux release 7.6.1810 (AltArch)
CPU: TSV110
Compiler: gcc 9.2.0
Commands:
make TARGET=TSV110 USE_OPENMP=yes
cd benchmarks; make
ulimit -s unlimited
OMP_NUM_THREADS=1 ./dgemm.goto 4000 10000 1000
Root cause:
BUFFER_SIZE is too small for P, Q and R.
Suggested fix
Add to common_arm64.h:
And in the beginning of memory.c add a buffer size guard like this:
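As an illustration only (this is not the patch from the report), a guard of this kind could look like the sketch below. All EXAMPLE_ names are hypothetical. Q and R echo the values quoted above; P, the buffer size, and the requirement formula (one packed P x Q panel plus one packed Q x R panel) are assumptions made for the sketch, not the check OpenBLAS actually uses.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the target's blocking parameters. */
#define EXAMPLE_GEMM_P 256             /* assumed */
#define EXAMPLE_GEMM_Q 512             /* matches min_l in the report */
#define EXAMPLE_GEMM_R 4096            /* matches DGEMM_DEFAULT_R in the report */
#define EXAMPLE_ELEM_SIZE 8            /* sizeof(double) */
#define EXAMPLE_BUFFER_SIZE (32 << 20) /* 32 MiB, assumed */

/* Assumed requirement: packed P x Q panel of A plus packed Q x R panel of B. */
#define EXAMPLE_REQUIRED_BYTES \
    ((EXAMPLE_GEMM_P * EXAMPLE_GEMM_Q + EXAMPLE_GEMM_Q * EXAMPLE_GEMM_R) \
     * EXAMPLE_ELEM_SIZE)

/* Compile-time guard: stop the build if the buffer cannot hold both panels. */
#if EXAMPLE_BUFFER_SIZE < EXAMPLE_REQUIRED_BYTES
#error "EXAMPLE_BUFFER_SIZE is too small for the configured P, Q and R"
#endif

/* Runtime form of the same check, for parameters not known at compile time. */
static size_t required_buffer_bytes(size_t p, size_t q, size_t r,
                                    size_t elem_size) {
    return (p * q + q * r) * elem_size;
}

static int buffer_fits(size_t buffer_size, size_t p, size_t q, size_t r,
                       size_t elem_size) {
    return buffer_size >= required_buffer_bytes(p, q, r, elem_size);
}
```

Under these assumptions the Q x R panel alone takes 512 * 4096 * 8 = 16 MiB, so any buffer of 16 MiB or less fails the check, while the assumed 32 MiB passes.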
Note
This issue most likely also affects the EMAG8180 target, and possibly others; the guard should flag those cases.