Improve performance of SGEMM on Arm Cortex-A53 #2643
Merged
From: 128, To: 2048, Step: 128, Transa=N, Transb=N

| Size (M = N = K) | Original (MFlops) | Optimized (MFlops) |
|---|---|---|
| 128 | 3116.12 | 4249.55 |
| 256 | 4531.32 | 5812.30 |
| 384 | 4790.65 | 5822.73 |
| 512 | 5086.61 | 6200.58 |
| 640 | 5147.45 | 6221.90 |
| 768 | 5302.84 | 6365.05 |
| 896 | 5294.63 | 6337.23 |
| 1024 | 5334.58 | 6412.72 |
| 1152 | 5380.47 | 6402.58 |
| 1280 | 5483.96 | 6469.20 |
| 1408 | 5441.53 | 6446.06 |
| 1536 | 5484.47 | 6497.01 |
| 1664 | 5473.48 | 6476.32 |
| 1792 | 5555.79 | 6520.28 |
| 1920 | 5527.43 | 6492.62 |
| 2048 | 5375.12 | 6396.85 |
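
For readers who want to interpret these figures, here is a minimal sketch of how a GEMM throughput number and the resulting speedup can be computed. The benchmark output above does not state its formula, so the conventional count of 2·M·N·K floating-point operations per GEMM call is an assumption here, and the helper names `gemm_mflops` and `speedup` are hypothetical, not part of the benchmark tool.

```python
def gemm_mflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in MFlops for one M x N x K GEMM.

    Assumes the conventional count of 2*M*N*K floating-point
    operations (one multiply and one add per inner-product term).
    """
    return 2.0 * m * n * k / seconds / 1e6


def speedup(original_mflops: float, optimized_mflops: float) -> float:
    """Ratio of optimized to original throughput."""
    return optimized_mflops / original_mflops


# A few (original, optimized) pairs copied from the results above.
results = {
    128: (3116.12, 4249.55),
    1024: (5334.58, 6412.72),
    2048: (5375.12, 6396.85),
}

for size, (orig, opt) in results.items():
    print(f"M=N=K={size}: {speedup(orig, opt):.2f}x")
```

By this measure the gain is largest at the smallest size and settles at roughly 1.2x for the larger matrices.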
Regards