diff --git a/src/libasr/ASR.asdl b/src/libasr/ASR.asdl
index 26e60e172..d6a29ecef 100644
--- a/src/libasr/ASR.asdl
+++ b/src/libasr/ASR.asdl
@@ -420,6 +420,7 @@ array_physical_type
= DescriptorArray
| PointerToDataArray
| FixedSizeArray
+ | SIMDArray
| NumPyArray
| ISODescriptorArray
We'll use Annotated:
from typing import Annotated
from lpython import f32, SIMD
x: Annotated[f32[64], SIMD]
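As a rough illustration of how a frontend could recognize this annotation, here is a minimal sketch in plain CPython. The `SIMD` and `f32` classes below are hypothetical stand-ins, not the real `lpython` exports, and `simd_request` is an invented helper name. Only `Annotated`, `get_origin` and `get_args` are actual `typing` APIs.

```python
from typing import Annotated, get_args, get_origin


class SIMD:
    """Hypothetical stand-in for the marker that lpython would export."""


class f32:
    """Hypothetical stand-in element type; f32[n] records the length n."""
    bits = 32
    size = None

    def __class_getitem__(cls, size):
        # Return a subclass that remembers the requested array length.
        return type(f"{cls.__name__}[{size}]", (cls,), {"size": size})


def simd_request(annotation):
    """Return (element bits, length) if the annotation asks for SIMD, else None."""
    if get_origin(annotation) is not Annotated:
        return None
    array_type, *metadata = get_args(annotation)
    if SIMD not in metadata:
        return None
    return array_type.bits, array_type.size


x_annotation = Annotated[f32[64], SIMD]
```

With these stand-ins, `simd_request(x_annotation)` yields the element width and length that a later stage would need in order to choose the `SIMDArray` physical type, while an annotation without the `SIMD` marker yields `None`.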
In ASR we use the SIMDArray physical type, and then in the LLVM backend (or an ASR->ASR pass) we ensure all such arrays get vectorized; otherwise we give a compile-time error message. The conditions are:
- Must be 1D array
- Array element type and all operations must map directly onto hardware instructions. For example, on an Apple M1 CPU, f32 addition and multiplication are supported (the code will compile), but f16 multiplication is not (a compile-time error on Apple M1).
- Fixed compile time size
- The size must be a multiple of the hardware vector length. If we have 512 bits (AVX-512), then sizeof(element) * size, measured in bits, must equal 512 * n for n = 1, 2, 3, .... If n = 1, the array is stored directly in a register. For n >= 2, the loop is unrolled (since the size is known at compile time), ensuring we hit maximum compute throughput (hiding I/O and latency).
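The size conditions above can be sketched as a small check. This is only an illustration under stated assumptions: the function name `simd_register_count` and its parameters are hypothetical, the "maps to hardware instructions" condition is reduced to a boolean flag supplied by the caller, and the real check would live in the LLVM backend or an ASR pass.

```python
def simd_register_count(ndim, elem_bits, size, vector_bits=512, ops_supported=True):
    """Return n such that elem_bits * size == vector_bits * n.

    Raises ValueError (standing in for a compile-time error) if any
    condition is violated. All names here are hypothetical.
    """
    if ndim != 1:
        raise ValueError("SIMD array must be 1D")
    if not ops_supported:
        raise ValueError("element type or operation has no hardware instruction")
    if not isinstance(size, int) or size <= 0:
        raise ValueError("SIMD array size must be a positive compile-time constant")
    # Total array width in bits must be a whole number of vector registers.
    n, remainder = divmod(elem_bits * size, vector_bits)
    if remainder != 0:
        raise ValueError(
            f"{elem_bits * size} bits is not a multiple of {vector_bits} bits"
        )
    return n
```

For example, under the AVX-512 assumption `f32[16]` fills exactly one register (n = 1), `f32[64]` spans four registers and would be handled by an unrolled loop, and `f32[24]` (768 bits) is rejected.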