This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Commit ba9c3c1

i32x4.dot_i16x8_s and i32x4.dot_i16x8_add_s instructions
1 parent 77e7fda commit ba9c3c1

3 files changed, +15 -0 lines changed

proposals/simd/BinarySIMD.md

Lines changed: 2 additions & 0 deletions
@@ -199,6 +199,8 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `v128.andnot` | `0xd8`| - |
 | `i8x16.avgr_u` | `0xd9`| |
 | `i16x8.avgr_u` | `0xda`| |
+| `i32x4.dot_i16x8_s` | `0xdb`| - |
+| `i32x4.dot_i16x8_add_s` | `0xdc`| - |
 | `i8x16.abs` | `0xe1`| - |
 | `i16x8.abs` | `0xe2`| - |
 | `i32x4.abs` | `0xe3`| - |

proposals/simd/ImplementationStatus.md

Lines changed: 2 additions & 0 deletions
@@ -126,6 +126,8 @@
 | `i32x4.max_s` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | |
 | `i32x4.max_u` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | |
 | `i32x4.abs` | | | | |
+| `i32x4.dot_i16x8_s` | | | | |
+| `i32x4.dot_i16x8_add_s` | | | | |
 | `i64x2.neg` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i64x2.shl` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i64x2.shr_s` | `-msimd128` | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

proposals/simd/SIMD.md

Lines changed: 11 additions & 0 deletions
@@ -380,6 +380,17 @@ def S.mul(a, b):
     return S.lanewise_binary(mul, a, b)
 ```
 
+### Integer dot product
+* `i32x4.dot_i16x8_s(a: v128, b: v128) -> v128`
+
+Lane-wise multiply signed 16-bit integers in the two input vectors and add adjacent pairs of the full 32-bit results.
+
+### Integer dot product with accumulation
+
+* `i32x4.dot_i16x8_add_s(a: v128, b: v128, c: v128) -> v128`
+
+Lane-wise multiply signed 16-bit integers in the two input vectors, add adjacent pairs of the full 32-bit results, and accumulate with corresponding 32-bit lanes of `c`. This operation is equivalent to `i32x4.add(i32x4.dot_i16x8_s(a, b), c)`.
+
 ### Integer negation
 * `i8x16.neg(a: v128) -> v128`
 * `i16x8.neg(a: v128) -> v128`
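
For reviewers, the semantics described in the `SIMD.md` hunk can be sketched as a small scalar reference model in plain Python. This is an illustration only, not text from the commit: a `v128` is modelled as a list of lane values, the helper names `to_i16` and `wrap_i32` are hypothetical, and wrapping the sums to 32 bits on overflow is an assumption (the added text only says that adjacent pairs of the full 32-bit products are added).

```python
def to_i16(x):
    """Interpret the low 16 bits of x as a signed 16-bit integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def wrap_i32(x):
    """Wrap x to a signed 32-bit integer (assumed overflow behaviour)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def dot_i16x8_s(a, b):
    """Model of i32x4.dot_i16x8_s: a and b each hold eight 16-bit lanes."""
    assert len(a) == 8 and len(b) == 8
    # Full-width products of the signed 16-bit lanes.
    prods = [to_i16(x) * to_i16(y) for x, y in zip(a, b)]
    # Add adjacent pairs, producing four 32-bit lanes.
    return [wrap_i32(prods[2 * i] + prods[2 * i + 1]) for i in range(4)]


def dot_i16x8_add_s(a, b, c):
    """Model of i32x4.dot_i16x8_add_s: i32x4.add(dot_i16x8_s(a, b), c)."""
    assert len(c) == 4
    return [wrap_i32(d + z) for d, z in zip(dot_i16x8_s(a, b), c)]


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8]
    b = [1, 1, 1, 1, 1, 1, 1, 1]
    print(dot_i16x8_s(a, b))                        # [3, 7, 11, 15]
    print(dot_i16x8_add_s(a, b, [10, 20, 30, 40]))  # [13, 27, 41, 55]
```

The only input that can overflow the pairwise sum is two adjacent lanes of `-32768` in both vectors, where `2 * 2**30` equals `2**31`; under the wrapping assumption above that lane becomes `-2147483648`.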