i32x4.dot_i16x8_s and i32x4.dot_i16x8_add_s instructions

Maratyszcza · Maratyszcza · commit 28c01fba9a0d · 2019-10-31T23:37:01.000-07:00
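
To make the lane pairing concrete, here is a minimal reference sketch of the two instructions described in the SIMD.md hunk of this change. It is not part of the commit. It models a `v128` as a plain Python list of lane values, and the helper `wrap_i32` is a hypothetical name introduced only for this illustration. The sketch assumes that the pairwise sum wraps modulo 2^32 like other WebAssembly integer additions; the text in this change does not state the overflow behavior explicitly.

```python
def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit two's-complement value (illustrative helper)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def dot_i16x8_s(a, b):
    """Sketch of i32x4.dot_i16x8_s.

    a, b: lists of 8 signed 16-bit lane values.
    Returns a list of 4 signed 32-bit lane values.
    """
    assert len(a) == 8 and len(b) == 8
    # Each product of two signed 16-bit values fits in 32 bits; only the sum
    # of a pair can overflow (when all four inputs are -32768). Wrapping in
    # that case is an assumption of this sketch.
    return [wrap_i32(a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1])
            for i in range(4)]


def dot_i16x8_add_s(a, b, c):
    """Sketch of i32x4.dot_i16x8_add_s: i32x4.add(i32x4.dot_i16x8_s(a, b), c)."""
    assert len(c) == 4
    return [wrap_i32(d + ci) for d, ci in zip(dot_i16x8_s(a, b), c)]
```

For example, `dot_i16x8_s([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)` returns `[3, 7, 11, 15]`: output lane `i` combines input lanes `2*i` and `2*i + 1`.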
diff --git a/proposals/simd/BinarySIMD.md b/proposals/simd/BinarySIMD.md
@@ -189,3 +189,5 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `i64x2.load32x2_s`         |    `0xd6`| m:memarg           |
 | `i64x2.load32x2_u`         |    `0xd7`| m:memarg           |
 | `v128.andnot`              |    `0xd8`| -                  |
+| `i32x4.dot_i16x8_s`        |    `0xd9`| -                  |
+| `i32x4.dot_i16x8_add_s`    |    `0xda`| -                  |
diff --git a/proposals/simd/ImplementationStatus.md b/proposals/simd/ImplementationStatus.md
@@ -100,6 +100,8 @@
 | `i16x8.sub_saturate_s`     |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i16x8.sub_saturate_u`     |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i16x8.mul`                |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| `i32x4.dot_i16x8_s`        |                           |                       |                    |                    |
+| `i32x4.dot_i16x8_add_s`    |                           |                       |                    |                    |
 | `i32x4.neg`                |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i32x4.any_true`           |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i32x4.all_true`           |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
diff --git a/proposals/simd/SIMD.md b/proposals/simd/SIMD.md
@@ -380,6 +380,17 @@ def S.mul(a, b):
     return S.lanewise_binary(mul, a, b)
 ```
 
+### Integer dot product
+* `i32x4.dot_i16x8_s(a: v128, b: v128) -> v128`
+
+Lane-wise multiply the signed 16-bit integers in `a` and `b`, keeping the full 32-bit products, then add each adjacent pair of products to produce the four 32-bit lanes of the result.
+
+### Integer dot product with accumulation
+
+* `i32x4.dot_i16x8_add_s(a: v128, b: v128, c: v128) -> v128`
+
+Lane-wise multiply the signed 16-bit integers in `a` and `b`, keeping the full 32-bit products, add each adjacent pair of products, and add each sum to the corresponding 32-bit lane of `c`. This operation is equivalent to `i32x4.add(i32x4.dot_i16x8_s(a, b), c)`.
+
 ### Integer negation
 * `i8x16.neg(a: v128) -> v128`
 * `i16x8.neg(a: v128) -> v128`