As pointed out in https://github.com/dotnet/runtime/pull/112728#discussion_r1966340688, some SIMD instructions have equivalents with a smaller encoding.

For example, `unpcklpd` can be replaced with `movlhps`, which is 1 byte smaller, under the following conditions:

1. both operands are in registers
2. VEX encoding is not supported

Similarly, `vpermilps` can be replaced with the smaller `vpshufd` if mixing float and integer domain instructions is not a concern.

LLVM has logic that identifies some of these replacements [here](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86FixupInstTuning.cpp).

cc @tannergooding
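
For illustration only, here is a rough sketch of how the two replacements described above could be expressed as a peephole check. Everything in it (`Ins`, `InsContext`, `trySmallerEncoding`) is a hypothetical name made up for this example and does not reflect the JIT's actual emitter types or LLVM's implementation.

```cpp
#include <optional>

// Hypothetical instruction identifiers for this sketch; not the JIT's real enum.
enum class Ins { unpcklpd, movlhps, vpermilps, vpshufd, other };

// Hypothetical summary of the facts the replacement rules depend on.
struct InsContext
{
    bool allOperandsInRegisters; // no memory operand involved
    bool usesVexEncoding;        // instruction will be emitted with a VEX prefix
    bool allowDomainCrossing;    // mixing float/integer domain instructions is acceptable
};

// Returns a smaller-encoding equivalent if the conditions hold, otherwise nullopt.
std::optional<Ins> trySmallerEncoding(Ins ins, const InsContext& ctx)
{
    switch (ins)
    {
        case Ins::unpcklpd:
            // movlhps only has a register-register form and only saves a byte
            // when the legacy (non-VEX) encoding is used.
            if (ctx.allOperandsInRegisters && !ctx.usesVexEncoding)
            {
                return Ins::movlhps;
            }
            break;

        case Ins::vpermilps:
            // Assumes the immediate-control form of vpermilps; vpshufd is an
            // integer-domain instruction, so only swap when that is acceptable.
            if (ctx.allowDomainCrossing)
            {
                return Ins::vpshufd;
            }
            break;

        default:
            break;
    }
    return std::nullopt;
}
```

A caller would keep the original instruction whenever the function returns `std::nullopt`, so adding further replacement pairs only requires new `case` entries.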