Closed
Labels
area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
enhancement: Product code improvement that does NOT require public API changes/additions
Description
This code:
av1 = Fma.MultiplyAdd(iv1, Sse.LoadVector128(mp + 4), av1);
currently compiles to:
lea rbx,[rdi+10h]
vfmadd132ps xmm4,xmm1,xmmword ptr [rbx]
vmovaps xmm1,xmm4
Assuming dotnet/coreclr#22944 would eliminate the extra lea there, I believe this should generate:
vfmadd231ps xmm1,xmm4,xmmword ptr [rdi+10h]
It looks like the logic in genFMAIntrinsic misses the fact that the two non-contained operands can be swapped here, which would let the result land directly in the accumulator register and avoid the trailing vmovaps.
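To make the operand ordering concrete, here is a minimal scalar sketch of one lane of the 132 and 231 forms. This is not the JIT's code: `Regs`, `fma132`, `fma231`, `currentSequence`, and `proposedSequence` are hypothetical names introduced only for illustration, and the register assignment (xmm1 holds av1, xmm4 holds iv1) is inferred from the disassembly above.

```cpp
#include <cmath>

// One-lane model of two x86 FMA operand orderings.
// Each function returns the value written back to op1 (the destination).
inline float fma132(float op1, float op2, float op3) { return std::fma(op1, op3, op2); } // op1 = op1*op3 + op2
inline float fma231(float op1, float op2, float op3) { return std::fma(op2, op3, op1); } // op1 = op2*op3 + op1

// Hypothetical two-register file: xmm1 models av1, xmm4 models iv1 (inferred mapping).
struct Regs {
    float xmm1;
    float xmm4;
};

// Sequence emitted today: the result lands in xmm4 and must be copied to xmm1.
// Returns the number of instructions modelled.
inline int currentSequence(Regs& r, float mem) {
    r.xmm4 = fma132(r.xmm4, r.xmm1, mem); // vfmadd132ps xmm4, xmm1, [mem]
    r.xmm1 = r.xmm4;                      // vmovaps xmm1, xmm4
    return 2;
}

// Sequence suggested in this issue: the accumulator is the destination, so no copy is needed.
inline int proposedSequence(Regs& r, float mem) {
    r.xmm1 = fma231(r.xmm1, r.xmm4, mem); // vfmadd231ps xmm1, xmm4, [mem]
    return 1;
}
```

Both sequences leave the same value in xmm1 (iv1 * mem + av1). The 231 form does it in one instruction and, as a side effect, leaves xmm4 (iv1) intact instead of overwriting it.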
category:cq
theme:hardware-intrinsics
skill-level:expert
cost:medium