jakobbotsch
Member

We may need to sign/zero-extend when we do this replacement very late.
Normally this cast would be inserted in morph.

cc @dotnet/jit-contrib @SingleAccretion

Fix #58373
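
To illustrate why the extension matters, here is a minimal C++ sketch (plain C++, not JIT code). It models a 32-bit register whose low 16 bits hold a small value while the upper 16 bits hold leftovers from earlier work. The helper names `make_register`, `sign_extend16`, and `zero_extend16` are hypothetical; the latter two mimic what `movsx` and `movzx` do.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 32-bit register: the low 16 bits hold the value,
// the upper 16 bits hold whatever an earlier computation left behind.
constexpr std::uint32_t make_register(std::uint16_t low, std::uint16_t leftover)
{
    return (static_cast<std::uint32_t>(leftover) << 16) | low;
}

// Mimics movsx: keep the low 16 bits and copy bit 15 into the upper bits.
constexpr std::int32_t sign_extend16(std::uint32_t reg)
{
    const std::int32_t low = static_cast<std::int32_t>(reg & 0xFFFFu);
    return (low ^ 0x8000) - 0x8000;
}

// Mimics movzx: keep the low 16 bits and clear the upper bits.
constexpr std::int32_t zero_extend16(std::uint32_t reg)
{
    return static_cast<std::int32_t>(reg & 0xFFFFu);
}
```

With these helpers, `assert(sign_extend16(make_register(0xBC00, 0x1234)) == sign_extend16(make_register(0xBC00, 0xFFFF)))` holds, whereas the two raw register values differ. Without the normalization, a consumer that reads all 32 bits sees the leftovers.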

@ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Sep 2, 2021

ghost commented Sep 2, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

@jakobbotsch
Member Author

Only a small handful of diffs:

benchmarks.run.windows.x64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 956
Total bytes of diff: 961
Total bytes of delta: 5 (0,52% of base)
    diff is a regression.
Detail diffs


Top file regressions (bytes):
           3 : 15746.dasm (0,38% of base)
           2 : 15745.dasm (1,27% of base)

2 total files with Code Size differences (0 improved, 2 regressed), 0 unchanged.

Top method regressions (bytes):
           3 ( 0,38% of base) : 15746.dasm - System.Formats.Cbor.CborReader:PeekInitialByte():System.Formats.Cbor.CborInitialByte:this
           2 ( 1,27% of base) : 15745.dasm - System.Formats.Cbor.CborReader:PeekInitialByte(ubyte):System.Formats.Cbor.CborInitialByte:this

Top method regressions (percentages):
           2 ( 1,27% of base) : 15745.dasm - System.Formats.Cbor.CborReader:PeekInitialByte(ubyte):System.Formats.Cbor.CborInitialByte:this
           3 ( 0,38% of base) : 15746.dasm - System.Formats.Cbor.CborReader:PeekInitialByte():System.Formats.Cbor.CborInitialByte:this

2 total methods with Code Size differences (0 improved, 2 regressed), 0 unchanged.


coreclr_tests.pmi.windows.x64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 18
Total bytes of diff: 31
Total bytes of delta: 13 (72,22% of base)
    diff is a regression.
Detail diffs


Top file regressions (bytes):
           4 : 81631.dasm (80,00% of base)
           3 : 81628.dasm (75,00% of base)
           3 : 81452.dasm (100,00% of base)
           3 : 213629.dasm (50,00% of base)

4 total files with Code Size differences (0 improved, 4 regressed), 0 unchanged.

Top method regressions (bytes):
           4 (80,00% of base) : 81631.dasm - Wrapper`1[Int16][System.Int16]:op_Implicit(short):Wrapper`1[Int16]
           3 (75,00% of base) : 81628.dasm - Wrapper`1[Byte][System.Byte]:op_Implicit(ubyte):Wrapper`1[Byte]
           3 (100,00% of base) : 81452.dasm - GitHub_18522:M113():S0
           3 (50,00% of base) : 213629.dasm - SingleByte:Get():SingleByte

Top method regressions (percentages):
           3 (100,00% of base) : 81452.dasm - GitHub_18522:M113():S0
           4 (80,00% of base) : 81631.dasm - Wrapper`1[Int16][System.Int16]:op_Implicit(short):Wrapper`1[Int16]
           3 (75,00% of base) : 81628.dasm - Wrapper`1[Byte][System.Byte]:op_Implicit(ubyte):Wrapper`1[Byte]
           3 (50,00% of base) : 213629.dasm - SingleByte:Get():SingleByte

4 total methods with Code Size differences (0 improved, 4 regressed), 0 unchanged.


libraries.crossgen2.windows.x64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 2045
Total bytes of diff: 2133
Total bytes of delta: 88 (4,30% of base)
    diff is a regression.
Detail diffs


Top file regressions (bytes):
           7 : 95561.dasm (7,07% of base)
           7 : 95560.dasm (7,37% of base)
           6 : 50717.dasm (2,73% of base)
           5 : 95603.dasm (2,02% of base)
           4 : 95604.dasm (1,65% of base)
           4 : 129153.dasm (22,22% of base)
           4 : 129154.dasm (66,67% of base)
           4 : 129152.dasm (26,67% of base)
           3 : 95570.dasm (50,00% of base)
           3 : 95650.dasm (50,00% of base)
           3 : 116464.dasm (6,98% of base)
           3 : 95530.dasm (50,00% of base)
           3 : 95647.dasm (50,00% of base)
           3 : 95646.dasm (50,00% of base)
           3 : 95648.dasm (50,00% of base)
           3 : 95531.dasm (50,00% of base)
           3 : 95574.dasm (50,00% of base)
           3 : 95645.dasm (50,00% of base)
           3 : 95573.dasm (50,00% of base)
           3 : 95575.dasm (50,00% of base)

27 total files with Code Size differences (0 improved, 27 regressed), 0 unchanged.

Top method regressions (bytes):
           7 ( 7,07% of base) : 95561.dasm - System.Half:System.IFloatingPoint<System.Half>.BitIncrement(System.Half):System.Half
           7 ( 7,37% of base) : 95560.dasm - System.Half:System.IFloatingPoint<System.Half>.BitDecrement(System.Half):System.Half
           6 ( 2,73% of base) : 50717.dasm - System.Data.Common.SqlConvert:ConvertToSqlBoolean(System.Object):System.Data.SqlTypes.SqlBoolean
           5 ( 2,02% of base) : 95603.dasm - System.Half:op_Explicit(double):System.Half
           4 ( 1,65% of base) : 95604.dasm - System.Half:op_Explicit(float):System.Half
           4 (22,22% of base) : 129153.dasm - Microsoft.VisualBasic.FileSystem:TAB(short):Microsoft.VisualBasic.TabInfo
           4 (66,67% of base) : 129154.dasm - Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo
           4 (26,67% of base) : 129152.dasm - Microsoft.VisualBasic.FileSystem:SPC(short):Microsoft.VisualBasic.SpcInfo
           3 (50,00% of base) : 95570.dasm - System.Half:System.IFloatingPoint<System.Half>.get_PositiveInfinity():System.Half
           3 (50,00% of base) : 95650.dasm - System.Half:get_Epsilon():System.Half
           3 ( 6,98% of base) : 116464.dasm - System.Reflection.Metadata.BlobReader:ReadSignatureHeader():System.Reflection.Metadata.SignatureHeader:this
           3 (50,00% of base) : 95530.dasm - System.Half:System.IMinMaxValue<System.Half>.get_MaxValue():System.Half
           3 (50,00% of base) : 95647.dasm - System.Half:get_NaN():System.Half
           3 (50,00% of base) : 95646.dasm - System.Half:get_MinValue():System.Half
           3 (50,00% of base) : 95648.dasm - System.Half:get_NegativeInfinity():System.Half
           3 (50,00% of base) : 95531.dasm - System.Half:System.IMinMaxValue<System.Half>.get_MinValue():System.Half
           3 (50,00% of base) : 95574.dasm - System.Half:System.IFloatingPoint<System.Half>.get_NaN():System.Half
           3 (50,00% of base) : 95645.dasm - System.Half:get_MaxValue():System.Half
           3 (50,00% of base) : 95573.dasm - System.Half:System.IFloatingPoint<System.Half>.get_NegativeInfinity():System.Half
           3 (50,00% of base) : 95575.dasm - System.Half:System.IFloatingPoint<System.Half>.get_Epsilon():System.Half

Top method regressions (percentages):
           4 (66,67% of base) : 129154.dasm - Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo
           3 (50,00% of base) : 95570.dasm - System.Half:System.IFloatingPoint<System.Half>.get_PositiveInfinity():System.Half
           3 (50,00% of base) : 95650.dasm - System.Half:get_Epsilon():System.Half
           3 (50,00% of base) : 95530.dasm - System.Half:System.IMinMaxValue<System.Half>.get_MaxValue():System.Half
           3 (50,00% of base) : 95647.dasm - System.Half:get_NaN():System.Half
           3 (50,00% of base) : 95646.dasm - System.Half:get_MinValue():System.Half
           3 (50,00% of base) : 95648.dasm - System.Half:get_NegativeInfinity():System.Half
           3 (50,00% of base) : 95531.dasm - System.Half:System.IMinMaxValue<System.Half>.get_MinValue():System.Half
           3 (50,00% of base) : 95574.dasm - System.Half:System.IFloatingPoint<System.Half>.get_NaN():System.Half
           3 (50,00% of base) : 95645.dasm - System.Half:get_MaxValue():System.Half
           3 (50,00% of base) : 95573.dasm - System.Half:System.IFloatingPoint<System.Half>.get_NegativeInfinity():System.Half
           3 (50,00% of base) : 95575.dasm - System.Half:System.IFloatingPoint<System.Half>.get_Epsilon():System.Half
           3 (50,00% of base) : 95649.dasm - System.Half:get_PositiveInfinity():System.Half
           1 (33,33% of base) : 97006.dasm - System.BitConverter:HalfToInt16Bits(System.Half):short
           4 (26,67% of base) : 129152.dasm - Microsoft.VisualBasic.FileSystem:SPC(short):Microsoft.VisualBasic.SpcInfo
           4 (22,22% of base) : 129153.dasm - Microsoft.VisualBasic.FileSystem:TAB(short):Microsoft.VisualBasic.TabInfo
           7 ( 7,37% of base) : 95560.dasm - System.Half:System.IFloatingPoint<System.Half>.BitDecrement(System.Half):System.Half
           7 ( 7,07% of base) : 95561.dasm - System.Half:System.IFloatingPoint<System.Half>.BitIncrement(System.Half):System.Half
           3 ( 6,98% of base) : 116464.dasm - System.Reflection.Metadata.BlobReader:ReadSignatureHeader():System.Reflection.Metadata.SignatureHeader:this
           1 ( 2,94% of base) : 95600.dasm - System.Half:Negate(System.Half):System.Half

27 total methods with Code Size differences (0 improved, 27 regressed), 0 unchanged.


libraries.pmi.windows.x64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 2212
Total bytes of diff: 2258
Total bytes of delta: 46 (2,08% of base)
    diff is a regression.
Detail diffs


Top file regressions (bytes):
           6 : 115361.dasm (2,38% of base)
           4 : 105054.dasm (66,67% of base)
           4 : 105056.dasm (26,67% of base)
           4 : 2741.dasm (80,00% of base)
           4 : 16231.dasm (80,00% of base)
           4 : 105055.dasm (22,22% of base)
           3 : 192389.dasm (0,38% of base)
           3 : 147063.dasm (7,14% of base)
           3 : 16230.dasm (75,00% of base)
           3 : 16142.dasm (60,00% of base)
           3 : 2739.dasm (75,00% of base)
           2 : 192390.dasm (1,27% of base)
           1 : 192312.dasm (33,33% of base)
           1 : 192331.dasm (0,55% of base)
           1 : 192370.dasm (0,14% of base)

15 total files with Code Size differences (0 improved, 15 regressed), 0 unchanged.

Top method regressions (bytes):
           6 ( 2,38% of base) : 115361.dasm - System.Data.Common.SqlConvert:ConvertToSqlBoolean(System.Object):System.Data.SqlTypes.SqlBoolean
           4 (66,67% of base) : 105054.dasm - Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo
           4 (26,67% of base) : 105056.dasm - Microsoft.VisualBasic.FileSystem:SPC(short):Microsoft.VisualBasic.SpcInfo
           4 (80,00% of base) : 2741.dasm - dictRefType@165[Int16][System.Int16]:Invoke(short):StructBox`1[Int16]:this
           4 (80,00% of base) : 16231.dasm - System.ValueTuple:Create(short):System.ValueTuple`1[Int16]
           4 (22,22% of base) : 105055.dasm - Microsoft.VisualBasic.FileSystem:TAB(short):Microsoft.VisualBasic.TabInfo
           3 ( 0,38% of base) : 192389.dasm - System.Formats.Cbor.CborReader:PeekInitialByte():System.Formats.Cbor.CborInitialByte:this
           3 ( 7,14% of base) : 147063.dasm - System.Reflection.Metadata.BlobReader:ReadSignatureHeader():System.Reflection.Metadata.SignatureHeader:this
           3 (75,00% of base) : 16230.dasm - System.ValueTuple:Create(ubyte):System.ValueTuple`1[Byte]
           3 (60,00% of base) : 16142.dasm - System.TupleExtensions:ToValueTuple(System.Tuple`1[Byte]):System.ValueTuple`1[Byte]
           3 (75,00% of base) : 2739.dasm - dictRefType@165[Byte][System.Byte]:Invoke(ubyte):StructBox`1[Byte]:this
           2 ( 1,27% of base) : 192390.dasm - System.Formats.Cbor.CborReader:PeekInitialByte(ubyte):System.Formats.Cbor.CborInitialByte:this
           1 (33,33% of base) : 192312.dasm - System.Formats.Cbor.HalfHelpers:HalfToInt16Bits(System.Half):short
           1 ( 0,55% of base) : 192331.dasm - System.Formats.Cbor.CborReader:<ReadIndefiniteLengthStringChunkRanges>g__ReadNextInitialByte|98_0(System.ReadOnlySpan`1[Byte],ubyte):System.Formats.Cbor.CborInitialByte
           1 ( 0,14% of base) : 192370.dasm - System.Formats.Cbor.CborReader:ReadHalf():System.Half:this

Top method regressions (percentages):
           4 (80,00% of base) : 2741.dasm - dictRefType@165[Int16][System.Int16]:Invoke(short):StructBox`1[Int16]:this
           4 (80,00% of base) : 16231.dasm - System.ValueTuple:Create(short):System.ValueTuple`1[Int16]
           3 (75,00% of base) : 16230.dasm - System.ValueTuple:Create(ubyte):System.ValueTuple`1[Byte]
           3 (75,00% of base) : 2739.dasm - dictRefType@165[Byte][System.Byte]:Invoke(ubyte):StructBox`1[Byte]:this
           4 (66,67% of base) : 105054.dasm - Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo
           3 (60,00% of base) : 16142.dasm - System.TupleExtensions:ToValueTuple(System.Tuple`1[Byte]):System.ValueTuple`1[Byte]
           1 (33,33% of base) : 192312.dasm - System.Formats.Cbor.HalfHelpers:HalfToInt16Bits(System.Half):short
           4 (26,67% of base) : 105056.dasm - Microsoft.VisualBasic.FileSystem:SPC(short):Microsoft.VisualBasic.SpcInfo
           4 (22,22% of base) : 105055.dasm - Microsoft.VisualBasic.FileSystem:TAB(short):Microsoft.VisualBasic.TabInfo
           3 ( 7,14% of base) : 147063.dasm - System.Reflection.Metadata.BlobReader:ReadSignatureHeader():System.Reflection.Metadata.SignatureHeader:this
           6 ( 2,38% of base) : 115361.dasm - System.Data.Common.SqlConvert:ConvertToSqlBoolean(System.Object):System.Data.SqlTypes.SqlBoolean
           2 ( 1,27% of base) : 192390.dasm - System.Formats.Cbor.CborReader:PeekInitialByte(ubyte):System.Formats.Cbor.CborInitialByte:this
           1 ( 0,55% of base) : 192331.dasm - System.Formats.Cbor.CborReader:<ReadIndefiniteLengthStringChunkRanges>g__ReadNextInitialByte|98_0(System.ReadOnlySpan`1[Byte],ubyte):System.Formats.Cbor.CborInitialByte
           3 ( 0,38% of base) : 192389.dasm - System.Formats.Cbor.CborReader:PeekInitialByte():System.Formats.Cbor.CborInitialByte:this
           1 ( 0,14% of base) : 192370.dasm - System.Formats.Cbor.CborReader:ReadHalf():System.Half:this

15 total methods with Code Size differences (0 improved, 15 regressed), 0 unchanged.


libraries_tests.pmi.windows.x64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 109
Total bytes of diff: 160
Total bytes of delta: 51 (46,79% of base)
    diff is a regression.
Detail diffs


Top file regressions (bytes):
           4 : 130350.dasm (80,00% of base)
           4 : 282983.dasm (80,00% of base)
           4 : 124315.dasm (80,00% of base)
           4 : 154562.dasm (80,00% of base)
           4 : 279565.dasm (80,00% of base)
           4 : 17120.dasm (80,00% of base)
           4 : 281197.dasm (80,00% of base)
           3 : 130347.dasm (75,00% of base)
           3 : 281196.dasm (75,00% of base)
           3 : 154561.dasm (75,00% of base)
           3 : 124312.dasm (75,00% of base)
           3 : 279564.dasm (75,00% of base)
           3 : 282982.dasm (75,00% of base)
           3 : 17119.dasm (75,00% of base)
           1 : 236402.dasm (4,35% of base)
           1 : 240545.dasm (4,35% of base)

16 total files with Code Size differences (0 improved, 16 regressed), 0 unchanged.

Top method regressions (bytes):
           4 (80,00% of base) : 130350.dasm - Microsoft.Build.Shared.NGen`1[Int16][System.Int16]:op_Implicit(short):Microsoft.Build.Shared.NGen`1[Int16]
           4 (80,00% of base) : 282983.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 124315.dasm - Microsoft.Build.Shared.NGen`1[Int16][System.Int16]:op_Implicit(short):Microsoft.Build.Shared.NGen`1[Int16]
           4 (80,00% of base) : 154562.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 279565.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 17120.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 281197.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           3 (75,00% of base) : 130347.dasm - Microsoft.Build.Shared.NGen`1[Byte][System.Byte]:op_Implicit(ubyte):Microsoft.Build.Shared.NGen`1[Byte]
           3 (75,00% of base) : 281196.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 154561.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 124312.dasm - Microsoft.Build.Shared.NGen`1[Byte][System.Byte]:op_Implicit(ubyte):Microsoft.Build.Shared.NGen`1[Byte]
           3 (75,00% of base) : 279564.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 282982.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 17119.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           1 ( 4,35% of base) : 236402.dasm - Microsoft.SqlServer.Server.SqlMetaData:Adjust(System.Data.SqlTypes.SqlBoolean):System.Data.SqlTypes.SqlBoolean:this
           1 ( 4,35% of base) : 240545.dasm - Microsoft.SqlServer.Server.SqlMetaData:Adjust(System.Data.SqlTypes.SqlBoolean):System.Data.SqlTypes.SqlBoolean:this

Top method regressions (percentages):
           4 (80,00% of base) : 130350.dasm - Microsoft.Build.Shared.NGen`1[Int16][System.Int16]:op_Implicit(short):Microsoft.Build.Shared.NGen`1[Int16]
           4 (80,00% of base) : 282983.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 124315.dasm - Microsoft.Build.Shared.NGen`1[Int16][System.Int16]:op_Implicit(short):Microsoft.Build.Shared.NGen`1[Int16]
           4 (80,00% of base) : 154562.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 279565.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 17120.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           4 (80,00% of base) : 281197.dasm - System.Collections.Tests.ValueComparable:Create(short):System.Collections.Tests.ValueComparable`1[Int16]
           3 (75,00% of base) : 130347.dasm - Microsoft.Build.Shared.NGen`1[Byte][System.Byte]:op_Implicit(ubyte):Microsoft.Build.Shared.NGen`1[Byte]
           3 (75,00% of base) : 281196.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 154561.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 124312.dasm - Microsoft.Build.Shared.NGen`1[Byte][System.Byte]:op_Implicit(ubyte):Microsoft.Build.Shared.NGen`1[Byte]
           3 (75,00% of base) : 279564.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 282982.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           3 (75,00% of base) : 17119.dasm - System.Collections.Tests.ValueComparable:Create(ubyte):System.Collections.Tests.ValueComparable`1[Byte]
           1 ( 4,35% of base) : 236402.dasm - Microsoft.SqlServer.Server.SqlMetaData:Adjust(System.Data.SqlTypes.SqlBoolean):System.Data.SqlTypes.SqlBoolean:this
           1 ( 4,35% of base) : 240545.dasm - Microsoft.SqlServer.Server.SqlMetaData:Adjust(System.Data.SqlTypes.SqlBoolean):System.Data.SqlTypes.SqlBoolean:this

16 total methods with Code Size differences (0 improved, 16 regressed), 0 unchanged.


@jakobbotsch
Member Author

jakobbotsch commented Sep 2, 2021

The typical diff is an extra sign/zero extension, which is usually unnecessary.

Example where the extension is unnecessary:

--- "a/D:\\dev\\dotnet\\spmi\\asm.libraries.pmi.windows.x64.checked.3\\base\\105054.dasm"
+++ "b/D:\\dev\\dotnet\\spmi\\asm.libraries.pmi.windows.x64.checked.3\\diff\\105054.dasm"
@@ -1,36 +1,37 @@
 ; Assembly listing for method Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo
 ; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
 ; optimized code
 ; rsp based frame
 ; partially interruptible
 ; No matching PGO data
 ; Final local variable assignments
 ;
 ;* V00 loc0         [V00    ] (  0,  0   )  struct ( 8) zero-ref    ld-addr-op
 ;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
 ;  V02 tmp1         [V02,T00] (  2,  2   )   short  ->  rax         single-def V00.Column(offs=0x00) P-INDEP "field V00.Column (fldOffset=0x0)"
 ;
 ; Lcl frame size = 0

 G_M14166_IG01:        ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
                                                ;; bbWeight=1    PerfScore 0.00
 G_M14166_IG02:        ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
        mov      eax, -1
-                                               ;; bbWeight=1    PerfScore 0.25
+       movsx    rax, ax
+                                               ;; bbWeight=1    PerfScore 0.50
 G_M14166_IG03:        ; , epilog, nogc, extend
        ret
                                                ;; bbWeight=1    PerfScore 1.00

-; Total bytes of code 6, prolog size 0, PerfScore 1.85, instruction count 2, allocated bytes for code 6 (MethodHash=c26dc8a9) for method Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo
+; Total bytes of code 10, prolog size 0, PerfScore 2.50, instruction count 3, allocated bytes for code 10 (MethodHash=c26dc8a9) for method Microsoft.VisualBasic.FileSystem:TAB():Microsoft.VisualBasic.TabInfo

Example where the extension is necessary:

--- "a/D:\\dev\\dotnet\\spmi\\asm.libraries.pmi.windows.x64.checked.3\\base\\192312.dasm"
+++ "b/D:\\dev\\dotnet\\spmi\\asm.libraries.pmi.windows.x64.checked.3\\diff\\192312.dasm"
@@ -1,36 +1,36 @@
 ; Assembly listing for method System.Formats.Cbor.HalfHelpers:HalfToInt16Bits(System.Half):short
 ; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
 ; optimized code
 ; rsp based frame
 ; partially interruptible
 ; No matching PGO data
 ; Final local variable assignments
 ;
 ;* V00 arg0         [V00    ] (  0,  0   )  struct ( 8) zero-ref    ld-addr-op single-def
 ;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
 ;  V02 tmp1         [V02,T00] (  2,  2   )  ushort  ->  rcx         single-def V00._value(offs=0x00) P-INDEP "field V00._value (fldOffset=0x0)"
 ;
 ; Lcl frame size = 0

 G_M52880_IG01:        ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
                                                ;; bbWeight=1    PerfScore 0.00
 G_M52880_IG02:        ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
-       mov      eax, ecx
+       movzx    rax, cx
                                                ;; bbWeight=1    PerfScore 0.25
 G_M52880_IG03:        ; , epilog, nogc, extend
        ret
                                                ;; bbWeight=1    PerfScore 1.00

-; Total bytes of code 3, prolog size 0, PerfScore 1.55, instruction count 2, allocated bytes for code 3 (MethodHash=57f6316f) for method System.Formats.Cbor.HalfHelpers:HalfToInt16Bits(System.Half):short
+; Total bytes of code 4, prolog size 0, PerfScore 1.65, instruction count 2, allocated bytes for code 4 (MethodHash=57f6316f) for method System.Formats.Cbor.HalfHelpers:HalfToInt16Bits(System.Half):short

Contributor

@SingleAccretion left a comment

Looks good! I feared the regressions would be a lot worse; nice to be wrong on that count. It is rather unfortunate that we cannot insert this normalization in the front-end, but it is what it is.

Contributor

@sandreenko left a comment

As you mentioned, a better fix would be to fix the first transformation that produces incorrect results:

Morphing BB01 of 'Runtime_58373:HalfToInt16Bits(int,int,int,int,int,int,System.Half):short'

fgMorphTree BB01, STMT00000 (before)
               [000003] ---XG-------              *  RETURN    int
               [000002] *--XG-------              \--*  IND       short
               [000001] ------------                 \--*  ADDR      long
               [000000] -------N----                    \--*  LCL_VAR   struct<System.Half, 2>(P) V06 arg6
                                                        \--*    ushort V06._value (offs=0x00) -> V08 tmp1

fgMorphTree BB01, STMT00000 (after)
               [000003] ----G+------              *  RETURN    int
               [000000] -----+-N----              \--*  LCL_VAR   struct<System.Half, 2>(P) V06 arg6
                                                  \--*    ushort V06._value (offs=0x00) -> V08 tmp1

Here we are losing the cast from struct<2> to int (for structs, the upper bytes are not required to be zeroed, so the cast is necessary).
However, if inserting the cast exposes other issues, I agree with the current solution; just mark the places that should be changed back in main after the backport.

@@ -0,0 +1,29 @@
// Licensed to the .NET Foundation under one or more agreements.
Contributor

Could you please tell me which COMPlus settings are needed to repro it on x64 Windows?

Member Author

I must have made a blunder somewhere; this test does not repro it for me either. It seems that on x64 we normalize when we move the arg to the outgoing arg area or into an arg register, so we need a little bit more. This exposes it on x64:

using System;
using System.Runtime.CompilerServices;

public unsafe class Runtime_58373
{
    public static int Main()
    {
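        short halfValue = HalfToInt16Bits(MakeHalf());
        // Widening to int sign-extends, so the upper 16 bits of x follow bit 15 of halfValue.
        int x = halfValue;
        // Reinterpret the first two bytes of x as a Half and read the bits back.
        short val2 = HalfToInt16Bits(*(Half*)&x);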

        return halfValue == val2 ? 100 : -1;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static Half MakeHalf()
    {
        return (Half)(-1.0f);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static short HalfToInt16Bits(Half h)
    {
        return *(short*)&h;
    }
}  

This one exposes it on x86 without unsafe code, except for the reinterpretation:

using System;
using System.Runtime.CompilerServices;

public unsafe class Runtime_58373
{
    public static int Main()
    {
        // Use up a lot of registers
        int a = GetVal();
        int b = GetVal();
        int c = GetVal();
        int d = GetVal();
        int e = GetVal();
        int f = GetVal();
        int g = GetVal();
        int h = GetVal();
        int i = GetVal();

        short val1 = HalfToInt16Bits(MakeHalf());
        Half half = MakeHalf();
        MakeHalf(); // This will spill lower 16 bits of 'half' to memory
        short val2 = HalfToInt16Bits(half); // This will pass 32 bits as arg with upper 16 bits undefined

        return val1 == val2 ? 100 + a + b + c + d + e + f + g + h + i : -1;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static int GetVal()
    {
        return 0;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static Half MakeHalf()
    {
        return default;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static short HalfToInt16Bits(Half h)
    {
        return *(short*)&h;
    }
}  

I will change the test to use these cases.
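
As background for the two repros: the x64 one uses `(Half)(-1.0f)`, whose bit pattern has bit 15 set, so sign-extending and zero-extending its 16 bits give different 32-bit values. The following C++ sketch shows this, assuming the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 fraction bits). `half_bits_pow2`, `widen_signed`, and `widen_unsigned` are hypothetical helpers, and `half_bits_pow2` only encodes plus or minus a power of two with a normal exponent.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: binary16 bit pattern of +/-2^exponent for normal
// exponents (-14..15), assuming the IEEE 754 layout
// [sign:1][exponent:5, bias 15][fraction:10].
constexpr std::uint16_t half_bits_pow2(bool negative, int exponent)
{
    return static_cast<std::uint16_t>((negative ? 0x8000 : 0) | ((exponent + 15) << 10));
}

// Widen 16 bits to 32 the way a signed (short) value is widened.
constexpr std::uint32_t widen_signed(std::uint16_t bits)
{
    return (bits & 0x8000u) ? (0xFFFF0000u | bits) : bits;
}

// Widen 16 bits to 32 the way an unsigned (ushort) value is widened.
constexpr std::uint32_t widen_unsigned(std::uint16_t bits)
{
    return bits;
}
```

For -1.0 (`0xBC00`) the two widenings differ, so `assert(widen_signed(0xBC00) != widen_unsigned(0xBC00))` holds, while for +1.0 (`0x3C00`) they agree. That is presumably why a negative value makes the mismatch observable on x64; the x86 repro instead uses `default` and relies on leftover upper bits.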

@jakobbotsch
Member Author

Here we are losing the cast from struct<2> to int (for structs, the upper bytes are not required to be zeroed, so the cast is necessary).

Right, the reason we don't insert this cast is that fgMorphRetInd does not remove GTF_DONT_CSE from the local var (N in flags), and fgMorphLocalVar takes this to mean that the local is under GT_ADDR. However, removing the flag leads to asserts because we do some constant propagation that we didn't do before, and we hit unreached() in the GT_CNS_DBL case in LowerRetStruct. My other fix was the following, but I need to look into the handling of GT_CNS_DBL:
https://github.com/dotnet/runtime/compare/main...jakobbotsch:fix-58373-2?expand=1

FWIW, the other fix has more regressions than this one (but still very few), although it seems nicer.

@JulieLeeMSFT added this to the 6.0.0 milestone Sep 3, 2021
@JulieLeeMSFT
Member

Thanks @jakobbotsch. Please backport it to 6.0 once it is fixed.

@jakobbotsch
Member Author

@sandreenko Do you think it would be better to fix the missing cast instead and implement the lowering support for GT_CNS_DBL, even for the backport?

@sandreenko
Contributor

@sandreenko Do you think it would be better to fix the missing cast instead and implement the lowering support for GT_CNS_DBL, even for the backport?

I think your approach is right because it touches only 1 method in 1 file for backporting, so LGTM.

the lowering support for GT_CNS_DBL, even for the backport?

If I understand correctly, to support it in lowering you will need to delete the case/assert for it, and the default case will handle it:

default:
    assert(varTypeIsEnregisterable(retVal));
    if (varTypeUsesFloatReg(ret) != varTypeUsesFloatReg(retVal))
    {
        GenTree* bitcast = comp->gtNewBitCastNode(ret->TypeGet(), retVal);
        ret->gtOp1 = bitcast;
        BlockRange().InsertBefore(ret, bitcast);
        ContainCheckBitCast(bitcast);
    }
It will not be the most efficient way, but the code should be correct.
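
For readers unfamiliar with what a bitcast represents: it reinterprets the bits of a value as another type of the same size without converting the value, which is what is needed when the returned value and the return register are in different register classes (floating-point versus integer). A minimal C++ sketch of that semantics follows. It is plain C++, not JIT code; `bit_cast_to_u32` and `bit_cast_to_float` are hypothetical helper names, and `float` is assumed to be IEEE 754 binary32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 float");
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Reinterpret the bits of a float as a 32-bit integer (no value conversion).
inline std::uint32_t bit_cast_to_u32(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Reinterpret the bits of a 32-bit integer as a float (no value conversion).
inline float bit_cast_to_float(std::uint32_t bits)
{
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Here `assert(bit_cast_to_u32(1.0f) == 0x3F800000u)` holds, whereas the value conversion `static_cast<std::uint32_t>(1.0f)` yields `1`.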

@jakobbotsch
Member Author

If I understand correctly, to support it in lowering you will need to delete the case/assert for it, and the default case will handle it:

Thanks, that's great to know. Then I will submit a separate PR doing that for .NET 7 after this one.

@jakobbotsch
Member Author

Hmm, looks like one of the tests still fails on ARM32; I need to look into that before this is merged.

@jakobbotsch
Member Author

The problem is that the do-not-enregister case is also wrong: it changes the node to a ret-typed LCL_FLD node, which means we get:

N001 (  1,  1) [000000] -------N-----        t0 =    LCL_FLD   int    V00 arg0         [+0]
                                                  *    ushort V00._value (offs=0x00) -> V02 tmp1         
                                                   /--*  t0     int    
N002 (  2,  2) [000003] ----G--------             *  RETURN    int    $100

We hit this path because we only do dependent promotion for parameters on arm32 and we don't replace dependently-promoted locals with their fields.

If I understand correctly, we should retype the LCL_FLD node to the native return type, not to the type of the ret node, to ensure we load the proper size.

Retyping it to the type of the ret node is not right when reinterpreting
small structs as a type that needs to be normalized.
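
The load-size issue can be pictured with a small C++ sketch (plain C++, not JIT code). A 2-byte struct sits at the start of a 4-byte slot whose remaining bytes are unrelated. Loading with the 4-byte type of the return node picks up those bytes, while loading with the 2-byte native type and then extending does not. All names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 4-byte stack slot: the first two bytes hold the struct's
// 16-bit field, the other two hold unrelated data.
struct Slot
{
    unsigned char bytes[4];
};

inline Slot make_slot(std::uint16_t field, std::uint16_t unrelated)
{
    Slot slot;
    std::memcpy(slot.bytes, &field, sizeof field);
    std::memcpy(slot.bytes + sizeof field, &unrelated, sizeof unrelated);
    return slot;
}

// Load typed like the return node (int): reads all four bytes.
inline std::uint32_t load_as_ret_type(const Slot& slot)
{
    std::uint32_t value;
    std::memcpy(&value, slot.bytes, sizeof value);
    return value;
}

// Load typed like the native field (16 bits), then zero-extend.
inline std::uint32_t load_as_native_type(const Slot& slot)
{
    std::uint16_t value;
    std::memcpy(&value, slot.bytes, sizeof value);
    return value;
}
```

With this model, `assert(load_as_native_type(make_slot(0xBC00, 0x1234)) == 0xBC00u)` holds regardless of the unrelated bytes, while `load_as_ret_type` returns a value that depends on them.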
@jakobbotsch
Member Author

I added a fix for the not enregistered case (and fixed some outdated comments). No diffs on x64 and the only ARM32 diff is the fixed codegen in System.BitConverter:HalfToInt16Bits. Can you take another look @sandreenko?

@sandreenko
Contributor

I added a fix for the not enregistered case (and fixed some outdated comments). No diffs on x64 and the only ARM32 diff is the fixed codegen in System.BitConverter:HalfToInt16Bits. Can you take another look @sandreenko?

Are you talking about the total diffs from this PR? Do you think it meets the 6.0 bar?

Contributor

@sandreenko left a comment

LGTM, thanks for digging into this.

@jakobbotsch
Member Author

Are you talking about the total diffs from this PR? Do you think it meets the 6.0 bar?

Those were the extra diffs; the total ones are a bit bigger due to the cast in the promoted case.

I think this meets the bar since we are hitting the miscompilation in our own code in HalfToInt16Bits. It helps that the function is normally always inlined, which masks the problem, but I'm not sure we can rely on that.

@jakobbotsch
Member Author

cc @dotnet/jit-contrib, I need an MS org approval.

@jakobbotsch requested a review from EgorBo September 10, 2021 14:21
@jakobbotsch merged commit 7069930 into dotnet:main Sep 10, 2021
@jakobbotsch
Member Author

/backport to release/6.0

@github-actions
Contributor

Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1221703276

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Development

Successfully merging this pull request may close these issues.

JIT: pgo test failure in System.Runtime.Tests.dll

5 participants