[RISC-V] Single bit checks #117461
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Diffs are based on 297,254 contexts (102,759 MinOpts, 194,495 FullOpts). Overall (-362,976 bytes)
MinOpts (+0 bytes)
FullOpts (-362,976 bytes)
Example diffs
linux.riscv64.Checked.1.mch
-24 (-30.00%) : 47851.dasm - Microsoft.CodeAnalysis.VisualBasic.Symbols.Metadata.PE.PEMethodSymbol+PackedFlags:TryGetHasSetsRequiredMembers(byref):bool:this (FullOpts)
@@ -25,26 +25,19 @@ G_M17699_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0C00 {a0 a1}, b
; byrRegs +[a0-a1]
lw a0, 0xD1FFAB1E(a0)
; byrRegs -[a0]
- lui a2, 0xD1FFAB1E
- and a2, a0, a2
- sext.w a2, a2
- sext.w a2, a2
- sltu a2, zero, a2
+ slli a2, a0, 49
+ srli a2, a2, 63
sb a2, 0xD1FFAB1E(a1)
- lui a1, 0xD1FFAB1E
- ; byrRegs -[a1]
- and a0, a0, a1
- sext.w a0, a0
- sext.w a0, a0
- sltu a0, zero, a0
- ;; size=48 bbWeight=1 PerfScore 12.00
+ slli a0, a0, 48
+ srli a0, a0, 63
+ ;; size=24 bbWeight=1 PerfScore 8.00
G_M17699_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret ;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 80, prolog size 16, PerfScore 28.50, instruction count 20, allocated bytes for code 80 (MethodHash=f9a2badc) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.Metadata.PE.PEMethodSymbol+PackedFlags:TryGetHasSetsRequiredMembers(byref):bool:this (FullOpts)
+; Total bytes of code 56, prolog size 16, PerfScore 24.50, instruction count 14, allocated bytes for code 56 (MethodHash=f9a2badc) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.Metadata.PE.PEMethodSymbol+PackedFlags:TryGetHasSetsRequiredMembers(byref):bool:this (FullOpts)
; ============================================================
Unwind Info:
@@ -55,7 +48,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 20 (0x00014) Actual length = 80 (0x000050)
+ Function Length : 14 (0x0000e) Actual length = 56 (0x000038)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-26.67%) : 26370.dasm - Microsoft.CodeAnalysis.VisualBasic.Symbols.SynthesizedDelegateMethodSymbol:get_IsOverridable():bool:this (FullOpts)
@@ -23,20 +23,16 @@ G_M2731_IG02: ; bbWeight=1, gcrefRegs=0400 {a0}, byrefRegs=0000 {}, byref
; gcrRegs +[a0]
lw a0, 0xD1FFAB1E(a0)
; gcrRegs -[a0]
- lui a1, 0xD1FFAB1E
- addiw a1, a1, 0xD1FFAB1E
- and a0, a0, a1
- sext.w a0, a0
- sext.w a0, a0
- sltu a0, zero, a0
- ;; size=28 bbWeight=1 PerfScore 6.00
+ slli a0, a0, 52
+ srli a0, a0, 63
+ ;; size=12 bbWeight=1 PerfScore 3.00
G_M2731_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret ;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 60, prolog size 16, PerfScore 22.50, instruction count 14, allocated bytes for code 60 (MethodHash=da9bf554) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.SynthesizedDelegateMethodSymbol:get_IsOverridable():bool:this (FullOpts)
+; Total bytes of code 44, prolog size 16, PerfScore 19.50, instruction count 11, allocated bytes for code 44 (MethodHash=da9bf554) for method Microsoft.CodeAnalysis.VisualBasic.Symbols.SynthesizedDelegateMethodSymbol:get_IsOverridable():bool:this (FullOpts)
; ============================================================
Unwind Info:
@@ -47,7 +43,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 15 (0x0000f) Actual length = 60 (0x00003c)
+ Function Length : 11 (0x0000b) Actual length = 44 (0x00002c)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-26.67%) : 249955.dasm - System.Text.RegularExpressions.Symbolic.MatchingState`1[byte]:IsNullableFor(uint):bool:this (FullOpts)
@@ -25,20 +25,16 @@ G_M21865_IG02: ; bbWeight=1, gcrefRegs=0400 {a0}, byrefRegs=0000 {}, byre
; gcrRegs +[a0]
lw a0, 0xD1FFAB1E(a0)
; gcrRegs -[a0]
- addi a2, zero, 0xD1FFAB1E
- sllw a1, a2, a1
- and a0, a0, a1
- sext.w a0, a0
- sext.w a0, a0
- sltu a0, zero, a0
- ;; size=28 bbWeight=1 PerfScore 5.00
+ sraw a0, a0, a1
+ andi a0, a0, 1
+ ;; size=12 bbWeight=1 PerfScore 3.00
G_M21865_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret ;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 60, prolog size 16, PerfScore 21.50, instruction count 15, allocated bytes for code 60 (MethodHash=569daa96) for method System.Text.RegularExpressions.Symbolic.MatchingState`1[byte]:IsNullableFor(uint):bool:this (FullOpts)
+; Total bytes of code 44, prolog size 16, PerfScore 19.50, instruction count 11, allocated bytes for code 44 (MethodHash=569daa96) for method System.Text.RegularExpressions.Symbolic.MatchingState`1[byte]:IsNullableFor(uint):bool:this (FullOpts)
; ============================================================
Unwind Info:
@@ -49,7 +45,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 15 (0x0000f) Actual length = 60 (0x00003c)
+ Function Length : 11 (0x0000b) Actual length = 44 (0x00002c)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+4 (+9.09%) : 44466.dasm - Microsoft.CodeAnalysis.VisualBasic.Conversions:IsIdentityConversion(int):bool (FullOpts)
@@ -20,17 +20,18 @@ G_M16914_IG01: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
mv fp, sp
;; size=16 bbWeight=1 PerfScore 9.00
G_M16914_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
+ not a0, a0
andi a0, a0, 5
- addiw a0, a0, 0xD1FFAB1E
+ sext.w a0, a0
sltiu a0, a0, 1
- ;; size=12 bbWeight=1 PerfScore 1.50
+ ;; size=16 bbWeight=1 PerfScore 2.00
G_M16914_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret ;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 44, prolog size 16, PerfScore 18.00, instruction count 11, allocated bytes for code 44 (MethodHash=e888bded) for method Microsoft.CodeAnalysis.VisualBasic.Conversions:IsIdentityConversion(int):bool (FullOpts)
+; Total bytes of code 48, prolog size 16, PerfScore 18.50, instruction count 12, allocated bytes for code 48 (MethodHash=e888bded) for method Microsoft.CodeAnalysis.VisualBasic.Conversions:IsIdentityConversion(int):bool (FullOpts)
; ============================================================
Unwind Info:
@@ -41,7 +42,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 11 (0x0000b) Actual length = 44 (0x00002c)
+ Function Length : 12 (0x0000c) Actual length = 48 (0x000030)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+4 (+8.33%) : 44364.dasm - Microsoft.CodeAnalysis.VisualBasic.Conversion:get_IsIdentity():bool:this (FullOpts)
@@ -25,17 +25,18 @@ G_M36159_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0400 {a0}, byre
; byrRegs +[a0]
lw a0, 0xD1FFAB1E(a0)
; byrRegs -[a0]
+ not a0, a0
andi a0, a0, 5
- addiw a0, a0, 0xD1FFAB1E
+ sext.w a0, a0
sltiu a0, a0, 1
- ;; size=16 bbWeight=1 PerfScore 3.50
+ ;; size=20 bbWeight=1 PerfScore 4.00
G_M36159_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret ;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 48, prolog size 16, PerfScore 20.00, instruction count 12, allocated bytes for code 48 (MethodHash=e6d072c0) for method Microsoft.CodeAnalysis.VisualBasic.Conversion:get_IsIdentity():bool:this (FullOpts)
+; Total bytes of code 52, prolog size 16, PerfScore 20.50, instruction count 13, allocated bytes for code 52 (MethodHash=e6d072c0) for method Microsoft.CodeAnalysis.VisualBasic.Conversion:get_IsIdentity():bool:this (FullOpts)
; ============================================================
Unwind Info:
@@ -46,7 +47,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 12 (0x0000c) Actual length = 48 (0x000030)
+ Function Length : 13 (0x0000d) Actual length = 52 (0x000034)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+4 (+7.14%) : 21967.dasm - Microsoft.CodeAnalysis.VisualBasic.LookupOptionExtensions:IsAttributeTypeLookup(int):bool (FullOpts)
@@ -20,20 +20,21 @@ G_M45402_IG01: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
mv fp, sp
;; size=16 bbWeight=1 PerfScore 9.00
G_M45402_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
+ not a0, a0
lui a1, 0xD1FFAB1E
addiw a1, a1, 0xD1FFAB1E
and a0, a0, a1
sext.w a0, a0
- subw a0, a0, a1
+ sext.w a0, a0
sltiu a0, a0, 1
- ;; size=24 bbWeight=1 PerfScore 4.00
+ ;; size=28 bbWeight=1 PerfScore 4.50
G_M45402_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret ;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 56, prolog size 16, PerfScore 20.50, instruction count 13, allocated bytes for code 56 (MethodHash=e1cf4ea5) for method Microsoft.CodeAnalysis.VisualBasic.LookupOptionExtensions:IsAttributeTypeLookup(int):bool (FullOpts)
+; Total bytes of code 60, prolog size 16, PerfScore 21.00, instruction count 14, allocated bytes for code 60 (MethodHash=e1cf4ea5) for method Microsoft.CodeAnalysis.VisualBasic.LookupOptionExtensions:IsAttributeTypeLookup(int):bool (FullOpts)
; ============================================================
Unwind Info:
@@ -44,7 +45,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 14 (0x0000e) Actual length = 56 (0x000038)
+ Function Length : 15 (0x0000f) Actual length = 60 (0x00003c)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)

The few +1 instruction regressions should disappear once we introduce some sign-extension elimination.
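The three regressing diffs above appear to share one shape: the old code computes `(x & mask) - mask` and compares the result with zero, while the new code inverts `x` first and checks `(~x & mask) == 0`. The sketch below illustrates that identity in C++. The function names are hypothetical, and because the dump masks immediates as `0xD1FFAB1E`, reading the old `addiw` operand as `-5` is an assumption.

```cpp
#include <cstdint>

// Old shape: every bit of `mask` is set in `x` (and, then subtract/compare with mask).
constexpr bool AllBitsSetCompare(uint32_t x, uint32_t mask)
{
    return (x & mask) == mask;
}

// New shape: no bit of `mask` is clear in `x` (not, and, then compare with zero).
constexpr bool AllBitsSetInverted(uint32_t x, uint32_t mask)
{
    return (~x & mask) == 0;
}

// Compile-time spot checks with the mask 5 seen in the `andi a0, a0, 5` diffs.
static_assert(AllBitsSetCompare(7u, 5u) == AllBitsSetInverted(7u, 5u), "both true");
static_assert(AllBitsSetCompare(4u, 5u) == AllBitsSetInverted(4u, 5u), "both false");
```

Both forms give the same answer; the extra instruction in the new sequence is the `sext.w` that follows the `andi`, which is what the sign-extension elimination mentioned above would be expected to remove.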
RISC-V Release-FX-QEMU: 212796 / 214790 (99.07%)
report.xml, report.md, failures.xml, testclr_details.tar.zst

RISC-V Release-CLR-QEMU: 9060 / 9112 (99.43%)
report.xml, report.md, failures.xml, testclr_details.tar.zst

RISC-V Release-CLR-QEMU: 9082 / 9112 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-CLR-VF2: 9083 / 9113 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-QEMU: 281996 / 283078 (99.62%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-VF2: 510734 / 512476 (99.66%)
report.xml, report.md, failures.xml, testclr_details.tar.zst

@risc-vv /run

RISC-V Release-CLR-QEMU: 9087 / 9117 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-CLR-VF2: 9087 / 9117 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-QEMU: 279251 / 280326 (99.62%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-VF2: 308242 / 309993 (99.44%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
cc @jakobbotsch.
Won't get to this until next week after .NET 10 snap.
.NET 10 RC2 snap is Sept 22nd. We will review this after that.
src/coreclr/jit/lower.cpp
Outdated
    bitIndexOp->SetContained();

    LIR::Use cmpUse;
    bool isUserJtrue = BlockRange().TryGetUse(cmp, &cmpUse) && cmpUse.User()->OperIs(GT_JTRUE);
What is this kind of contextual checking accomplishing? It is almost always an anti-pattern. Why is the optimization not done when `JTRUE` is processed?
I guess that after #118270, `a >= 0` saved to a register will lower to `(a >> 63) ^ 1`, so the check is not needed anymore; I changed it to always take this optimization for constants.
In general, a different set of instructions is available in each context. The full set of register comparisons is fused with branches, but when saving the comparison result to a register we only have less-than against a register or an immediate. So knowing the context may be useful. What problems made checking for `JTRUE` an anti-pattern?
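As an illustration of the `a >= 0` lowering mentioned in this reply, here is a minimal C++ sketch. It assumes the `>>` in `(a >> 63) ^ 1` is a logical (unsigned) shift on a 64-bit register; the function name is hypothetical and is not taken from the JIT.

```cpp
#include <cstdint>

// Returns 1 when a >= 0 and 0 otherwise, without a comparison instruction.
constexpr uint64_t IsNonNegative(int64_t a)
{
    // The logical shift moves the sign bit down to bit 0 (1 for negative values);
    // xor with 1 flips it so that non-negative values produce 1.
    return (static_cast<uint64_t>(a) >> 63) ^ 1;
}

static_assert(IsNonNegative(0) == 1, "zero is non-negative");
static_assert(IsNonNegative(-1) == 0, "negative values produce 0");
```

The cast to `uint64_t` matters in this sketch: an arithmetic shift would yield `-1` for negative inputs, and the xor would then not produce a clean 0/1 result.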
#ifndef TARGET_RISCV64
    BlockRange().Remove(op1);
    BlockRange().Remove(op2);
Why is RISC-V special here?
There's no `TEST_(EQ|NE)` instruction equivalent; it would have to be turned back into `EQ|NE(AND(a, b), 0)`.
RISC-V pull_request-CLR-QEMU: 9112 / 9142 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V pull_request-CLR-VF2: 9112 / 9143 (99.66%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V pull_request-FX-QEMU: 0 / 1 (0.00%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V pull_request-FX-VF2: 0 / 106 (0.00%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
Reuse and augment lowerings from mainstream platforms.
Base ISA only; `bext(i)` can be fitted easily in another PR once #115335 merges, as it maps well to `BITTEST_NE`.
Part of #84834, cc @dotnet/samsung
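The improved sequences in the example diffs can be read as two base-ISA ways of extracting a single bit. The C++ sketch below mirrors them under stated assumptions: the function names are hypothetical, the constant case models the `slli`/`srli` pair on a 64-bit register, and the variable case models `sraw`/`andi` using an unsigned shift (the low bit is the same for either shift kind).

```cpp
#include <cstdint>

// Constant bit index (0..63): shift the tested bit up to position 63,
// then shift it logically down to position 0. Mirrors `slli rd, rs, 63 - bit`
// followed by `srli rd, rd, 63`.
constexpr uint64_t TestBitConst(uint64_t reg, unsigned bit)
{
    return (reg << (63 - bit)) >> 63;
}

// Variable bit index: shift right by the index and keep bit 0.
// Mirrors `sraw rd, rs, idx` followed by `andi rd, rd, 1`; the 32-bit shift
// uses only the low 5 bits of the index, hence the `& 31`.
constexpr uint32_t TestBitVar(uint32_t value, uint32_t index)
{
    return (value >> (index & 31)) & 1;
}

// Both agree with the mask-and-compare form they replace.
static_assert(TestBitConst(uint64_t{1} << 14, 14) == 1, "bit 14 set");
static_assert(TestBitVar(8u, 3) == ((8u & (1u << 3)) != 0), "bit 3 set");
```

With this reading, the shift amounts 49, 48 and 52 in the diffs correspond to testing bits 14, 15 and 11 respectively (63 minus the `slli` amount).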