Replace several Nullable.Value with .GetValueOrDefault() by stephentoub · Pull Request #22297 · dotnet/coreclr (original) (raw)
This repository was archived by the owner on Jan 23, 2023. It is now read-only.
Conversation8 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
The former does extra work that the latter doesn't do, and they're equivalent when we know it contains a value, such as immediately after a HasValue check.
@AndyAyersMS, I was a little surprised in some of these cases that the JIT isn't able to compile down to the same code for Value as it does for GetValueOrDefault. As an example, for this:
using System; using System.Runtime.CompilerServices;
class Program { static void Main() { Positive1(42); Positive2(42); }
[MethodImpl(MethodImplOptions.NoInlining)]
private static bool Positive1(int? i) => i.HasValue && i.Value > 0;
[MethodImpl(MethodImplOptions.NoInlining)]
private static bool Positive2(int? i) => i.HasValue && i.GetValueOrDefault() > 0;
}
I get this for Positive1 that uses Value:
G_M60389_IG01:
4883EC28 sub rsp, 40
90 nop
48894C2430 mov qword ptr [rsp+30H], rcx
G_M60389_IG02:
0FB6442430 movzx rax, byte ptr [rsp+30H]
85C0 test eax, eax
7414 je SHORT G_M60389_IG05
85C0 test eax, eax
7417 je SHORT G_M60389_IG07
G_M60389_IG03:
837C243400 cmp dword ptr [rsp+34H], 0
0F9FC0 setg al
0FB6C0 movzx rax, al
G_M60389_IG04:
4883C428 add rsp, 40
C3 ret
G_M60389_IG05:
33C0 xor eax, eax
G_M60389_IG06:
4883C428 add rsp, 40
C3 ret
G_M60389_IG07:
E815FFFFFF call ThrowHelper:ThrowInvalidOperationException_InvalidOperation_NoValue()
CC int3
and this for Positive2 that uses GetValueOrDefault:
G_M60387_IG01:
0F1F440000 nop
48894C2408 mov qword ptr [rsp+08H], rcx
G_M60387_IG02:
807C240800 cmp byte ptr [rsp+08H], 0
740C je SHORT G_M60387_IG04
837C240C00 cmp dword ptr [rsp+0CH], 0
0F9FC0 setg al
0FB6C0 movzx rax, al
G_M60387_IG03:
C3 ret
G_M60387_IG04:
33C0 xor eax, eax
G_M60387_IG05:
C3 ret
The former ends up doing the HasValue check twice, even though they happen one right after the other as part of Value getting inlined, and then I'd have expected it to be able to eliminate the dead branch that includes the ThrowHelper call.
Is it expected that it can't get this?
The former does extra work that the latter doesn't do, and they're equivalent when we know it contains a value, such as immediately after a HasValue check.
Haven't drilled in to confirm, but I suspect what you're seeing is that Nullable<int>
(or any small type T) is an unfortunate special case where the struct is small enough on x64 to be passed and returned by value, but has multiple fields.... at the ABI boundaries we retype this as a long
and then the consequent long<->struct "casts" in JIT IR block promotion and without this the jit can't optimize very well.
Similar issues seen over in #22079 with Range
. Does not seem as simple to unwedge as one might hope.
Might be amusing to look at x86 codegen here and see if it is better. Or try long?
.
Haven't drilled in to confirm, but I suspect what you're seeing is that Nullable (or any small type T) is an unfortunate special case where the struct is small enough on x64 to be passed and returned by value, but has multiple fields.... at the ABI boundaries we retype this as a long and then the consequent long<->struct "casts" in JIT IR block promotion and without this the jit can't optimize very well.
It looks like it's indeed related to struct handling. Though it can also be blamed on assertion propagation.
Due to the struct issues we end up with a GT_LCL_FLD
that's not so friendly to optimizations. CSE sort of gets rid of it but the way it does this, by creating a COMMA tree, causes problems later. We have something like:
( 15, 14) [000009] ------------ * STMT void (IL ???... ???)
N008 ( 15, 14) [000008] -A---------- \--* JTRUE void
N006 ( 1, 1) [000006] ------------ | /--* CNS_INT int 0 $40
N007 ( 13, 12) [000007] JA-----N---- \--* EQ int $180
N004 ( 3, 2) [000050] ------------ | /--* LCL_VAR int V02 cse0 $140
N005 ( 11, 10) [000051] -A---------- \--* COMMA int $140
N001 ( 4, 5) [000026] ------------ | /--* LCL_FLD bool V00 arg0 u:1[+0] Fseq[hasValue] $140
N003 ( 8, 8) [000049] -A------R--- \--* ASG int $VN.Void
N002 ( 3, 2) [000048] D------N---- \--* LCL_VAR int V02 cse0 $140
------------ BB02 [009..014) -> BB05 (cond), preds={BB01} succs={BB03,BB05}
***** BB02, stmt 2
( 7, 6) [000036] ------------ * STMT void (IL 0x009... ???)
N004 ( 7, 6) [000035] ----G------- \--* JTRUE void
N002 ( 1, 1) [000033] ------------ | /--* CNS_INT int 0 $40
N003 ( 5, 4) [000034] J---G--N---- \--* EQ int $180
N001 ( 3, 2) [000052] ------------ \--* LCL_VAR int V02 cse0 $140
and assertion propagation doesn't seem too happy to deal with a LCL_VAR
node hidden behind a COMMA
.
Ultimately I'd say "blame struct handling". Because it's not only that it causes other optimizations issues, it also forces the argument to memory. This:
mov qword ptr [rsp+08H], rcx
G_M60387_IG02: cmp byte ptr [rsp+08H], 0 je SHORT G_M60387_IG04
should be
G_M60387_IG02: test cl, cl je SHORT G_M60387_IG04
So we need to either make struct promotion somehow handle this or make it so that GT_LCL_FLD
doesn't force the variable into memory.
Yeah, that seems to be the case -- here LCL_FLD
is the thing that extracts the hasValue
field from the struct. Interestingly the jit does value number the two compares the same, so there's some hope that perhaps we could get at this even without promotion.
------------ BB01 [000..009) -> BB04 (cond), preds={} succs={BB02,BB04}
***** BB01, stmt 1
( 8, 9) [000009] ------------ * STMT void (IL ???... ???)
N004 ( 8, 9) [000008] ------------ \--* JTRUE void
N002 ( 1, 1) [000006] ------------ | /--* CNS_INT int 0 $40
N003 ( 6, 7) [000007] J------N---- \--* EQ int $180
N001 ( 4, 5) [000026] ------------ \--* LCL_FLD bool V00 arg0 u:1[+0] Fseq[hasValue] $140
------------ BB02 [009..014) -> BB05 (cond), preds={BB01} succs={BB03,BB05}
***** BB02, stmt 2
( 8, 9) [000036] ------------ * STMT void (IL 0x009... ???)
N004 ( 8, 9) [000035] ----G------- \--* JTRUE void
N002 ( 1, 1) [000033] ------------ | /--* CNS_INT int 0 $40
N003 ( 6, 7) [000034] J---G--N---- \--* EQ int $180
N001 ( 4, 5) [000031] ------------ \--* LCL_FLD bool V00 arg0 u:1[+0] Fseq[hasValue] $140
Even with long?
we don't get rid of this redundant branch (note we promote here and load the constituent parts into registers)... that is a bit surprising.
G_M51227_IG02: 0FB601 movzx rax, byte ptr [rcx] 488B5108 mov rdx, qword ptr [rcx+8] 84C0 test al, al 7412 je SHORT G_M51227_IG05 84C0 test al, al 7415 je SHORT G_M51227_IG07
Looks like JTRUE
assertion prop is blocked by the casts we insert as we widen the bool field to an int when we promote...
------------ BB02 [000..009) -> BB05 (cond), preds={BB01} succs={BB03,BB05}
***** BB02, stmt 2
( 8, 8) [000009] ------------ * STMT void (IL ???... ???)
N005 ( 8, 8) [000008] ------------ \--* JTRUE void
N003 ( 1, 1) [000006] ------------ | /--* CNS_INT int 0 $40
N004 ( 6, 6) [000007] J------N---- \--* EQ int <l:$283, c:$282>
N002 ( 4, 4) [000063] ------------ \--* CAST int <- bool <- int <l:$281, c:$280>
N001 ( 3, 2) [000028] ------------ \--* LCL_VAR int V02 tmp1 u:1 <l:$100, c:$140>
------------ BB03 [009..015) -> BB06 (cond), preds={BB02} succs={BB04,BB06}
***** BB03, stmt 3
( 8, 8) [000037] ------------ * STMT void (IL 0x009... ???)
N005 ( 8, 8) [000036] ----G------- \--* JTRUE void
N003 ( 1, 1) [000034] ------------ | /--* CNS_INT int 0 $40
N004 ( 6, 6) [000035] J---G--N---- \--* EQ int <l:$283, c:$282>
N002 ( 4, 4) [000064] ------------ \--* CAST int <- bool <- int <l:$281, c:$280>
N001 ( 3, 2) [000033] ------------ \--* LCL_VAR int V02 tmp1 u:1 (last use) <l:$100, c:$140>
There were only a few cases here, so I changed them. But there are a bunch in corefx, and I'm not currently planning on going through and doing all those... besides, having them there should give @AndyAyers and @mikedn something to verify a fix against :)
Ever since dotnet/roslyn#22800 shipped in 16.0, I've been preferring foo ?? default
or foo ?? 0
over foo.GetValueOrDefault()
since the codegen is the same. It's also a more consistent style if the default were to ever differ in another place.
picenka21 pushed a commit to picenka21/runtime that referenced this pull request
The former does extra work that the latter doesn't do, and they're equivalent when we know it contains a value, such as immediately after a HasValue check.
Commit migrated from dotnet/coreclr@1b2810b