Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement DivRem intrinsic for X86 #66551
Implement DivRem intrinsic for X86 #66551
Changes from 21 commits
00da1f4
d679cca
4eba5c5
6f8fbbf
997d8a5
c0c3dd6
c08cee7
1804350
46c3f78
2dbd2e9
1b6d09b
3ef3600
acaf211
728440d
4eace5d
3182ed8
48c12a4
be5b33c
d868194
ff65608
b31f896
0bd1327
51959c0
8f1d9a2
f4aece2
75dda8d
43049a9
d485b61
aeb114a
aa57f26
9345e57
65c0af6
2bfa332
7b41fd6
df922de
924fc42
1e7ff05
62e6c59
a1b4802
d547337
8199acb
ed12edb
532a8fd
281c1b0
97630a2
491f82e
7487dfc
7a862b2
0fc11c9
8ab6023
8d67890
0d85848
1fd2b74
18c9787
b63dea3
1b0e670
7631033
c0be00a
83192b1
b29e1b4
82c38a3
61fabe2
eea804c
ebe8781
9dc8522
0093b79
8260134
320270a
a98dbde
06f0460
1b3e851
9f192ac
537f678
ca32b24
e1b5fb3
6589700
2a90b4b
aad300a
eb3272c
39ae98d
f8473ec
e00a8c1
9e28279
2d4d8d9
b5e3dd3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can delete this line, it should not be necessary now (but check we still set the flags in the constructor for the HWI node).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks somehow out of scope. I abstracted the common code path from arm64, but didn't meant to modify it. I don't have the environment to verify this change either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add back the assert
assert(HWIntrinsicInfo::IsMultiReg(intrinsic));
?Also, I think you should use the code what used to be here inside
impAssignMultiRegTypeToVar()
because this code looked much cleaner. E.g.:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetMultiReg
andlvIsMultiRegRet
in fact doesn't do the same thing and replacing would cause failures.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetMultiReg()
goes onGenTree
node whilelvIsMultiRegRet
goes onLclVarDsc
. You missed setting the flag onLclVarDsc
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This had an interesting side effect: if FeatureSIMD is turned off, X86Base.Pause will also be turned off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the following:
This logic will fail to reject the enregistration if
V00
(or, more accurately, its fields) even as it should (see also my earlier comment about passing the register count instead ofretTypeDesc
to fix this case).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the logic of
CheckMultiRegLclVar()
the way you suggested. I am not fully sure about this example though. Can you give C# example where I can see things goes wrong?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not unlikely this cannot actually be reproduced right now because of the constrained way multi-reg locals are used, but here would be the idea for a reproduction:
Then V00 and V01 need to be promoted and forward subtitution needs to happen s.t. we end with:
The more general point though is that this can happen according to the IR validity rules and so needs to be handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this is doing, but not doing this will cause crossgen2 failing for
Utf8Formatter.TryFormat(DateTime)
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you try to find the root cause of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't have time for this. I brought it from another PR mentioned earlier and it just works. It's better to ask the original author.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CarolEidt can you help explaining this? Not doing this isn't causing failures now, but the code looks necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huoyaoyuan - Carol doesn't work on this code base anymore. As the comment suggests, " If the source was another lclVar similarly promoted, we would have broken it into multiple stores.". So, I think you should revert this change, given that it works now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was just a means to try which nodes are caught if
op1 != GT_CALL
. So the reason it is failing (as expected) is because the tree now contains theop1 == GT_HWINTRINSIC
.We do want to enregister in such cases and not doing it (retaining
op1 != GT_CALL
) spills the result on stack.https://www.diffchecker.com/TUY5sGjV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and no. My suggestion includes the (significant) addition that we will DNER all locals, not just promoted ones. It is to prevent the case I described above from arising:
struct retyped as long = multi-reg-node
and other assigns codegen doesn't support. Alternatively, we could of course support them in codegen, but that seems out of scope for this change.Yes, the independent promotion check (and the field count check as well) must stay. In fact, I think they should be expanded. Consider
struct { promoted field<double> } = HWI(int, int)
and other mismatched cases. Right now they would pass, but we need to DNER them, by passing the register count to the method instead of the return type descriptor.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's now new failures and I don't have enough time to investigate in more depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take a look hopefully next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the code as per @SingleAccretion suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this going to work as expected?
Normally the second operand to
BuildDelayFreeUses
is the rmw node. However, in this case we have two RMW nodes (op1 and op2) and so we should likely need to indicate thatop3
is delay free with regards to bothop1
andop2
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it remains
delayRegFree
for the entire treeNode. We resetpendingDelayFree
when processing the nexttreeNode
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, @kunalspathak, there is special logic that looks at
node
(op3
) compared tormwNode
(op1
)and a note specifically around:
So my guess is we either don't have sufficient tests covering this case or we aren't correctly tracking the return value as multi-reg. There is also has the later logic that says
if (node == rmwNode)
then it might not set the delay free.So, I think we need another overload of
BuildDelayFreeUses
that takes bothrmwNodes
and does the right general checks.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably worth including
(dstCout == 2) && (intrinsicId == DivRem)
?