-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Predicate cache alternative implementation #4292
base: main
Are you sure you want to change the base?
Conversation
7c80598
to
c9155d8
Compare
319f7a7
to
c5ad6e1
Compare
c5ad6e1
to
45bb216
Compare
@@ -1631,9 +1631,13 @@ DEF_OP(StoreMemPredicate) { | |||
DEF_OP(LoadMemPredicate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to introduce a new op here, or at least rename/fixup the description as appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right ... done. To be honest, I am thinking that the work done to enable RA of predicate registers can be reverted. I see no reason to keep code in FEX that doesn't have any users and this is this was the only user of that code.
@Sonicadvance1 what do you think about reverting the predicate register RA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, sounds reasonable to me.
|
||
if (FillP2) { | ||
ptrue(ARMEmitter::SubRegSize::i16Bit, PRED_X87_SVEOPT, ARMEmitter::PredicatePattern::SVE_VL5); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just move this into the above if and drop the FillP2 check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetPreds and FillP2 are not really equivalent. We only need to fill P2 if we really need the optimization enabled. This is not true in non-SVE cabable systems or if we don't have X87 ldst instructions in the block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned this prior but perhaps you missed, the current behaviour is incorrect for some fills in the dispatcher as they will be emitted without it. Setting it in the above block (which is guarded by an sve check alone) is reasonable enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly that suggest I move it into the SetPredRegs
block? I have done that. I haven't dropped the FillP2 conditional. That is necessary to only generate this when strictly necessary.
783471e
to
ec18e10
Compare
Whenever the control float leaves the block, it might clobber the predicate register so we reset the cache whenever that happens. Fixes FEX-Emu#4264
ec18e10
to
d1fc46b
Compare
Alternative implementation to #4274
In draft mode as it's incomplete and broken.