Predicate cache alternative implementation #4292

Open
wants to merge 3 commits into
base: main
from

@pmatos pmatos commented Jan 22, 2025

Alternative implementation to #4274

In draft mode as it's incomplete and broken.

@pmatos pmatos force-pushed the EnsurePredCacheReset2 branch 3 times, most recently from 7c80598 to c9155d8 Compare January 22, 2025 09:28
@pmatos pmatos force-pushed the EnsurePredCacheReset2 branch 4 times, most recently from 319f7a7 to c5ad6e1 Compare January 23, 2025 15:43
@pmatos pmatos marked this pull request as ready for review January 23, 2025 15:43
@pmatos pmatos force-pushed the EnsurePredCacheReset2 branch from c5ad6e1 to 45bb216 Compare January 23, 2025 17:31
@@ -1631,9 +1631,13 @@ DEF_OP(StoreMemPredicate) {
DEF_OP(LoadMemPredicate) {
Collaborator

I think it makes sense to introduce a new op here, or at least rename/fixup the description as appropriate

Collaborator Author

Right ... done. To be honest, I think the work done to enable RA of predicate registers can be reverted. I see no reason to keep code in FEX that has no users, and this was the only user of that code.

@Sonicadvance1 what do you think about reverting the predicate register RA?

Member

Sure, sounds reasonable to me.

FEXCore/Source/Interface/Core/ArchHelpers/Arm64Emitter.cpp (outdated)

if (FillP2) {
  ptrue(ARMEmitter::SubRegSize::i16Bit, PRED_X87_SVEOPT, ARMEmitter::PredicatePattern::SVE_VL5);
}
Collaborator

You can just move this into the `if` above and drop the FillP2 check.

Collaborator Author

SetPreds and FillP2 are not really equivalent. We only need to fill P2 if the optimization is actually enabled. That is not the case on non-SVE-capable systems, or when the block has no X87 ldst instructions.

Collaborator

I mentioned this previously, but perhaps you missed it: the current behaviour is incorrect for some fills in the dispatcher, as they will be emitted without it. Setting it in the above block (which is guarded by an SVE check alone) is reasonable enough.

Collaborator Author

Do I understand correctly that you suggest I move it into the SetPredRegs block? I have done that. I haven't dropped the FillP2 conditional, though: it is needed so that this is only generated when strictly necessary.
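
For readers following the thread, the guard structure being discussed can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: `MockEmitter` and `FillPredicates` are hypothetical names, the emitter merely records mnemonics as strings, and whatever else the SVE-guarded block sets up is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the real emitter: it only records what was emitted.
struct MockEmitter {
  std::vector<std::string> Emitted;

  void ptrue(const std::string& Pred, const std::string& Pattern) {
    Emitted.push_back("ptrue " + Pred + ", " + Pattern);
  }
};

// Assumed shape of the fill logic after the change described above:
// the x87 predicate fill sits inside the SVE-guarded block, but still
// keeps its own FillP2 condition.
inline void FillPredicates(MockEmitter& E, bool SetPredRegs, bool FillP2) {
  if (SetPredRegs) {
    // Other predicate setup done by this block is omitted in this sketch.
    if (FillP2) {
      E.ptrue("PRED_X87_SVEOPT", "SVE_VL5");
    }
  }
}
```

In this model the fill is emitted only when both flags are set. Dropping the inner `FillP2` check, as suggested in the review, would make the fill unconditional inside the SVE-guarded block, which is what would cover the dispatcher fills mentioned above.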

@pmatos pmatos force-pushed the EnsurePredCacheReset2 branch 2 times, most recently from 783471e to ec18e10 Compare January 27, 2025 14:07
Whenever control flow leaves the block, the predicate register might be
clobbered, so we reset the cache whenever that happens.

Fixes FEX-Emu#4264
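
The idea in this commit message can be modelled with a small sketch. It is a toy model under assumptions, not the FEX implementation: the `Op` categories, `PredCacheModel` and `CountPredicateInits` are hypothetical names chosen for illustration, and "initializing the predicate" is reduced to incrementing a counter.

```cpp
#include <vector>

// Hypothetical op categories, for illustration only.
enum class Op {
  X87LoadStore, // needs the cached predicate register to be valid
  LeavesBlock,  // control flow leaves the block; the predicate may be clobbered
  Other,        // does not interact with the predicate cache
};

struct PredCacheModel {
  bool CacheValid = false;
  int PredicateInits = 0; // how many times the predicate had to be (re)initialized

  void Visit(Op O) {
    switch (O) {
    case Op::X87LoadStore:
      if (!CacheValid) {
        ++PredicateInits; // stands in for emitting the predicate setup
        CacheValid = true;
      }
      break;
    case Op::LeavesBlock:
      CacheValid = false; // reset: the register can no longer be trusted
      break;
    case Op::Other:
      break;
    }
  }
};

// Runs the model over a sequence of ops and reports how many
// predicate initializations were needed.
inline int CountPredicateInits(const std::vector<Op>& Ops) {
  PredCacheModel Model;
  for (Op O : Ops) {
    Model.Visit(O);
  }
  return Model.PredicateInits;
}
```

Two consecutive x87 loads/stores share a single initialization in this model, while an op that leaves the block between them forces a second one.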
@pmatos pmatos force-pushed the EnsurePredCacheReset2 branch from ec18e10 to d1fc46b Compare January 27, 2025 14:16
3 participants