Should we decide tail folding by constructing additional VPlans?

This discussion came up during the vectorizer improvements call. 

We currently create VPlans for VF=1, fixed VFs, scalable VFs, and VF=vscale x 1 on SVE.

Whether or not these VPlans are tail folded is determined by `TTI->preferPredicateOverEpilogue` or the `-prefer-predicate-over-epilogue` flag. 

Instead of using a hook, we could create a new VPlan with tail folding and let the cost model decide whether or not to select it based on profitability.

We could probably also consider all the different tail folding styles, but to keep the number of VPlans reasonable we could begin by leaving that to TTI.

So e.g. on RISC-V, we would consider VF=1, VF=fixed, VF=scalable and VF=scalable + tail folding. The proposed default EVL tail folding style isn't compatible with fixed VFs.

On AArch64, we would probably have more VPlans since its tail folding style is supported by both fixed + scalable VFs IIUC. 
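To make the idea concrete, here is a rough sketch of how the candidate set could differ per target and how a cost callback could then pick among the candidates. Everything in it (`Candidate`, `TargetTraits`, `buildCandidates`, `selectCheapest`) is a hypothetical stand-in for illustration, not the actual VPlan or TTI API, and the two booleans simply encode the assumptions stated above about which VFs each target's tail folding style supports.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real VPlan/TTI types.
enum class VFKind { Scalar, Fixed, Scalable };

struct Candidate {
  VFKind Kind;
  bool TailFolded;
};

// Assumption: the target reports which VF kinds its tail folding style supports.
struct TargetTraits {
  bool FoldFixed;    // e.g. false for EVL-style folding on RISC-V
  bool FoldScalable; // e.g. true on both RISC-V and AArch64
};

// Always build the non-folded plans, then add tail-folded variants
// only where the target's folding style can handle that VF kind.
std::vector<Candidate> buildCandidates(const TargetTraits &T) {
  std::vector<Candidate> Plans = {{VFKind::Scalar, false},
                                  {VFKind::Fixed, false},
                                  {VFKind::Scalable, false}};
  if (T.FoldFixed)
    Plans.push_back({VFKind::Fixed, true});
  if (T.FoldScalable)
    Plans.push_back({VFKind::Scalable, true});
  return Plans;
}

// Let a cost callback decide; ties keep the earlier candidate.
template <typename CostFn>
Candidate selectCheapest(const std::vector<Candidate> &Plans, CostFn Cost) {
  assert(!Plans.empty() && "need at least one candidate");
  Candidate Best = Plans.front();
  for (const Candidate &P : Plans)
    if (Cost(P) < Cost(Best))
      Best = P;
  return Best;
}
```

With `TargetTraits{false, true}` this yields the four RISC-V candidates listed above, and with `TargetTraits{true, true}` it yields five, which matches the expectation that AArch64 ends up with more VPlans.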

One significant benefit of this would be that we would be able to fall back to non-tail-folded loops for scenarios that aren't fully supported with tail folding, e.g. interleaved groups on RISC-V. But we would also need to consider the fact that non-tail-folded loops may not always have their vectorized body run due to the minimum trip count, and account for that in the cost.
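As a back-of-the-envelope illustration of the minimum trip count point, the sketch below compares the two loop shapes for a known trip count. It is a deliberately simplified model with made-up names (`LoopCosts`, `costWithEpilogue`, `costTailFolded`) and made-up per-iteration costs. It ignores runtime checks, loop setup overhead, and cases where a scalar epilogue iteration is required, so it should not be read as how the LoopVectorize cost model actually works.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-iteration costs; the numbers a real cost model
// would produce are not modelled here.
struct LoopCosts {
  uint64_t ScalarIter; // one scalar iteration
  uint64_t VectorIter; // one unpredicated vector iteration
  uint64_t FoldedIter; // one predicated (tail-folded) vector iteration
};

// Non-tail-folded loop: below the minimum trip count the vector body
// never runs, so every iteration is paid for at the scalar rate.
uint64_t costWithEpilogue(uint64_t TC, uint64_t VF, uint64_t MinTC,
                          const LoopCosts &C) {
  assert(VF > 0 && "VF must be non-zero");
  if (TC < MinTC)
    return TC * C.ScalarIter;
  return (TC / VF) * C.VectorIter + (TC % VF) * C.ScalarIter;
}

// Tail-folded loop: always runs ceil(TC / VF) predicated vector iterations.
uint64_t costTailFolded(uint64_t TC, uint64_t VF, const LoopCosts &C) {
  assert(VF > 0 && "VF must be non-zero");
  return ((TC + VF - 1) / VF) * C.FoldedIter;
}
```

For example, with `LoopCosts{4, 6, 8}`, VF=4 and a minimum trip count of 8, a trip count of 7 costs 28 without tail folding (all scalar) versus 16 with it, while a trip count of 18 costs 32 without tail folding versus 40 with it. So under this toy model the winner depends on the trip count.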

Should we decide tail folding by constructing additional VPlans? #148882
