[FlashAttention] Remove XeTLA for fwd mode #4524

Draft
wants to merge 2 commits into
base: main

whitneywhtsang
Contributor

The XeTLA results cannot be verified, and we now have CUTLASS as a reference that also offers better performance, so I suggest removing the XeTLA provider for flash attention forward mode.

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/flashattn_removexetla branch from 5d18616 to 89d12d9 Compare June 18, 2025 00:59
@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/flashattn_removexetla branch 2 times, most recently from c84dfb9 to 03ae588 Compare June 18, 2025 04:08
Contributor

@Egor-Krivov Egor-Krivov left a comment

LGTM

@jle-quel
Contributor

XeTLA "ecosystem" also had a check_close function

This check_close function was added to check XeTLA results without asserting.
Since XeTLA was the only provider using it, should we remove it as well?
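
For readers unfamiliar with the idea, a check that compares results "without asserting" can be sketched roughly as below. This is only an illustration, not the repository's actual check_close: the signature, the tolerance defaults, and the warn-and-return-False behavior are all assumptions made for the example.

```python
import math
import warnings


def check_close(actual, expected, rtol=1e-3, atol=1e-3):
    """Hypothetical non-asserting comparison (not the repository's helper).

    Returns True when every pair of values is within tolerance. On a
    mismatch it emits a warning and returns False instead of raising,
    so a benchmark run can continue with an unverified provider.
    """
    # Indices of element pairs that differ by more than the tolerance.
    mismatches = [
        i
        for i, (a, e) in enumerate(zip(actual, expected))
        if not math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
    ]
    # zip() truncates to the shorter input, so compare lengths explicitly.
    if len(actual) != len(expected) or mismatches:
        warnings.warn(
            f"results differ: {len(mismatches)} mismatched value(s), "
            f"lengths {len(actual)} vs {len(expected)}"
        )
        return False
    return True
```

With every remaining provider verified through a hard assertion, a helper of this shape would have no callers left, which is the case for removing it along with the provider.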

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/flashattn_removexetla branch from 106de50 to a7aab45 Compare June 18, 2025 14:37
@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/flashattn_removexetla branch from a7aab45 to cd3ce37 Compare June 18, 2025 16:01
6 participants