henrylhtsang

Hi, trying to get a review and some suggestions for further perf improvements.

tldr: Adding sliding window support. We can now pass in window_size_left and window_size_right to Example 77 FMHA.
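To make the two parameters concrete, here is a minimal Python sketch of the usual sliding window semantics: query position q may attend to key position k only when q - window_size_left <= k <= q + window_size_right. The helper names in_window and build_mask are hypothetical, and the convention that a negative size means "unbounded on that side" is an assumption borrowed from other attention libraries, not something this PR confirms. It also assumes equal query and key sequence lengths.

```python
def in_window(q_idx, k_idx, window_size_left, window_size_right):
    """Return True if query q_idx may attend to key k_idx.

    Assumed convention (not confirmed by the PR): a negative window
    size means the window is unbounded on that side.
    """
    left_ok = window_size_left < 0 or k_idx >= q_idx - window_size_left
    right_ok = window_size_right < 0 or k_idx <= q_idx + window_size_right
    return left_ok and right_ok


def build_mask(seqlen, window_size_left, window_size_right):
    """Build a dense seqlen x seqlen boolean mask (True = attend)."""
    return [
        [in_window(q, k, window_size_left, window_size_right) for k in range(seqlen)]
        for q in range(seqlen)
    ]


if __name__ == "__main__":
    # One key to the left, none to the right: a causal window of width 2.
    for row in build_mask(4, 1, 0):
        print("".join("x" if allowed else "." for allowed in row))
```

With window_size_left = 1 and window_size_right = 0, each query sees itself and the key immediately before it.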

For the most part, the implementation is pretty standard. In the forward pass, the mask -> unmask -> mask pattern is handled with the same dynamic dispatch mechanism that the backward pass already uses for masking, instead of a 2-loop / 3-loop mechanism.
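The mask -> unmask -> mask pattern shows up when walking the key tiles for a single query tile: tiles on both edges of the window are only partly inside it, while tiles in the middle are fully inside. The sketch below is a rough Python illustration of making that choice per tile inside one loop, rather than splitting the loop into separate phases. It is not the kernel code from this PR; classify_tile, the tile size, and the non-negative window sizes are all assumptions made for illustration.

```python
def classify_tile(q0, q1, k0, k1, left, right):
    """Classify key tile [k0, k1) against query tile [q0, q1).

    Row q may attend to keys in [q - left, q + right] (left, right >= 0).
    Returns "skip" (nothing allowed), "unmasked" (everything allowed),
    or "masked" (a per-element mask is needed).
    """
    # Keys allowed by at least one row of the query tile.
    union_lo, union_hi = q0 - left, (q1 - 1) + right
    # Keys allowed by every row of the query tile.
    inter_lo, inter_hi = (q1 - 1) - left, q0 + right

    if k1 - 1 < union_lo or k0 > union_hi:
        return "skip"
    if k0 >= inter_lo and k1 - 1 <= inter_hi:
        return "unmasked"
    return "masked"


def plan_for_query_tile(q0, tile, seqlen_k, left, right):
    """Single loop over key tiles, choosing the path per tile."""
    return [
        classify_tile(q0, q0 + tile, k0, min(k0 + tile, seqlen_k), left, right)
        for k0 in range(0, seqlen_k, tile)
    ]


if __name__ == "__main__":
    # Query rows 12..15, tile size 4, window of 8 keys to the left.
    print(plan_for_query_tile(12, 4, 24, left=8, right=0))
```

For this example the key tiles come out as skip, masked, unmasked, masked, skip, skip, which is the mask -> unmask -> mask sequence described above once the skipped tiles are dropped.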

Features:

  • Sliding window support for fwd and bwd for fmha.
  • Added some unit tests.

Does not support:

  • The gen case and MLA
  • cp.async

@henrylhtsang
Author

maybe @hwu36? Would love to get some feedback
