Added support for normal MLA kernel #17624

annanyapr · 2025-02-05T07:40:29Z

Refactored `_attention_prefill_ragged` so that the V head dimension can differ from the Q/K head dimension. This can be used for MLA attention in DeepSeek models.
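To illustrate what a different V dimension means, here is a minimal pure-Python sketch of ragged attention. It is not the TIR kernel from this PR. The function names, the `q_indptr`/`kv_indptr` layout, and the single-head, unmasked setup are assumptions made for illustration only. The point it shows is that the score computation only needs Q and K to share a dimension, while the output rows take their width from V.

```python
import math


def attention(q, k, v):
    """Single-head softmax attention over plain Python lists (no causal mask).

    q: [n_q][d_qk], k: [n_kv][d_qk], v: [n_kv][d_v]; returns [n_q][d_v].
    """
    if not q:
        return []
    d_qk = len(k[0])
    d_v = len(v[0])  # may differ from d_qk
    assert all(len(row) == d_qk for row in q), "q and k must share d_qk"
    assert len(k) == len(v), "k and v must have the same number of rows"
    scale = 1.0 / math.sqrt(d_qk)
    out = []
    for q_row in q:
        # Scores depend only on the shared q/k dimension.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # The output width is d_v, independent of d_qk.
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
            for j in range(d_v)
        ])
    return out


def ragged_attention(q, k, v, q_indptr, kv_indptr):
    """Run attention per sequence on concatenated (ragged) inputs.

    q_indptr / kv_indptr are hypothetical CSR-style offsets: sequence i owns
    rows q[q_indptr[i]:q_indptr[i + 1]] and k/v[kv_indptr[i]:kv_indptr[i + 1]].
    """
    out = []
    for i in range(len(q_indptr) - 1):
        qs = q[q_indptr[i]:q_indptr[i + 1]]
        ks = k[kv_indptr[i]:kv_indptr[i + 1]]
        vs = v[kv_indptr[i]:kv_indptr[i + 1]]
        out.extend(attention(qs, ks, vs))
    return out
```

For example, with `d_qk = 3` and `d_v = 2`, each output row has two entries regardless of the Q/K width.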

annanyapr · 2025-02-05T07:42:47Z

@MasterJH5574 can you take a look?

annanyapr · 2025-02-20T04:18:30Z

@MasterJH5574 TVM seems to be building correctly, and `tvm/tests/python/relax/test_runtime_builtin_paged_attention_kv_cache_tir.py` seems to be working fine.

MasterJH5574

LGTM, thanks! We are good to go after CI passes.

annanyapr force-pushed the generic-attention branch from 3eac670 to e3ac7b5 Compare February 10, 2025 15:03

annanyapr changed the title ~~Refactored code to allow for different v dimension from q/k dimension~~ Added support for normal MLA kernel Feb 17, 2025

annanyapr force-pushed the generic-attention branch from 506386d to 7569674 Compare February 17, 2025 20:38

annanyapr added 3 commits February 19, 2025 22:03

Refactored code to allow for different v dimension from q/k dimension

c2f0f86

Made a small fix after the rebase

7548bb6

Made changes to the runtime to support normal kernel

acd9fa0

annanyapr force-pushed the generic-attention branch from 7569674 to acd9fa0 Compare February 20, 2025 03:03

Fixed a compilation issue

ca86ca7

MasterJH5574 approved these changes Feb 20, 2025

MasterJH5574 force-pushed the generic-attention branch from 1e4b697 to a533a11 Compare February 20, 2025 15:52

Fix lint

bd88313

MasterJH5574 force-pushed the generic-attention branch from a533a11 to bd88313 Compare February 20, 2025 16:57

MasterJH5574 merged commit 6d92f2a into apache:main Feb 20, 2025
10 checks passed
