
Conversation

xiangze-arm

Add qkRmsNorm for Arm to support Qwen3 models.

This is a stacked PR. Code changes for adding qkRmsNorm are in the third commit 480ec99.
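For reference, Qwen3-style models apply RMSNorm to the query and key projections per attention head before RoPE. Below is a minimal sketch of what such a qkRmsNorm computes; the function name, signature, and layout are illustrative assumptions, not the actual kernel added in this PR.

```cpp
#include <cmath>
#include <cstddef>

// Per-head QK RMSNorm sketch: each head's q (or k) vector is normalized by its
// own RMS, then scaled by a learned per-dimension gamma shared across heads.
// Hypothetical signature for illustration only.
void qkRmsNormSketch(float* qk,           // [num_tokens, num_heads, head_dim], contiguous
                     const float* gamma,  // [head_dim] learned scale
                     size_t num_tokens,
                     size_t num_heads,
                     size_t head_dim,
                     float eps = 1e-6f) {
    for (size_t t = 0; t < num_tokens; ++t) {
        for (size_t h = 0; h < num_heads; ++h) {
            float* v = qk + (t * num_heads + h) * head_dim;
            // RMS over a single head's vector.
            float sum_sq = 0.f;
            for (size_t d = 0; d < head_dim; ++d) sum_sq += v[d] * v[d];
            const float inv_rms =
                1.f / std::sqrt(sum_sq / static_cast<float>(head_dim) + eps);
            for (size_t d = 0; d < head_dim; ++d) v[d] = v[d] * inv_rms * gamma[d];
        }
    }
}
```

The same routine is applied to both q and k, which is why a single strided kernel can serve both tensors.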

xiangze-arm and others added 3 commits August 19, 2025 11:08
- Implement flash attention for context attention
- Implement flash decoding for decoder self-attention (see the merge-step sketch after this commit message)
- Avoid KV cache assembly and use the blocked KV cache directly
- Compute GQA by groups of heads in flash decoding

Signed-off-by: Zhang Xiangze <[email protected]>
Co-authored-by: Ruifeng Wang <[email protected]>
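
Flash decoding parallelizes decode-time attention over the KV-cache sequence dimension: each split produces a partial output together with its running max logit and softmax denominator, and the partials are then combined with log-sum-exp rescaling. A minimal sketch of that merge step follows; the names and the scalar output are illustrative, not the kernel interface in this PR.

```cpp
#include <cmath>

// One KV-cache split's partial attention result. The output accumulator is
// shown as a single element for brevity; a real kernel keeps a head_dim vector.
struct Partial {
    float m;  // max attention logit seen in this split
    float l;  // sum of exp(logit - m) over this split
    float o;  // un-normalized weighted-value accumulator
};

// Merge two partials with the standard log-sum-exp rescaling.
// The final attention output is merged.o / merged.l.
Partial mergePartials(const Partial& a, const Partial& b) {
    const float m  = std::fmax(a.m, b.m);
    const float ca = std::exp(a.m - m);  // rescale factor for split a
    const float cb = std::exp(b.m - m);  // rescale factor for split b
    Partial out;
    out.m = m;
    out.l = a.l * ca + b.l * cb;
    out.o = a.o * ca + b.o * cb;
    return out;
}
```

Grouping the query heads that share a KV head (the GQA grouping mentioned above) lets one load of a blocked KV-cache tile serve the whole group of query heads.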
Add the following functions for the Arm device:
  - moeFfnLayer
  - mlaContextAttention
  - mlaAbsorbAttention
  - layernormWithStride (see the strided-norm sketch after this commit message)
  - mlaQKVGemm
  - slice
  - dispatch
Upgrade the torch version from 2.1.2 to 2.6.0 for the Arm backend
Add DeepSeek V2 Lite support
Add DeepSeek V3 support by packing FP8 weights to INT4 and computing with KleidiAI
Improve the performance of the gated activation op
Add an optimized MoE path for a8w4
Merge the shared expert into moeFfnLayer for DeepSeek V3
Optimize flash decoding by splitting the q dim for MLA absorb attention

Signed-off-by: Zhang Xiangze <[email protected]>
Co-authored-by: Tianyu Li <[email protected]>
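
Of the functions listed above, layernormWithStride is the easiest to sketch: it normalizes rows that live inside a larger buffer (for example the q or k slice of a fused QKV matrix), so consecutive rows sit `stride` elements apart rather than densely packed. The signature below is a hypothetical illustration, not the actual Arm-device API added in this commit.

```cpp
#include <cmath>
#include <cstddef>

// Strided layer norm sketch: normalize `rows` rows of `width` elements each,
// where rows are `stride` (>= width) elements apart in the underlying buffer.
void layernormWithStrideSketch(float* data,        // base pointer into the larger buffer
                               const float* gamma, // [width] learned scale
                               const float* beta,  // [width] learned bias
                               size_t rows,
                               size_t width,
                               size_t stride,
                               float eps = 1e-6f) {
    for (size_t r = 0; r < rows; ++r) {
        float* row = data + r * stride;
        // Mean and variance over the row's `width` elements only.
        float mean = 0.f;
        for (size_t i = 0; i < width; ++i) mean += row[i];
        mean /= static_cast<float>(width);
        float var = 0.f;
        for (size_t i = 0; i < width; ++i) {
            const float d = row[i] - mean;
            var += d * d;
        }
        var /= static_cast<float>(width);
        const float inv_std = 1.f / std::sqrt(var + eps);
        for (size_t i = 0; i < width; ++i)
            row[i] = (row[i] - mean) * inv_std * gamma[i] + beta[i];
    }
}
```

Passing the full QKV row length as `stride` and a slice offset as `data` lets the same kernel normalize q, k, or v in place without first slicing them into contiguous tensors.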
@netaddi
Collaborator

netaddi commented Sep 24, 2025

Hi @xiangze-arm, we have updated our development workflow and integrated our complete CI test pipeline into GitHub pull request actions. Could you please submit the pull request again (maybe all-in-one), so that the PR automatically triggers our CI pipeline?
Thanks for your contribution!
