The inference demo code (https://github.com/deepseek-ai/DeepSeek-V3.2-Exp/blob/main/inference/model.py#L563) shows that DSA uses MHA mode in the prefill stage. However, the technical report states that DSA is instantiated based on MQA mode, and in the FlashMLA repo, DSA for the prefill stage supports only MQA mode. Is this a bug in the inference demo?
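
To make the two modes concrete, here is a toy sketch of my understanding of the distinction. It is not code from either repo: it covers only the non-RoPE part of the attention scores, omits softmax and the sparse top-k selection, and every name in it (`w_uk`, `latents`, `mha_scores`, `mqa_scores`) is hypothetical.

```python
import random

def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mha_scores(q, w_uk, latents):
    # MHA mode: up-project each token's latent into a separate key per head,
    # then score every head's query against its own keys.
    return [[dot(q[h], matvec(w_uk[h], c)) for c in latents]
            for h in range(len(q))]

def mqa_scores(q, w_uk, latents):
    # MQA mode: fold the key up-projection into each head's query, so all
    # heads score against the same shared latent vector per token.
    absorbed = [matvec(transpose(w_uk[h]), q[h]) for h in range(len(q))]
    return [[dot(absorbed[h], c) for c in latents]
            for h in range(len(q))]

# Toy sizes and random values, chosen only for illustration.
random.seed(0)
n_heads, d_head, d_latent, n_tokens = 2, 3, 4, 5
q = [[random.gauss(0, 1) for _ in range(d_head)] for _ in range(n_heads)]
w_uk = [[[random.gauss(0, 1) for _ in range(d_latent)] for _ in range(d_head)]
        for _ in range(n_heads)]
latents = [[random.gauss(0, 1) for _ in range(d_latent)] for _ in range(n_tokens)]
```

In this sketch the two functions differ in where the key up-projection is applied and in how many distinct keys each token has: one per head in `mha_scores`, a single shared latent in `mqa_scores`.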