Open
Hello! FlashDMoE is a great piece of work!
Since the gated MLP is widely used as the FFN in many LLMs (such as DeepSeekV3, Qwen3, and Llama), is there a plan to support it?
Gated MLP based on SwiGLU:
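For reference, here is a minimal sketch of the commonly used SwiGLU formulation, `y = (SiLU(x W_gate) * (x W_up)) W_down`, written in plain Python for a single token. The names `silu`, `matvec`, `gated_mlp`, `w_gate`, `w_up`, and `w_down` are illustrative assumptions rather than anything from FlashDMoE, and bias terms are omitted for brevity.

```python
import math

def silu(z):
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))

def matvec(x, w):
    """Multiply row vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def gated_mlp(x, w_gate, w_up, w_down):
    """SwiGLU gated MLP for one token (hypothetical helper, not a FlashDMoE API)."""
    gate = [silu(g) for g in matvec(x, w_gate)]  # activated gate branch
    up = matvec(x, w_up)                         # linear "up" branch
    hidden = [g * u for g, u in zip(gate, up)]   # elementwise gating
    return matvec(hidden, w_down)                # project back down

# Tiny example: model dim 2, hidden dim 2, output dim 1.
x = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
y = gated_mlp(x, identity, identity, [[1.0], [1.0]])
print(y)
```

Compared with a plain two-matrix FFN, this variant needs two input projections (gate and up) and an elementwise product before the down projection, which is the part a fused kernel would have to accommodate.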