liger-kernel version mismatch #91
Open
Description
I ran into a problem when training the qwen3-dlm model: the function interface is incompatible with the installed liger-kernel. In liger-kernel 0.6.2, the forward method of LigerFusedLinearCrossEntropyFunction takes 12 parameters, but qwen3_dlm passes it 13, which matches the signature in liger-kernel 0.6.3. The same problem occurs in llada_dlm and dream_dlm.
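For anyone who wants to confirm the mismatch locally, here is a minimal sketch using only the Python standard library. The helper names `installed_version` and `accepts_positional` are hypothetical and not part of this repo or of liger-kernel. The import path in the usage comment is an assumption about where the class lives. Whether the 12 and 13 counts above include the leading `ctx` argument is also not stated, so adjust the number accordingly.

```python
import inspect
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def accepts_positional(fn, n):
    """Return True if fn can be called with exactly n positional arguments."""
    try:
        # bind() only checks the arguments against the signature; it does not call fn.
        inspect.signature(fn).bind(*([None] * n))
        return True
    except TypeError:
        return False


# Usage (assumed import path, not verified against every release):
# from liger_kernel.ops.fused_linear_cross_entropy import LigerFusedLinearCrossEntropyFunction
# print(installed_version("liger-kernel"))
# print(accepts_positional(LigerFusedLinearCrossEntropyFunction.forward, 13))
```

If the second check prints `False` on the installed version, the call site and the installed liger-kernel disagree on the signature, which would be consistent with the error described above.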