Can mainstream models such as Qwen, Gemma, and Mistral be supported? #6

ImmoCat-Git · 2025-03-20T04:22:00Z

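Also, could you support the ms-swift framework from the ModelScope community?
In addition, when I tried single-GPU fine-tuning of Qwen2.5-1.5B and 3B with APOLLO in LLaMA-Factory, GPU memory filled up and training was very slow. My devices are 1x 3080 Ti (12 GB) and 1x Tesla T10 (16 GB).
LLaMA-Factory's default setting is {"optim": "adamw_torch"}, and {"optim": "apollo_adamw"} cannot be used.
Could you provide more support for LLaMA-Factory? Mainly, could the script usage instructions in README.md be more detailed? I don't know how to modify and invoke APOLLO for single-GPU and multi-GPU fine-tuning with Q-APOLLO-Mini.
I would like to fine-tune Qwen 3B with Q-APOLLO-Mini. Thanks!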

zhuhanqing · 2025-03-30T21:12:48Z

Hi, APOLLO can be used to train different models. The example shows how to use APOLLO in LLaMA-Factory. You cannot set {"optim": "apollo_adamw"} directly, but you can set the corresponding argument to enable APOLLO.
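As a rough illustration of what "setting the argument" might look like, the sketch below renders the APOLLO-related part of a LLaMA-Factory training YAML. The key names (`use_apollo`, `apollo_target`, `apollo_rank`, `apollo_scale`, `apollo_scale_type`) and the rank-1, tensor-wise values for an APOLLO-Mini style setup are assumptions not stated in this thread, and `to_yaml` is a hypothetical helper; check them against the linked example before use.

```python
# Minimal sketch: build the APOLLO section of a LLaMA-Factory YAML config.
# ASSUMPTION: the key names and values below are not confirmed in this thread;
# verify them against the LLaMA-Factory APOLLO example.
APOLLO_OPTIONS = {
    "finetuning_type": "full",
    "use_apollo": True,             # assumed switch that enables APOLLO
    "apollo_target": "all",         # assumed: apply to all eligible modules
    "apollo_rank": 1,               # assumed APOLLO-Mini style rank
    "apollo_scale": 128.0,          # assumed APOLLO-Mini style scale
    "apollo_scale_type": "tensor",  # assumed tensor-wise scaling
}


def to_yaml(options):
    """Hypothetical helper: render a flat dict as simple `key: value` YAML lines."""
    lines = []
    for key, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # YAML booleans are lowercase
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the printed lines into the method section of a training YAML.
    print(to_yaml(APOLLO_OPTIONS))
```

Note that no `optim` key appears here, which matches the point above that {"optim": "apollo_adamw"} is not set directly in LLaMA-Factory.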

You can also try APOLLO with HF Transformers by following the doc.
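A minimal sketch of the HF Transformers route, assuming a `transformers` version that accepts `optim="apollo_adamw"` and that the `apollo-torch` package is installed. The `optim_args` string for an APOLLO-Mini style setup and the target-module patterns are assumptions to verify against the doc; the function names and output directory are hypothetical.

```python
# Minimal sketch: keyword arguments for transformers.TrainingArguments with APOLLO.
# ASSUMPTION: option values below are illustrative and not confirmed in this thread.


def apollo_mini_training_kwargs(output_dir="./apollo-mini-out"):
    """Return TrainingArguments kwargs for an APOLLO-Mini style optimizer setup."""
    return {
        "output_dir": output_dir,  # hypothetical output path
        "optim": "apollo_adamw",
        # Regex patterns selecting the modules whose optimizer states are projected.
        "optim_target_modules": [r".*.attn.*", r".*.mlp.*"],
        # Assumed APOLLO-Mini style settings: rank 1 with tensor-wise scaling.
        "optim_args": "proj=random,rank=1,scale=128.0,scale_type=tensor,update_proj_gap=200",
    }


def build_training_arguments():
    """Create TrainingArguments; imported lazily so the sketch loads without transformers."""
    from transformers import TrainingArguments

    return TrainingArguments(**apollo_mini_training_kwargs())
```

The resulting object would then be passed to a `Trainer` together with the model and dataset, as in any other Transformers training script.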

For ms-swift, we would like to consider integrating it when we have more bandwidth or if their maintainers show interest in working together. Thank you.
