
Could an official continue pretrain (incremental pretraining) script be provided? #166

Open
listwebit opened this issue Jan 26, 2024 · 1 comment
@listwebit

A few questions I would appreciate guidance on:
1. Could an official continue pretrain (incremental pretraining) script be provided?
2. If not, how should I go about continued pretraining on domain data? By modifying the fine-tuning code? Please explain in detail, thanks.
3. For continued incremental pretraining of the 70B model (full-parameter updates, not LoRA), how many machines are needed at minimum?

Thanks in advance for your reply, and best wishes for this model to become the world's best.

@chentigerye
Contributor

Thanks for your support.

  1. We recommend our recent paper, which covers the pretraining details: https://arxiv.org/abs/2312.08688
  2. Fine-tuning and pretraining use different code. For single-machine training, you can use our open-source pretraining code at train/train_clm.py. For domain data, we recommend mixing it with general-purpose data (see the mixing sketch below).
  3. Full-parameter training of the 70B model runs on 6 nodes with 48 A100-40G GPUs (see the back-of-envelope memory check below).
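For point 2, here is a minimal sketch of one way to mix domain data with general data before continued pretraining, using Hugging Face `datasets`. The file names and the 30/70 ratio are illustrative assumptions, not values recommended in this thread.

```python
# Sketch: mix a domain corpus with a general corpus for continued pretraining.
# File names and the 30/70 ratio are hypothetical, chosen for illustration.
from datasets import load_dataset, interleave_datasets

domain = load_dataset("json", data_files="domain_corpus.jsonl", split="train")
general = load_dataset("json", data_files="general_corpus.jsonl", split="train")

# Sample ~30% domain text and ~70% general text so the model adapts to the
# new domain while retaining general ability.
mixed = interleave_datasets(
    [domain, general],
    probabilities=[0.3, 0.7],
    seed=42,
    stopping_strategy="all_exhausted",
)
mixed.to_json("mixed_pretrain_corpus.jsonl")  # feed this file to train/train_clm.py
```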
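For point 3, a rough memory check suggests why 48 A100-40G GPUs suffice, assuming Adam in mixed precision with ZeRO-3-style sharding of model states across all GPUs. These are back-of-envelope assumptions, not the maintainers' exact configuration.

```python
# Back-of-envelope memory estimate for full-parameter 70B training.
# Assumes mixed-precision Adam: 2 bytes fp16 weights + 2 bytes fp16 grads
# + 12 bytes fp32 optimizer states (master weights + two moments) per param,
# and ZeRO-3-style sharding of all model states across every GPU.
params = 70e9
bytes_per_param = 16
model_states = params * bytes_per_param    # ~1.12 TB of model states in total

gpus = 48
per_gpu = model_states / gpus / 2**30      # GiB of model states per GPU
print(f"~{per_gpu:.1f} GiB per GPU")       # ~21.7 GiB, leaving headroom for
                                           # activations on a 40 GiB A100
```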
