Description
Hi OLMo team,
I'm currently working on converting a Hugging Face model (allenai/OLMoE-1B-7B-0924-Instruct) into OLMo/OLMoE's pretraining checkpoint format so that I can resume pretraining from it. While I was able to convert the model weights, the converted checkpoint is missing metadata.json, which appears to be required for loading the checkpoint in restore_checkpoint().
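For context, this is roughly how I converted the weights (a minimal sketch; the key remapping and output layout are my own guesses, not taken from an official script, and the paths are placeholders):

```python
# Rough sketch of the weight conversion I attempted. The key remapping and
# output layout are assumptions on my part, not an official conversion recipe.
from pathlib import Path

import torch
from transformers import AutoModelForCausalLM

hf_model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMoE-1B-7B-0924-Instruct", torch_dtype=torch.bfloat16
)
state_dict = hf_model.state_dict()

# Strip the HF "model." prefix -- my guess at the expected naming; the real
# mapping between HF and OLMoE parameter names is part of what I'm asking about.
converted = {k.replace("model.", "", 1): v for k, v in state_dict.items()}

out_dir = Path("converted_checkpoint")
out_dir.mkdir(exist_ok=True)
torch.save(converted, out_dir / "model.pt")
# At this point the weights exist, but metadata.json (and whatever optimizer or
# trainer state the loader expects) is still missing.
```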
After examining the metadata.json generated when training OLMo from scratch, I realized that it contains non-trivial fields and additional metadata about the architecture and checkpoint format. Reconstructing this file seems error-prone, and I couldn't find documentation or scripts for safely performing this conversion.
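For example, this is the kind of quick inspection I did on the metadata.json from a from-scratch run (purely diagnostic; the path points to my own run):

```python
# Inspect the metadata.json produced by a from-scratch OLMo run to see which
# fields a converted checkpoint would have to reproduce.
import json
from pathlib import Path

meta_path = Path("my_scratch_run/metadata.json")  # checkpoint dir from my own run
metadata = json.loads(meta_path.read_text())

# Print the top-level keys and value types; several of these look non-trivial
# to reconstruct by hand without knowing the expected format.
for key, value in metadata.items():
    print(f"{key}: {type(value).__name__}")
```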
Would it be possible for you to provide an official Hugging Face → OLMo checkpoint conversion script, or some guidance on the exact format required for metadata.json? This would be extremely helpful, as I am working on a project that requires continual pretraining of the OLMoE-Instruct model with OLMo, starting from existing HF checkpoints.
Alternatively, it would also help if you could share an OLMoE checkpoint in the format accepted by the pretraining script (https://github.com/allenai/OLMo/blob/Muennighoff/MoE/scripts/train.py).
Thanks for your help!