Hello,
We want to perform two-stage fine-tuning with LoRA (SFT, then DPO). We already have the first LoRA adapter from SFT, and we want to obtain a second adapter after DPO. Is there a way to continue training the same adapter during DPO using swift, without first merging the SFT adapter into the base model weights? A sketch of the workflow we have in mind is below.
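For context, this is roughly the workflow we are after, expressed with Hugging Face peft and trl rather than swift (the model id, adapter path, and preference-data file below are placeholders, and keyword names such as `processing_class` may differ between trl versions). We are looking for the equivalent way to do this with the swift CLI/API:

```python
from datasets import load_dataset
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Placeholder base model and SFT adapter path.
base = AutoModelForCausalLM.from_pretrained("base-model-id")
tokenizer = AutoTokenizer.from_pretrained("base-model-id")

# Load the SFT LoRA adapter and keep it trainable, instead of merging it into the base weights.
model = PeftModel.from_pretrained(base, "path/to/sft-adapter", is_trainable=True)

# Placeholder preference dataset with "prompt", "chosen", "rejected" columns.
train_dataset = load_dataset("json", data_files="dpo_pairs.json")["train"]

args = DPOConfig(output_dir="dpo-adapter", per_device_train_batch_size=1, num_train_epochs=1)
trainer = DPOTrainer(
    model=model,
    ref_model=None,  # with a PEFT model, trl can build the reference by disabling the adapter
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
trainer.save_model("dpo-adapter")  # saves only the updated LoRA weights
```

That is, the DPO stage should keep optimizing the existing LoRA weights on top of the frozen base model, and produce an updated adapter we can save separately.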