The repository contains examples of fine-tuning the model with Supervised Fine-Tuning (SFT). I would like to add examples that use Transformer Reinforcement Learning (TRL), particularly Direct Preference Optimization (DPO).
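To make the proposal concrete, here is a minimal sketch of the DPO objective for a single preference pair, written in plain Python. It does not use TRL or this repository's code. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration. The inputs are the summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of responses.

    Hypothetical helper for illustration only; not part of TRL.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model, in log space.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the
    # chosen response more strongly than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Usage: the loss drops below log(2) once the policy prefers the chosen
# response more than the reference model does.
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

A real example in the repository would compute these log-probabilities from the model and average the loss over a batch; the sketch only shows the quantity being optimized.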
I really like the idea, but I would also ask you to share a rough Colab notebook for this. I don't want a very complicated setup for SFT in the repository. Having said that, if you can showcase the workflow in a very simple way, I would be open to adding it.
Also do take a look at the /fine_tune directory for inspiration.