DPO Fine-Tuning #73

Open
AnirudhJM24 opened this issue Oct 4, 2024 · 1 comment
Comments

@AnirudhJM24
Contributor

The repository contains examples of fine-tuning the model with Supervised Fine-Tuning (SFT). I would like to add examples that use Transformer Reinforcement Learning (TRL), particularly Direct Preference Optimization (DPO).
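For context, here is a minimal sketch of the per-example DPO objective in plain Python. It is only an illustration of the loss, not the proposed repository example and not TRL's implementation; the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are all assumptions made for this sketch.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (hypothetical helper, for illustration only).

    Each *_logp argument is the summed log-probability of a full response
    under either the policy being trained or the frozen reference model.
    """
    # How much more the policy favours each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy prefers "chosen".
    x = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(x)), written in a numerically stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# When the policy shifts probability toward the chosen response, the loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In a real training setup the log-probabilities would come from the model and a frozen reference copy, and the loss would be averaged over a batch of preference pairs.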

@ariG23498
Collaborator

Hey @AnirudhJM24

I really like the idea, but I would ask you to share a rough Colab notebook for this first. I don't want a very complicated setup for SFT in the repository. Having said that, if you can showcase the workflow in a very simple way, I would be open to adding it.

Also, do take a look at the /fine_tune directory for inspiration.
