KTO: Model Alignment as Prospect Theoretic Optimization

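This week's paper is KTO: Model Alignment as Prospect Theoretic Optimization. It explores removing the need for paired preference data, which methods such as DPO require. Instead, KTO learns from a binary signal of whether each individual output is desirable or undesirable.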
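To make the "no preference pairs" idea concrete, here is a minimal sketch of a KTO-style per-example loss. It assumes the policy-vs-reference log-ratio and the KL reference point have already been computed elsewhere. The function name `kto_loss`, its default values, and the example data are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import math


def sigmoid(t):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def kto_loss(log_ratio, desirable, kl_ref=0.0, beta=0.1,
             lambda_d=1.0, lambda_u=1.0):
    """KTO-style loss for ONE (prompt, output, label) example.

    log_ratio: log pi_theta(y|x) - log pi_ref(y|x), assumed precomputed.
    desirable: binary label; no paired "rejected" output is needed.
    kl_ref:    reference point (a KL estimate in the paper), assumed given.
    beta, lambda_d, lambda_u: hyperparameters; defaults are illustrative.
    """
    if desirable:
        value = lambda_d * sigmoid(beta * (log_ratio - kl_ref))
        return lambda_d - value
    value = lambda_u * sigmoid(beta * (kl_ref - log_ratio))
    return lambda_u - value


# Unpaired data: each example carries its own label (hypothetical values).
examples = [
    {"log_ratio": 2.0, "desirable": True},
    {"log_ratio": 2.0, "desirable": False},
]
losses = [kto_loss(e["log_ratio"], e["desirable"]) for e in examples]
print(losses)
```

With the same log-ratio, the desirable example gets the lower loss: raising the likelihood of a desirable output is rewarded, while raising the likelihood of an undesirable one is penalized.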

Further Reading:

Aligning Diffusion Models by Optimizing Human Utility (uses KTO with diffusion models)
KTO math derivations
Pretraining Language Models with Human Preferences (survey of RLHF before DPO)
Self-Rewarding Language Models (a different technique: self-play training)
Preference Tuning LLMs with Direct Preference Optimization Methods
Orca-Math (compares DPO and KTO)