arXiv: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
RLHF vs DPO
Online DPO trainer (TRL docs): https://huggingface.co/docs/trl/main/en/online_dpo_trainer
DPO trainer (TRL docs): https://huggingface.co/docs/trl/en/dpo_trainer
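The core idea behind DPO, as given in the paper linked above, is that the RLHF objective can be optimized directly with a classification-style loss over preference pairs, with no separate reward model or PPO loop. A minimal sketch of the per-example DPO loss follows; the function name and arguments are illustrative, not TRL's API, and each argument is assumed to be the summed token log-probability of a full response under the policy or the frozen reference model:

```python
import math

def dpo_loss(pi_logp_chosen: float, pi_logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    beta controls how far the policy may drift from the reference model;
    each *_logp argument is log p(response | prompt) under that model.
    """
    chosen_margin = pi_logp_chosen - ref_logp_chosen      # log-ratio for preferred response
    rejected_margin = pi_logp_rejected - ref_logp_rejected  # log-ratio for dispreferred response
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)); log1p keeps it numerically stable
    return math.log1p(math.exp(-logits))
```

When the policy equals the reference model the margins cancel and the loss is log 2; training drives the chosen margin above the rejected one, pushing the loss toward zero. TRL's `DPOTrainer` (second link above) implements this loss in batched form over tokenized preference datasets.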