Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Rewards

Zhengran Ji¹, Boyuan Chen¹

¹ Duke University

Website | Paper | Video

Overview

Method

Result

Quick Start

  1. Clone the repository:
git clone https://github.com/generalroboticslab/Pref-GUIDE.git
  2. Install the CREW platform by following the instructions in CREW.
  3. Activate the conda environment:
conda activate crew
  4. Download the human feedback dataset from here, and extract it with the following commands:
tar -xvzf RL_checkpoint.tar.gz
cd RL_checkpoint
python unzip_data.py
cd ../
  5. Process the dataset for reward model training:
python process_data/process_data.py
  6. Train the preference-based reward model:
cd reward_model_training
bash train_model.sh
  7. Train the RL agent with the reward model:
cd CREW/crew-algorithms
bash ddpg.sh
  8. Evaluate the trained RL agent:
cd CREW/crew-algorithms
bash ddpg_eval.sh
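
The data-processing, training, and evaluation steps above can also be chained in one script. The following is a minimal sketch, not part of the repository. It assumes that every relative path in the steps above is resolved from the cloned Pref-GUIDE directory and that CREW is installed inside it. `REPO_ROOT`, `DRY_RUN`, `run_step`, and `run_pipeline` are hypothetical names introduced here for illustration. Each step runs in its own subshell, so a `cd` in one step does not affect the next. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: run the processing, training, and evaluation steps in order.
# Cloning, installing CREW, activating conda, and downloading the dataset
# are one-time setup and are expected to be done beforehand.
set -euo pipefail

# Assumption: REPO_ROOT is the cloned Pref-GUIDE directory and CREW lives inside it.
REPO_ROOT="${REPO_ROOT:-$PWD}"
# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

# run_step DIR CMD...: run CMD inside DIR (relative to REPO_ROOT) in a subshell,
# so the working directory of the calling shell never changes.
run_step() {
  local dir="$1"; shift
  echo "[${dir}] $*"
  if [ "$DRY_RUN" = "0" ]; then
    (cd "${REPO_ROOT}/${dir}" && "$@")
  fi
}

run_pipeline() {
  run_step . python process_data/process_data.py
  run_step reward_model_training bash train_model.sh
  run_step CREW/crew-algorithms bash ddpg.sh
  run_step CREW/crew-algorithms bash ddpg_eval.sh
}

run_pipeline
```

To execute the commands instead of printing them, run the script with `DRY_RUN=0` from the repository root, with the `crew` conda environment already active. Because of `set -e`, the script stops at the first step that fails.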

Acknowledgement

This work is supported by the ARL STRONG program under awards W911NF2320182, W911NF2220113, and W911NF2420215, and by gifts from BMW and OpenAI. We also thank Lingyu Zhang for helpful discussions.

Citation

If you find this paper helpful, please consider citing our work:

@misc{ji2025prefguidecontinualpolicylearning,
      title={Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning}, 
      author={Zhengran Ji and Boyuan Chen},
      year={2025},
      eprint={2508.07126},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2508.07126}, 
}      

About

This is the open-source implementation of Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning.
