For experiments involving InstructGPT. Currently used for documenting open research questions.

CarperAI/InstructGPT

BigModelName

This repository collects open questions about RLHF and InstructGPT as they pertain to BigModelName.

Open Questions

  • What is the preference rate of PPO vs PPO-ptx? Why was 27.8 chosen as the mixing factor (the pretraining loss coefficient in the InstructGPT paper) between the pre-training gradients and the PPO gradients?
  • What do the gradient norms and gradient noise scales look like for PPO grads vs pre-training grads?
  • How important is SFT pretraining on human-written completions?
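
To make the first two questions concrete, here is a minimal sketch of how the two gradients might be combined and compared. It assumes the PPO-ptx update direction is the PPO gradient plus the mixing factor times the pre-training gradient, and it uses the "simple" gradient noise scale (trace of the per-example gradient covariance divided by the squared norm of the mean gradient) as one possible measure. The function names, the flat-list gradient representation, and the toy numbers are all hypothetical and are not taken from any InstructGPT code.

```python
import math


def l2_norm(grad):
    """L2 norm of a gradient given as a flat list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def mix_gradients(ppo_grad, ptx_grad, gamma=27.8):
    """Combine gradients as ppo_grad + gamma * ptx_grad.

    Assumption: gamma plays the role of the mixing factor asked about
    above; 27.8 is only the default, not a recommendation.
    """
    return [p + gamma * q for p, q in zip(ppo_grad, ptx_grad)]


def simple_noise_scale(per_example_grads):
    """Rough estimate of tr(cov) / |mean gradient|^2.

    Assumption: the sample mean stands in for the true gradient, so the
    estimate is biased for small numbers of examples.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    mean = [sum(g[i] for g in per_example_grads) / n for i in range(dim)]
    # Sum of unbiased per-coordinate variances, i.e. the covariance trace.
    trace_cov = sum(
        sum((g[i] - mean[i]) ** 2 for g in per_example_grads) / (n - 1)
        for i in range(dim)
    )
    return trace_cov / sum(m * m for m in mean)


if __name__ == "__main__":
    # Toy per-example gradients (made-up numbers, for illustration only).
    ppo_grads = [[1.0, 0.0], [3.0, 0.0]]
    ptx_grad = [0.5, -1.0]
    ppo_mean = [2.0, 0.0]
    print("PPO grad norm:", l2_norm(ppo_mean))
    print("pre-training grad norm:", l2_norm(ptx_grad))
    print("mixed grad:", mix_gradients(ppo_mean, ptx_grad))
    print("PPO simple noise scale:", simple_noise_scale(ppo_grads))
```

Logging these quantities separately for the PPO batch and the pre-training batch would show whether the mixing factor mainly compensates for a difference in gradient scale between the two terms, which is one possible reading of the first question.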
