Releases: lucidrains/PaLM-rlhf-pytorch

0.0.32

20 Dec 20:52
allow for fine-tuning the entire model without LoRA, project management

0.0.31

20 Dec 20:38
in case weight decay is helpful

0.0.30

20 Dec 19:11
add gradient clipping for actor critic

0.0.29

20 Dec 14:23
be able to generate multiple sequences at the end and pick the one wi…

0.0.28

20 Dec 01:25
make sure non-binned rewards model works

0.0.27

20 Dec 01:18
make sure non-binned rewards model works

0.0.26

20 Dec 01:16
more cleanup

0.0.25

20 Dec 01:08
it's just plain PPO now

0.0.24

20 Dec 01:04
cleanup further

0.0.23

20 Dec 00:36
rename to ActorCritic and cleanup