Open
Description
There are several confusing model naming schemes. The most confusing is currently for weight decay.
- $\beta = 0$: TinyStories-01x0064_01n
- $\beta = 0.0005$: TinyStories-01x0064_01d
- $\beta = 0.1$: TinyStories-01x0064_01L
This is inconsistent across different
We will likely have to write a script that auto-clones all of the models to update them. Then we'll have to correct our experimental scripts and artifacts themselves.
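As a starting point for such a script, here is a minimal sketch of the renaming step only. It assumes the final character of each model name encodes the weight decay, as in the three examples above. The target format (`_wd<value>`) and the function names are hypothetical placeholders, not an agreed convention, and the cloning and uploading of models is left out entirely.

```python
# Suffix-to-weight-decay mapping, taken from the three examples in this issue.
WEIGHT_DECAY_BY_SUFFIX = {"n": 0.0, "d": 0.0005, "L": 0.1}


def proposed_name(old_name: str) -> str:
    """Map an old model name to a hypothetical explicit one.

    Assumes the last character of `old_name` is the weight-decay suffix.
    The `_wd<value>` target format is a placeholder, not a decided scheme.
    """
    base, suffix = old_name[:-1], old_name[-1]
    if suffix not in WEIGHT_DECAY_BY_SUFFIX:
        raise ValueError(f"Unknown weight-decay suffix {suffix!r} in {old_name!r}")
    beta = WEIGHT_DECAY_BY_SUFFIX[suffix]
    # The 'g' format drops trailing zeros: 0.0 -> '0', 0.0005 -> '0.0005'.
    return f"{base}_wd{beta:g}"


def build_rename_plan(old_names):
    """Return {old_name: new_name} so the plan can be reviewed before cloning."""
    return {name: proposed_name(name) for name in old_names}


if __name__ == "__main__":
    plan = build_rename_plan([
        "TinyStories-01x0064_01n",
        "TinyStories-01x0064_01d",
        "TinyStories-01x0064_01L",
    ])
    for old, new in plan.items():
        print(f"{old} -> {new}")
```

Building the full old-to-new mapping before touching any model means the same plan can be reused when correcting the experimental scripts and artifacts, and unknown suffixes fail loudly instead of being renamed silently.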