
revamped val (bootstrap ci to support seq and tok intervals, many perplexity evals) #142

Merged: 10 commits merged into main from bootstrap_ci_pt2 on Dec 11, 2023

Conversation

@sagadre (Collaborator) commented on Dec 8, 2023

No description provided.

@sagadre changed the title from "modified val bootstrap ci to support seq and tok intervals" to "revamped val (bootstrap ci to support seq and tok intervals, many perplexity evals)" on Dec 8, 2023
@achalddave (Collaborator) left a comment:

Left a couple of comments, not blocking.

parser.add_argument("--eps", type=float, default=None, help="Adam epsilon.")
parser.add_argument("--lr", type=float, default=5.0e-4, help="Learning rate.")
parser.add_argument("--beta1", type=float, default=0.9, help="Adam beta 1.")
parser.add_argument("--beta2", type=float, default=0.95, help="Adam beta 2.")
Collaborator:

We're changing the default beta2 here compared to before, right?

sagadre (Collaborator, Author):

Yeah, but a beta2 of 0.95 has been pretty standard for most open_lm runs so far.
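For context, a minimal sketch of how these flags typically map onto the optimizer, assuming an AdamW-style optimizer as in most open_lm runs; the stand-in model and the eps fallback are illustrative, not the repository's actual setup:

import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--eps", type=float, default=None, help="Adam epsilon.")
parser.add_argument("--lr", type=float, default=5.0e-4, help="Learning rate.")
parser.add_argument("--beta1", type=float, default=0.9, help="Adam beta 1.")
parser.add_argument("--beta2", type=float, default=0.95, help="Adam beta 2.")
args = parser.parse_args([])  # use the defaults shown above

model = torch.nn.Linear(8, 8)  # stand-in model for illustration
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=args.lr,
    betas=(args.beta1, args.beta2),          # (0.9, 0.95) by default
    eps=args.eps if args.eps is not None else 1e-8,  # fall back when --eps is unset
)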

"loss_upper_95": upper,
"loss_sequences_lower_95": lower_seq,
"loss_sequences_upper_95": upper_seq,
"loss_tokens_lower_95": lower_tok,
Collaborator:

Should we also save the loss tokens mean?

sagadre (Collaborator, Author):

Given that we never see partial sequences, I think this will just equal loss.
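For context, a minimal sketch of how per-sequence and per-token bootstrap intervals of this kind can be computed; the function and variable names are illustrative and not the PR's actual implementation:

import numpy as np

def bootstrap_ci(seq_losses, weights=None, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a (possibly weighted) mean of per-sequence losses."""
    rng = np.random.default_rng(seed)
    seq_losses = np.asarray(seq_losses, dtype=np.float64)
    weights = np.ones_like(seq_losses) if weights is None else np.asarray(weights, dtype=np.float64)
    means = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(seq_losses), size=len(seq_losses))  # resample sequences with replacement
        means.append(np.average(seq_losses[idx], weights=weights[idx]))
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

seq_losses = np.random.rand(512)               # stand-in per-sequence mean losses
seq_tokens = np.random.randint(1, 2049, 512)   # stand-in token counts per sequence

lower_seq, upper_seq = bootstrap_ci(seq_losses)                      # sequence-level interval
lower_tok, upper_tok = bootstrap_ci(seq_losses, weights=seq_tokens)  # token-weighted interval

This also illustrates the reply above: when every sequence contributes the same number of tokens (no partial sequences), the token-weighted mean coincides with the plain sequence mean, so a separate loss tokens mean would just duplicate loss.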

@Vaishaal (Contributor) left a comment:

Minor nit, but LGTM.

open_lm/main.py (outdated)

if is_master(args):
with fsspec.open(os.path.join(args.checkpoint_path, "results.jsonl"), "a") as f:
f.write(json.dumps(evaluation_metrics))
Vaishaal (Contributor):

It might be good to do this as one write. Since we are doing this over fsspec, a network error could leave us with a corrupt jsonl or something (without the newline).

Vaishaal (Contributor):

As in, do the dumps + "\n" at once so it's idempotent and keeps the jsonl consistent. In general, append mode is scary.
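A minimal sketch of the suggested change, using the same names as the snippet above; whether this matches the final diff is an assumption:

if is_master(args):
    # Build the full line (record + trailing newline) before touching the file,
    # then issue a single write so a failed write is less likely to leave a
    # record without its newline.
    line = json.dumps(evaluation_metrics) + "\n"
    with fsspec.open(os.path.join(args.checkpoint_path, "results.jsonl"), "a") as f:
        f.write(line)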

sagadre (Collaborator, Author):

sg

@sagadre merged commit 73e7b37 into main on Dec 11, 2023 (2 checks passed).
@sagadre deleted the bootstrap_ci_pt2 branch on December 11, 2023 at 18:45.
3 participants