feat(evals): add Braintrust evals package #1

Open
barryroodt wants to merge 1 commit into main from feat/braintrust-evals

@barryroodt
Owner

Summary

  • New evals/ package: separate npm workspace with 5 JS scorers for refine-skill output quality
  • Env-knob model swap (REFINE_EVAL_MODEL) + dryrun mode (REFINE_EVAL_DRYRUN=1) so you can smoke-test without burning API credits
  • NEXT_STEPS.md updated with install, free-model validation, fixture-growth, and future CI-gate todos
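
As a rough illustration of the env-knob behavior described above, here is a minimal sketch of how the two variables might be read. The helper name `resolveEvalConfig`, its return shape, and the rule that an API key is required only outside dryrun are assumptions made for illustration. They are not taken from the PR's code.

```javascript
// Hypothetical helper (not from the PR): reads the env knobs named in the summary.
// Assumption: dryrun is enabled only when REFINE_EVAL_DRYRUN is exactly "1".
// Assumption: BRAINTRUST_API_KEY is required only when dryrun is off.
function resolveEvalConfig(env = process.env) {
  const dryRun = env.REFINE_EVAL_DRYRUN === "1";
  // null means "no override"; the caller would fall back to its own default model.
  const model = env.REFINE_EVAL_MODEL || null;

  if (!dryRun && !env.BRAINTRUST_API_KEY) {
    throw new Error(
      "BRAINTRUST_API_KEY is required unless REFINE_EVAL_DRYRUN=1"
    );
  }
  return { dryRun, model };
}
```

Passing `env` as a parameter instead of reading `process.env` directly keeps the helper easy to exercise in a smoke test without mutating the real environment.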

Test plan

  • `cd evals && npm install`
  • `REFINE_EVAL_DRYRUN=1 npm run eval:dryrun` — confirm scorers wire up and run without API key
  • With `BRAINTRUST_API_KEY` + `REFINE_EVAL_MODEL=gemini-2.5-flash`: `npm run eval` — confirm push to Braintrust
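
To make the dryrun check above concrete, the sketch below shows one way scorers could be exercised with no API key. Everything here is hypothetical: the scorer `nonEmptyOutput`, the canned `dryrunTask`, and the `runDryrun` loop are illustrative stand-ins, not the five scorers or the runner added by this PR. The only convention assumed is that a scorer receives `{ input, output, expected }` and returns a `{ name, score }` object with a score between 0 and 1.

```javascript
// Hypothetical scorer (not one of the PR's five): 1 if output is a non-empty string.
function nonEmptyOutput({ output }) {
  const ok = typeof output === "string" && output.trim().length > 0;
  return { name: "non_empty_output", score: ok ? 1 : 0 };
}

// Hypothetical dryrun task: returns a canned string instead of calling a model.
// Kept synchronous for brevity; a real task would likely be async.
function dryrunTask(input) {
  return `refined: ${input}`;
}

// Runs every scorer against every case and collects the score objects.
function runDryrun(cases, scorers) {
  return cases.map(({ input, expected }) => {
    const output = dryrunTask(input);
    const scores = scorers.map((scorer) => scorer({ input, output, expected }));
    return { input, output, scores };
  });
}

const results = runDryrun([{ input: "draft skill", expected: "" }], [nonEmptyOutput]);
console.log(JSON.stringify(results[0].scores));
```

A loop like this only confirms that scorers are wired up and return well-formed results; it says nothing about output quality, which still needs a real model run.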

🤖 Generated with Claude Code

Separate npm package under evals/ with 5 scorers and dryrun mode.
Env-knob model swap via REFINE_EVAL_MODEL; defaults to local dryrun.
NEXT_STEPS.md updated with install/validation/CI-gate todos.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 participant