improve hardhat-deploy-migration skill description for better agent matching #591
Hi @fernandezbaptiste, thanks for the PR. Is that the only improvement the tool could find?
b96f280 to a54fc34
|
Kilo Code Review could not run — your account is out of credits. Add credits or switch to a free model to enable reviews on this change. |
|
hey @wighawag, just pushed a much more substantial update - the skill went from
if you want to squeeze out even more, you can run |
|
How do you test the skill accuracy? What do you mean by 94%? Having evals for skills would be great, but their accuracy depends on the input being tested.
|
thanks for the question @wighawag - you're right that real accuracy depends on the input being tested. that's exactly why we also built scenario-level evals - they pressure test a skill in realistic agent scenarios, measuring how an agent performs on a task with vs without the skill loaded.

we ran some for your skill here: hardhat-deploy-migration evals - your skill performs super well on the scenarios we tested! kudos on the quality of the content.

the real differentiator is building your own scenarios - you know your repo's edge cases best. spin up claude code and point it to this doc - it'll walk you through creating scenarios specific to hardhat-deploy. takes a few minutes and you'll know exactly where the skill holds up and where it doesn't. happy to help if you hit any snags.
|
The link https://tessl.io/registry/skills/github/wighawag/hardhat-deploy/hardhat-deploy-migration/evals leads to a 404. EDIT: it works now.
hey @wighawag, thanks for building hardhat-deploy. really like the deployment plugin approach for Ethereum projects. kudos on passing 1.2k stars! just starred it.

ran your hardhat-deploy-migration skill through some evals and noticed a few things that were pretty quick to improve (moving up to ~71% agent performance):

references/ files for progressive disclosure

these were easy changes to bring the skill in line with what performs well against Anthropic's best practices. honest disclosure, I work at tessl.io where we build tooling around this. not a pitch, just fixes that were straightforward to make.

you've got 1 skill here, if you want to do it yourself, spin up Claude Code and run tessl skill review. alternatively, let me know if you'd like an automatic review in your repo via GitHub Actions. it doesn't require signup, and this means you and your contributors get an instant quality signal before you have to review yourself.