Merged
Changes from 8 commits
31 commits
e34e91b  Add semantic-layer bundled skill (escherize, Jun 1, 2026)
3681c8c  Add data-transformation bundled skill (escherize, Jun 1, 2026)
580084d  Add robot-data-engineer parent router skill (working title) (escherize, Jun 1, 2026)
fabd4eb  Update data-transformation skill from upstream gist (escherize, Jun 1, 2026)
748e062  Update data-transformation skill from upstream gist (escherize, Jun 1, 2026)
8cb6ef9  Tweaks to data-transformation skill (galdre, Jun 1, 2026)
ae1d025  Disambiguate skill altitude in descriptions (escherize, Jun 1, 2026)
83d7834  Add data-analysis skill; route to it from robot-data-engineer (escherize, Jun 1, 2026)
3a6494d  Add shared contract to router; point children at it (escherize, Jun 2, 2026)
964b272  Robot Data Analysts should give more context (galdre, Jun 2, 2026)
152bc54  Harden robot-data-engineer skills from demo feedback (escherize, Jun 2, 2026)
6899934  Adopt skillsaw; clear all 17 lint warnings (escherize, Jun 2, 2026)
9883144  Dedup question-context rule in router Shared Contract (escherize, Jun 2, 2026)
7fc2c63  Add lint:skills script (skillsaw via uvx) (escherize, Jun 2, 2026)
6bccd2f  remove overfitting, eg fivetran mention (escherize, Jun 2, 2026)
2a56836  Plan mode in data-transformation (galdre, Jun 2, 2026)
5f2a391  m. (galdre, Jun 2, 2026)
2af86a2  More review feedback. (galdre, Jun 2, 2026)
2e1bc02  Last piece of feedback (galdre, Jun 2, 2026)
15edc3a  Manual shrinking of data-transformation skill (galdre, Jun 2, 2026)
83282c2  Possible pretty-print transform fix (galdre, Jun 2, 2026)
830f6ab  Promote robot-data-engineer as the front-door entrypoint (escherize, Jun 2, 2026)
e420a65  Clear skillsaw warnings: bump skill token limit, drop hedge (escherize, Jun 2, 2026)
9c37035  Release 0.1.11 (escherize, Jun 2, 2026)
89f23be  Lead marketplace plugin description with the data-analyst journey (escherize, Jun 2, 2026)
8e687ba  Drop generic 'analyze X' trigger from plugin description (escherize, Jun 2, 2026)
dd75b07  Fix skills e2e: add data-analysis to bundled skill list (escherize, Jun 3, 2026)
a181d3e  Tighten robot-data-engineer scope (galdre, Jun 3, 2026)
b38bf25  Fix card e2e: tolerate version-dependent bad-Database-ID error (escherize, Jun 3, 2026)
4657be2  Fix card update e2e: tolerate version-dependent PUT validation (escherize, Jun 3, 2026)
58fa27a  format test files (escherize, Jun 3, 2026)
15 changes: 9 additions & 6 deletions README.md
@@ -1338,12 +1338,15 @@ mb skills path core # one path

Bundled skills:

-| Name | Use |
-| ----------- | -------------------------------------------------------------------------------------- |
-| `core` | Top-level guide: auth, flag conventions, output flags, body input, every command group |
-| `transform` | Authoring and running transforms (native SQL + MBQL 5), iteration, run inspection |
-| `document` | Authoring document bodies: the TipTap JSON tree, embedding cards, entity links |
-| `git-sync` | Round-tripping Metabase content to/from a git remote |
+| Name | Use |
+| --------------------- | --------------------------------------------------------------------------------------------- |
+| `core` | Top-level guide: auth, flag conventions, output flags, body input, every command group |
+| `transform` | Authoring and running transforms (native SQL + MBQL 5), iteration, run inspection |
+| `data-transformation` | Raw, normalized source database → clean, wide, analysis-ready tables for a non-technical user |
+| `semantic-layer` | Turning clean tables into reusable segments, measures, and metrics for a non-technical user |
+| `robot-data-engineer` | Front-door router for the whole journey (raw → tables → definitions → dashboards); name TBD |
+| `document` | Authoring document bodies: the TipTap JSON tree, embedding cards, entity links |
+| `git-sync` | Round-tripping Metabase content to/from a git remote |

Discovery surfaces:

3 changes: 3 additions & 0 deletions skill-data/core/SKILL.md
@@ -150,6 +150,9 @@ This core file is enough for any single-command task. Load the relevant skill **
- **`mbql`** — authoring or fixing any MBQL query body: `mb query`, a card `dataset_query`, a transform `source.query`, a measure/segment `definition`, "aggregate and group by", reading `--dry-run` errors. The query-body reference.
- **`viz`** — choosing a card's `display` and authoring `visualization_settings`: "make it a bar chart", "set the pie dimension/metric", "format this column as currency", "the card renders as a table instead of a chart". The presentation counterpart to `mbql`.
- **`transform`** — "create a transform", "run a transform", authoring transform body JSON, run inspection.
- **`data-transformation`** — the higher-level workflow: turning a raw, normalized source database into a small set of clean, wide, analysis-ready tables for a non-technical user — "clean up", "flatten", "denormalize", "make sense of this database", "build analysis-ready tables". Wraps `transform` (the mechanics) with the investigate → propose → build flow.
- **`semantic-layer`** — turning clean tables into reusable definitions: "make this filter reusable", "define active customers / net revenue / MRR officially", "create a segment / measure / metric", "so everyone uses the same definition". Builds on `mbql` (the definition bodies) and `transform` (widen a table first when a definition needs more than one).
- **`robot-data-engineer`** — the front-door router for the whole journey (raw data → clean tables → reusable definitions → dashboards) for a non-technical user: "make sense of my data", "build a data model", "go from raw data to a dashboard", "be my data analyst". Detects where the user is and routes to `data-transformation` / `semantic-layer` / `visualization`. (Working title — name TBD.)
- **`git-sync`** — "import the latest changes", "export to git", "git sync", "dirty check", "stash before pulling".

If a task spans more than one, load each. Specialized skills assume the conventions above and won't repeat them. `mb skills list` enumerates everything on the installed version.
63 changes: 63 additions & 0 deletions skill-data/data-analysis/SKILL.md
@@ -0,0 +1,63 @@
---
name: data-analysis
description: Answer real questions from clean, analysis-ready tables and hand back a plain-language report — not a chart-building task, an answer-finding one. Claude reads the available tables, turns the user's question ("who registered", "what did people say they want", "which option won") into queries, runs them against the live instance, sanity-checks the numbers, and writes up findings the user can read and trust. Works over tables that are already clean (wide, human-readable) — survey/registration answers, event signups, customer lists, anything where the user has a question and the data already holds the answer. Use whenever someone wants to "answer questions about my data", "report on who registered / signed up / responded", "what did people say", "analyze X", "explore this data", "find patterns", "summarize the responses", or "build me a report". For a non-technical user who knows their domain. This is the strategy skill for investigating clean data and reporting findings; if the question needs charts/dashboards instead of a written answer, use the `visualization` skill; if the tables are still raw and messy, use `data-transformation` first.
allowed-tools: Read, Write, Edit, Bash, AskUserQuestion
---

# Data Analysis

The user has a question and clean data that already holds the answer. Your job: find the answer, check it's right, and hand it back in plain language. You're an analyst, not a dashboard builder — the deliverable is a **trustworthy written answer**, optionally backed by a saved question they can re-open.

This skill assumes the tables are already clean (wide, human-readable). If they're raw and normalized — lots of `*_field`/`*_choice` lookups, coded columns, JSON blobs — stop and route to `data-transformation` first; don't analyze on top of a mess.

---

## The loop

For each question the user asks:

1. **Find where the answer lives.** List tables (`mb table list`, `mb db schema-tables <db> <schema>`). Read the columns (`mb table fields <id>`). Clean datasets often ship the same facts two ways — a **wide** table (one row per thing, easy to read) and a **long** table (one row per attribute, easy to aggregate over many-valued answers). Pick the one that fits the question: per-person facts → wide; "which option was most popular" across a multi-select → long.

2. **Turn the question into a query.** Write it, run it (`mb query`). Start small — a `count(*)` and a couple of sample rows to confirm you're pointed at the right table and the columns mean what you think. Then write the real query.

3. **Sanity-check before you believe it.** A number with no cross-check is a guess. Confirm row counts against a total you trust, watch for nulls/blanks inflating or deflating a percentage, and re-read the column you grouped on — a `type/Category` column with "confirmed"/"cancelled" means your "how many registered" answer depends on which statuses you counted. State the denominator.

4. **Report in plain language.** Lead with the answer, then how you got it. Numbers get context ("9 of 10 confirmed"), not bare figures. For free-text answers, quote a few real responses rather than only counting them — the words are the value.
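
As a rough illustration of steps 2 to 4, here is a minimal sketch that uses Python's built-in `sqlite3` in place of the live instance. The `registrations` table, its columns, and every row in it are hypothetical stand-ins; in practice the same SQL would be run through `mb query` against the real table.

```python
import sqlite3

# Hypothetical wide table: one row per registrant (all names and values are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registrations (name TEXT, role TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?, ?)",
    [
        ("Ana", "Engineer", "confirmed"),
        ("Ben", "Engineer", "confirmed"),
        ("Cy", None, "confirmed"),
        ("Dee", "Engineer", "cancelled"),
        ("Eli", "", "confirmed"),
    ],
)

# Step 2: start small. Confirm the table and its row count before the real query.
total = conn.execute("SELECT count(*) FROM registrations").fetchone()[0]

# Step 3: see which statuses exist before deciding what "registered" means.
by_status = dict(
    conn.execute("SELECT status, count(*) FROM registrations GROUP BY status")
)

# Count nulls and blanks explicitly so they cannot silently skew a percentage.
missing_role = conn.execute(
    "SELECT count(*) FROM registrations WHERE role IS NULL OR trim(role) = ''"
).fetchone()[0]

confirmed = by_status["confirmed"]
engineers = conn.execute(
    "SELECT count(*) FROM registrations "
    "WHERE status = 'confirmed' AND role = 'Engineer'"
).fetchone()[0]

# Step 4: lead with the answer and state the denominator.
report = (
    f"{engineers} of {confirmed} confirmed registrants are engineers "
    f"({missing_role} of {total} rows have no role recorded)."
)
print(report)
```

Dropping the `status = 'confirmed'` filter would count the cancelled engineer as well, which is why the report names the status it counted.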

---

## What to ask the user up front

Don't over-interrogate, but settle the things that change the answer:

- **Scope.** All-time or a window? Everyone, or only confirmed/active? A "how many registered" with no status filter and a "how many *confirmed*" are different numbers — pick the one they mean, and say which you used.
- **Cut.** Do they want the headline number, or the number broken down (by role, by company, by version)? A breakdown is usually one `GROUP BY` away and far more useful.
- **Form of the answer.** A number in chat? A short written digest? A saved question they can re-open and refilter? If they want something durable or visual, that's the `visualization` skill — hand off.

When genuinely unsure which interpretation they mean, ask — never silently pick one and present it as the answer.

---

## Survey / registration data — the common shape

A lot of "analyze who registered / what did people say" work lands on event or survey data, which has a recognizable shape worth calling out:

- A **per-registrant wide table** — name, company, role, status, plus one column per single-answer question. Use it for "who registered", rosters, breakdowns by role/version/company, and any per-person filter.
- A **long answers table** — one row per (registrant, question, answer). Use it for **multi-select** questions (one person picks several options, so they can't flatten into one wide column) and for "which option was chosen most". Group by the question text, then by the answer value.
- **Question definitions** — the catalog of what was asked, the answer choices, free-text vs single vs multi. Read this first to know which questions exist and how each is typed before you start counting.
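
To make the long-table case concrete, the sketch below counts a multi-select question with Python's built-in `sqlite3`. The `answers` table, its column names, and the sample rows are assumptions invented for the example; the real schema should be read from the instance first.

```python
import sqlite3

# Hypothetical long answers table: one row per (registrant, question, answer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (registrant TEXT, question TEXT, answer TEXT)")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?, ?)",
    [
        ("Ana", "Topics of interest", "Dashboards"),
        ("Ana", "Topics of interest", "Modeling"),
        ("Ben", "Topics of interest", "Dashboards"),
        ("Cy", "Topics of interest", "Embedding"),
        ("Cy", "Topics of interest", "Dashboards"),
    ],
)

# Group by the question text, then by the answer value.
rows = conn.execute(
    """
    SELECT question, answer, count(DISTINCT registrant) AS picks
    FROM answers
    GROUP BY question, answer
    ORDER BY question, picks DESC, answer
    """
).fetchall()

# Denominator: people who answered the question, not the number of answer rows.
respondents = conn.execute(
    "SELECT count(DISTINCT registrant) FROM answers WHERE question = ?",
    ("Topics of interest",),
).fetchone()[0]

for question, answer, picks in rows:
    print(f"{question}: {answer} picked by {picks} of {respondents}")
```

Because one person can pick several options, the per-option counts add up to more than the number of respondents, so the shares of a multi-select question should not be expected to sum to 100%.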

Three report families cover most asks:

1. **Roster** — who registered, with the facts that matter (company, role, status). A filtered, ordered read of the wide table.
2. **Distribution** — how the group splits on a single-select (role, version, customer-or-not). A `GROUP BY` with counts; the agent-facing answer is "X% picked A, Y% picked B".
3. **Open-ended digest** — what people said in free-text ("what do you want to learn / teach / discuss"). Small N usually — list the actual answers, don't just count them; the responses are the point.
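
The distribution and open-ended digest families might look like the following sketch, again with Python's built-in `sqlite3` standing in for the live instance. The `registrants` table, the `wants_to_learn` column, and all rows are hypothetical.

```python
import sqlite3

# Hypothetical wide table with one single-select column and one free-text column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrants (name TEXT, role TEXT, wants_to_learn TEXT)"
)
conn.executemany(
    "INSERT INTO registrants VALUES (?, ?, ?)",
    [
        ("Ana", "Engineer", "How to model events"),
        ("Ben", "Engineer", None),
        ("Cy", "Analyst", "Embedding dashboards"),
        ("Dee", "Analyst", "Better date filters"),
        ("Eli", "Engineer", "Alerts"),
    ],
)

# Distribution: a GROUP BY with counts and each group's share of all rows.
distribution = conn.execute(
    """
    SELECT role,
           count(*) AS n,
           round(100.0 * count(*) / (SELECT count(*) FROM registrants), 1) AS pct
    FROM registrants
    GROUP BY role
    ORDER BY n DESC, role
    """
).fetchall()

# Open-ended digest: list the actual responses, skipping nulls and blanks.
digest = conn.execute(
    """
    SELECT name, wants_to_learn
    FROM registrants
    WHERE wants_to_learn IS NOT NULL AND trim(wants_to_learn) <> ''
    ORDER BY name
    """
).fetchall()

for role, n, pct in distribution:
    print(f"{pct}% are {role} ({n} of {sum(d[1] for d in distribution)})")
for name, text in digest:
    print(f'{name}: "{text}"')
```

The digest deliberately returns the response text rather than a count, and the number of people who left the question blank can be reported alongside it.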

---

## Don't

- **Don't analyze raw, un-cleaned tables.** If the data is normalized/coded/JSON, route to `data-transformation` first and analyze the clean output.
- **Don't report a number you didn't sanity-check.** No denominator, no null-check → no answer.
- **Don't silently pick a scope.** "Registered" vs "confirmed", all-time vs window — state which you used, or ask.
- **Don't build charts/dashboards here.** A written answer (and maybe one saved question) is the deliverable; if they want it visual, that's `visualization`.
- **Don't only count free-text.** Quote the real responses — the words carry the insight a count throws away.