Robot data engineer skills (parent + transforms + semantic-layer + viz) #12
Merged
Add a semantic-layer skill that turns clean, analysis-ready tables into reusable Metabase segments (saved filters), measures (saved aggregations), and metrics (official numbers) for a non-technical domain user.
- New skill-data/semantic-layer/SKILL.md (auto-discovered by the skill loader)
- Cross-reference it from the core skill's specialized-skills list
- Document it in the README bundled-skills table
- Add it to the e2e bundled-skill golden list (now seven)
Add the higher-level data-transformation workflow skill: raw, normalized source database -> a small set of clean, wide, analysis-ready Metabase transforms, for a non-technical domain user. Wraps the mechanical transform skill with an investigate -> propose -> build -> verify flow.
- New skill-data/data-transformation/SKILL.md (auto-discovered)
- Cross-reference from the core skill's specialized-skills list
- README bundled-skills table
- e2e golden list (seven -> eight)

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>
TODO before merge: revisit naming. Parent skill working title is |
Add the front-door router for the robot-data-scientist journey: a light wrapper that detects where the user is (raw data / clean tables / ready to chart), sets up auth + the autonomy slider once, then routes to the specialized child skill (data-transformation / semantic-layer / visualization) and hands off. Stays small by design: it dispatches, it doesn't do the work. Parent owns only the end-of-journey hard stop; children self-manage their in-stage gates. Name is a working title (robot-data-engineer), TBD before merge.
- New skill-data/robot-data-engineer/SKILL.md (auto-discovered)
- Cross-reference from the core skill's specialized-skills list
- README bundled-skills table
- e2e golden list (eight -> nine)
Sync Timothy's latest revision: two new hard rules (confirm non-obvious business rules in plain terms before baking them in; flag sensitive personal data rather than silently carrying it), a sensitive-data prudential call, and expanded guidance on decoding, soft-delete filtering, writing table/column descriptions back to Metabase, and one-pass encoding normalization.

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>
Sync Timothy's latest revision: a new hard rule against overwriting an existing table or another transform's output (check the target name is free first), table-name agreement in the iterate phase (propose + confirm free before building), and a new cleaning checklist section whose governing rule is surface-what-you-find rather than silently fix it.

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>
Add a strategy-vs-mechanics carve-out to the trigger clause of the two strategy skills so the model picks the right altitude:
- data-transformation: points single-transform work at the transform skill
- semantic-layer: points raw segment/measure command mechanics at core

Mirror transform's existing downward ref to core with an upward breadcrumb to data-transformation in its body.
New data-analysis sub-skill covers the fourth journey stage: answering real questions from clean tables and handing back a written report (distinct from charting, which stays in visualization). Wire it into the robot-data-engineer router's description, journey list, and route table. Also fixes a latent parse bug in the router frontmatter: an unquoted "light router: it works" made the YAML parser read the description as a mapping, so parseFrontmatter returned null and discoverSkills silently dropped the skill -- robot-data-engineer never appeared in `mb skills list`. Reworded the colon to an em-dash.
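The parse bug described above comes down to a colon followed by a space inside an unquoted frontmatter value. A minimal sketch of a pre-flight check for that pattern follows. `findRiskyPlainScalars` is a hypothetical helper written for illustration, not the CLI's `parseFrontmatter`, and it only looks at single-line `key: value` pairs rather than implementing YAML.

```typescript
// Hypothetical helper (assumption, not part of the CLI): returns the keys whose
// unquoted single-line value contains a colon followed by whitespace or end of
// line, which a YAML parser may treat as the start of a nested mapping.
function findRiskyPlainScalars(frontmatter: string): string[] {
  const risky: string[] = [];
  for (const line of frontmatter.split(/\r?\n/)) {
    const match = /^([A-Za-z_][\w-]*):\s+(.*)$/.exec(line);
    if (!match) continue;
    const key = match[1] ?? "";
    const value = match[2] ?? "";
    const isQuoted = /^["']/.test(value); // quoted scalars may contain ": "
    const isBlock = /^[|>]/.test(value); // block scalars are parsed differently
    if (!isQuoted && !isBlock && /:(\s|$)/.test(value)) {
      risky.push(key);
    }
  }
  return risky;
}

// Illustrative frontmatter; the description text is shortened from the commit message.
const broken = [
  "name: robot-data-engineer",
  "description: A light router: it works",
].join("\n");
const quoted = [
  "name: robot-data-engineer",
  'description: "A light router: it works"',
].join("\n");

console.log(findRiskyPlainScalars(broken)); // flags only "description"
console.log(findRiskyPlainScalars(quoted)); // flags nothing
```

Quoting the value or rewording the colon both clear the check; the commit took the rewording route.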
ignacio-mb approved these changes Jun 2, 2026
ignacio-mb reviewed Jun 2, 2026
Hoist the cross-cutting rules every child skill must follow into a single Shared Contract section in robot-data-engineer: audience, jargon list (avoid normalize/grain; ERD/foreign key fine; explain wide/long on first use), PII handling (ask before showing rows; default to aggregates), capability limits (name what the CLI can't do instead of erroring into raw SQL), the autonomy slider, and the final hard stop. Each child (data-analysis, data-transformation, semantic-layer, visualization) gets a top-of-file up-pointer: a one-line summary plus an instruction to load the router's Shared Contract. The summary stands on its own so a directly-invoked child still gets the gist if the pointer is skipped. Drop the duplicated autonomy-slider prompt from semantic-layer, keeping only its stage-specific application of the modes.
escherize commented Jun 2, 2026
Comment on lines +67 to +69
> - I found a mismatch in ...
> - This matters because ...
> - Here's what I was thinking, but I need to check ...
Add three cross-cutting rules to the router's Shared Contract, drawn from two live demo runs (Swoogo, Luma):
- Permission-denied discipline: on a denied query, stop -- never silently substitute a different readable table and pass its numbers off as the answer (the incident where an Account-table question got answered with Salesforce data). Diagnose the likely cause in plain terms, offer to search for a readable look-alike, surface any match as a confirm question, and hand control back -- no GRANT statements, no profile-switching Claude can't reliably execute.
- Scratch files go in ./.scratch, never /tmp (better perms, persists, user-reviewable). Swept the /tmp examples in core, transform, document, and mbql to match.
- Talking to the user: don't reference things they never saw, assume they read only the last ~30 lines, give questions full context, keep permission requests to one plain sentence.

Rework the router's discovery section to ask the user where the data lives before crawling (asymmetry: name a db -> ask the schema; name a table -> ask the db), give the efficient command ladder, and offer a sync when a table is missing.

De-duplicate auth: core's Auth & profiles section is the single source; the router keeps one line (it's the front door, may run before core loads) and data-transformation defers to core.
Wire skillsaw (uvx) as a deterministic linter for the skill collection and clear every warning it reported:
- Content quality: reword two weak-language hedges (ideally/correctly) to concrete behavior; flip the two negative-only "Don't" items (mbql, robot-data-engineer) to lead with the positive action.
- Descriptions: compress the four over-long ones (robot-data-engineer, mbql, semantic-layer, data-analysis) under the 1024-char / 200-token limits, keeping the distinctive trigger phrases and dropping only redundant ones. No unquoted colon-space (would break frontmatter parse).
- Bodies: a précis pass over the seven over-budget skills -- cut restated lead-ins, filler transitions, emphasis padding, and prose that merely restated an adjacent code block. Every rule, command, footgun, and worked example is kept; the dense skills were already mostly substance.

Add .skillsaw.yaml pinned to 0.11.4 with an honest token ceiling (skill.warn 5100 -- above the largest leaf skill's de-fluffed floor, still catching real future bloat) and skill-description.warn 200. Add a strict skillsaw job to the Lint workflow.
Timothy's "Robot Data Analysts should give more context" (964b272) added a "Questions must carry their own context" paragraph that overlapped a bullet I'd added in the same Shared Contract. Keep his fuller version (it carries the recap template) as canonical, drop the redundant bullet, and point to it from the "Talking to the user" list so the rule lives in one place.
The whole-journey router was buried at the bottom of core's
specialized-skills list, ranked as a peer of git-sync/mbql, and the
autoloaded discovery stub only pointed at core. An outcome-seeking
user ("make sense of my data", "build a dashboard") had no direct path
to the router that's meant to run first.
- Stub: add a journey-intent fast path straight to
`mb skills get robot-data-engineer` before loading the dense core ref.
- Core: hoist robot-data-engineer to the top of the list with a
"start here for anything bigger than one command" lead-in, add
data-analysis to its routing targets, drop the "name TBD" marker.
- README: drop "name TBD" from the bundled-skills table.
- context-budget warn 5100 -> 6000 (data-transformation's honest floor grew to ~5,805 tokens).
- Reword 'flag it when appropriate' -> 'flag it on sight' to drop the hedging the content-weak-language rule flags.
Ships the robot-data-engineer entrypoint promotion. release.yml auto-publishes on push to main only when package.json's version is not yet on npm, so the bump is required for the skill changes to reach installed CLIs via mb skills get.
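The publish gate described above reduces to one membership test. The sketch below illustrates that rule only: `shouldPublish` and its inputs are hypothetical, the real check lives in release.yml (not shown in this PR), and how the list of published versions is fetched from npm is left out.

```typescript
// Hypothetical illustration of the gate: publish only when the local
// package.json version is absent from the versions already on npm.
function shouldPublish(
  localVersion: string,
  publishedVersions: readonly string[],
): boolean {
  return !publishedVersions.includes(localVersion);
}

// Version numbers are made up for the example.
console.log(shouldPublish("1.4.0", ["1.2.0", "1.3.0"])); // true: the bump is new
console.log(shouldPublish("1.3.0", ["1.2.0", "1.3.0"])); // false: already on npm
```

Under this rule a push to main without a version bump publishes nothing, which is why the skill changes needed the bump to reach installed CLIs.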
An unaware user describes a goal ('make sense of my data', 'be my
data analyst'), and Claude matches it against the plugin description
to decide relevance. The old description was CRUD/CLI-only, so a
journey-shaped request matched nothing. Lead with the journey trigger
phrases (mirrored from robot-data-engineer); keep CRUD + git-sync as
the second half.
'analyze X' over-triggers — it matches any analysis request (logs,
code, a CSV, an image), not just Metabase data work. The remaining
data-anchored phrases ('make sense of my data', 'answer questions
about my data', 'report on who registered', 'set up analytics for X')
already cover the intent without the false positives.
The data-analysis skill was added to skill-data/ but the e2e test's BUNDLED_VISIBLE_NAMES still listed nine skills, so list/path/get-all assertions and the unknown-skill 'available' message failed across all E2E matrix lanes.
v58-61 leak the app-DB constraint (NULL not allowed for column "DATABASE_ID"); head validates at the query layer first (missing or invalid Database ID). Accept either exact substring.
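A version-tolerant assertion of this kind can be sketched as a check against a short list of accepted substrings. The two message strings come from the commit message above; `matchesAcceptedError` and the sample stderr values are hypothetical and do not reproduce the actual e2e test code.

```typescript
// The two error texts named in the commit message: the app-DB constraint
// leak on v58-61 and the query-layer validation message on head.
const ACCEPTED_ERRORS = [
  'NULL not allowed for column "DATABASE_ID"',
  "missing or invalid Database ID",
] as const;

// Hypothetical helper: passes when stderr contains either exact substring.
function matchesAcceptedError(stderr: string): boolean {
  return ACCEPTED_ERRORS.some((needle) => stderr.includes(needle));
}

// Sample stderr strings are invented for the example.
console.log(matchesAcceptedError('error: NULL not allowed for column "DATABASE_ID"')); // true
console.log(matchesAcceptedError("error: missing or invalid Database ID")); // true
console.log(matchesAcceptedError("error: permission denied")); // false
```

Matching exact substrings rather than a loose pattern keeps the test strict about the two known outcomes while still passing on every lane of the matrix.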
head validates dataset_query at the query layer (exit 1, missing or invalid Database ID); v58-61 accept it as an opaque map (exit 0). Assert the pre-flight bypass instead of a fixed server outcome.
Builds in 4 new skills:
- robot-data-engineer: the parent skill that guides you through an e2e workflow
- data-transformation: raw data -> clean tables
- semantic-layer: clean tables -> reusable definitions
- data-analysis: clean tables -> answers and reports

Adds a skill quality linter. It makes sure the files don't get too long and runs on CI.
Below is AI generated:
Bundles the robot data scientist skill suite into the CLI: a non-technical domain user, driving `mb` through Claude, goes from raw data → analysis-ready tables → a shared semantic vocabulary → dashboards, without leaving the conversation.

Four skills, one parent + three children:
- `mb robot-data-engineer`?)
- `data-transformation` skill; partly covered by the existing bundled `transform` skill)

Building all four on this branch; merging once the suite is coherent.
semantic-layer (done)
Why
The CLI already ships a `transform` skill (raw → clean wide tables). The next step, turning those tables into a shared vocabulary the org reuses, had no skill.

The vocabulary maps to three Metabase features the CLI already has verbs for:
- Segments (`mb segment create`): a saved filter ("active customers")
- Measures (`mb measure create`): a saved aggregation ("net revenue")
- Metrics (`mb card create`, `type: metric`): an official, collection-living number ("MRR")

Approach
Modeled on the `data-transformation` skill: hard-rules-vs-prudential-calls split, quiet-investigate → propose-in-plain-language → iterate → build-verify-handback, audience built for a non-technical domain user.

Three decisions worth calling out:
table_id) are avoided.

Validation
Verified the three create-verbs and their `definition` shapes against a live staging instance (segment = flat MBQL 5 query; measure = single aggregation; metric = card `type: metric`). Doc claims linked inline.

Changes
- `skill-data/semantic-layer/SKILL.md`: auto-discovered by the skill loader's dir scan; no registration code.
- `skill-data/core/SKILL.md`: added to the specialized-skills list.
- `README.md`: bundled-skills table.
- `tests/e2e/skills.e2e.test.ts`: golden list (six → seven).

`typecheck` + `format:check` pass.