Merged
Changes from 2 commits
12 changes: 11 additions & 1 deletion .claude-plugin/marketplace.json
@@ -18,10 +18,20 @@
"email": "support@datarecce.io"
}
},
{
"name": "recce",
"source": "./plugins/recce",
"description": "Intelligent data review workflow for dbt developers",
"version": "0.2.0",
"author": {
"name": "DataRecce",
"email": "support@datarecce.io"
}
},
{
"name": "recce-dev",
"source": "./plugins/recce-dev",
"description": "Intelligent data review workflow for dbt developers",
"description": "Internal development and testing tools for the Recce project",
"version": "0.1.0",
"author": {
"name": "DataRecce",
4 changes: 2 additions & 2 deletions plugins/recce-dev/.claude-plugin/plugin.json
@@ -1,13 +1,13 @@
{
"name": "recce-dev",
"version": "0.1.0",
"description": "Intelligent data review workflow for dbt developers — tracks model changes and triggers progressive Recce validation",
"description": "Internal development and testing tools for the Recce project — MCP E2E validation, benchmarking, and plugin QA",
"author": {
"name": "DataRecce",
"url": "https://datarecce.io"
},
"homepage": "https://github.com/DataRecce/recce-claude-plugin",
"repository": "https://github.com/DataRecce/recce-claude-plugin",
"license": "MIT",
"keywords": ["recce", "dbt", "data-validation", "data-quality", "data-review"]
"keywords": ["recce", "testing", "e2e", "mcp-validation", "internal"]
}
28 changes: 6 additions & 22 deletions plugins/recce-dev/README.md
@@ -1,33 +1,17 @@
# recce-dev

Intelligent data review workflow for dbt developers.
Internal development and testing tools for the Recce project.

## What it does

recce-dev automatically tracks dbt model file changes and triggers progressive data validation using Recce. When you modify a dbt model, the plugin records the change. After your dbt run or build, it dispatches an agent that runs lineage diff, row count diff, and schema diff in sequence — producing an actionable summary with risk level before changes leave your machine.
This plugin provides tools for Recce developers to validate the `recce` plugin's MCP integration, benchmark agent performance, and run E2E validation flows. It is **not** intended for end users of Recce.

## Components

- **Skill:** `/recce-review` — triggers the data review workflow; dispatches the recce-reviewer agent with tracked model context
- **Agent:** `recce-reviewer` — runs progressive diff analysis (lineage, row count, schema) and produces a risk-assessed summary
- **Hooks:**
- `SessionStart` — detects dbt project environment and starts the Recce MCP server if prerequisites are met
- `PostToolUse` — suggests `/recce-check` after dbt run/build commands
- `PreToolUse` — tracks modified dbt model files before Write/Edit operations
- **MCP Servers:**
- `recce-dev` — Recce SSE server on `http://localhost:8081/sse` (local, project-scoped)
- `recce-docs` — Recce documentation stdio server (local path, for doc lookups)
- **Skill:** `/mcp-e2e-validate` — runs a full E2E validation of the `recce` plugin's event chain (SessionStart → model tracking → dbt suggestion → /recce-review → cleanup) and produces a performance benchmark report

## Requirements

- **Recce >= 1.39.0** installed in the project's virtual environment (`pip install "recce>=1.39.0"`) — SSE transport (`--sse` flag) requires this version
- The virtual environment must be activated before starting a Claude Code session so `recce` is on PATH
- dbt project with two environments configured (base + target) for comparison diffs
- Base artifacts generated: `dbt docs generate --target-path target-base` on the comparison branch

## Known Limitations

- **Port hardcoded in `.mcp.json`**: The MCP server URL is `http://localhost:8081/sse`. If you override `mcp_port` in settings (e.g., `.claude/recce-dev/settings.json`), the actual server starts on the configured port but `.mcp.json` still points to 8081. Claude Code MCP config is static — dynamic port resolution requires a future Claude Code feature.
- **Mid-session plugin install**: Installing the plugin mid-session does not activate hooks or MCP tools. Start a new Claude Code session after installation for full functionality.
- **recce-docs MCP path**: Uses a local symlink path (`../../packages/recce-docs-mcp/dist/cli.js`) that breaks after marketplace install. Deferred to v2 (MKTD-02).
- **HTTP-only MCP**: The `recce-dev` MCP server uses `http://localhost:8081/sse` (not HTTPS). This is expected for a local SSE server.
- The `recce` plugin must be installed alongside this plugin
- A dbt project with Recce configured (same requirements as the `recce` plugin)
- Recce installed in the project's virtual environment
173 changes: 173 additions & 0 deletions plugins/recce-dev/skills/mcp-e2e-validate/SKILL.md
@@ -0,0 +1,173 @@
---
name: mcp-e2e-validate
description: >
This skill should be used when the user asks to "validate MCP", "run E2E",
"benchmark MCP performance", "test the plugin flow", "compare MCP versions",
"驗證 MCP", "跑 E2E", or wants to verify the recce plugin's full event
chain (SessionStart → model tracking → dbt suggestion → /recce-review → cleanup)
works end-to-end and measure agent performance metrics.
version: 0.1.0
---

# /mcp-e2e-validate — MCP Integration E2E Validation & Benchmark

Validate the recce plugin's full event chain against a real dbt project and produce a performance benchmark report. Optionally compare against a baseline to quantify improvements across recce versions or PR changes.

**Dependencies:** This skill relies on the sibling `recce` plugin's scripts (`start-mcp.sh`, `stop-mcp.sh`, `check-mcp.sh`) and hooks (`track-changes.sh`, `suggest-review.sh`). It also dispatches the `recce-reviewer` agent.

**Cross-plugin path:** The `recce` plugin is a sibling under the same parent directory. Use `RECCE_PLUGIN_ROOT` (derived below) to reference its scripts:

```bash
RECCE_PLUGIN_ROOT="${CLAUDE_PLUGIN_ROOT}/../recce"
```
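Before relying on the derived path, it may help to confirm that the sibling plugin actually ships the scripts named above. The following is a minimal sketch; the function name `missing_recce_scripts` and its exit-status convention are illustrative assumptions, not part of the `recce` plugin.

```bash
# Sketch: print each required script that is missing under <root>/scripts.
# Returns 0 when all three are present, 1 otherwise.
# The function name and return convention are hypothetical.
missing_recce_scripts() {
  local root=$1 name missing=0
  for name in start-mcp.sh stop-mcp.sh check-mcp.sh; do
    if [ ! -f "$root/scripts/$name" ]; then
      printf '%s\n' "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

A caller could run `missing_recce_scripts "$RECCE_PLUGIN_ROOT"` and abort early if it prints anything.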

---

## Inputs

Parse user input for optional parameters:

- **`--baseline`**: Previous benchmark metrics for comparison (e.g., `"tool_uses=35 tokens=30311 duration_s=483"`)
- **`--model`**: Model to edit for testing (default: first `.sql` file under `models/staging/`)
- **`--marker`**: Comment marker to inject (default: `-- recce-e2e-validation`)
- **`--skip-dbt`**: Skip the `dbt run` step if models were already built

If no parameters are provided, use the defaults and run the full flow.
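As a rough illustration of the parsing described above, the sketch below handles the four options with a `while`/`case` loop. It assumes each value-taking option receives its value as the next argument (for example `--model path`); the function name `parse_e2e_args` and the variable names are hypothetical.

```bash
# Sketch: parse the optional parameters into shell variables.
# Assumes "--flag value" syntax; names are illustrative only.
parse_e2e_args() {
  BASELINE=""
  MODEL=""                           # empty means "use the default staging model"
  MARKER="-- recce-e2e-validation"   # default marker from the list above
  SKIP_DBT=false
  while [ $# -gt 0 ]; do
    case "$1" in
      --baseline|--model|--marker)
        # Guard against a trailing flag with no value.
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        case "$1" in
          --baseline) BASELINE=$2 ;;
          --model)    MODEL=$2 ;;
          --marker)   MARKER=$2 ;;
        esac
        shift 2 ;;
      --skip-dbt) SKIP_DBT=true; shift ;;
      *) echo "unknown option: $1" >&2; return 2 ;;
    esac
  done
}
```

Calling `parse_e2e_args` with no arguments leaves every variable at its default, which corresponds to the full flow.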

---

## Step 1: Pre-flight

Run the pre-flight check script:

```bash
bash "${CLAUDE_PLUGIN_ROOT}/skills/mcp-e2e-validate/scripts/preflight.sh"
```

Parse KEY=VALUE output. Abort if any `BLOCK=` line appears — show the message verbatim.

Handle warnings:
- `SSE_SUPPORT=false` → Inform the user that an editable install may require `rm -rf site-packages/recce/` followed by `pip install -e ".[mcp]"`. See memory for details.
- `PORT_STATUS=occupied_by_other` → Suggest changing the port in `.claude/recce/settings.json`
- `STALE_FILES=found` → Auto-clean: `rm -f /tmp/recce-mcp-*.pid /tmp/recce-changed-*.txt`

Record `RECCE_VERSION` and `PORT` for the report.
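One way to read individual values out of the KEY=VALUE output is sketched below. The helper name `kv_get` is hypothetical, and the sketch assumes one `KEY=VALUE` pair per line with the first match winning.

```bash
# Sketch: print the value for a key from KEY=VALUE lines on stdin.
# Returns 1 when the key is absent. The helper name is illustrative.
kv_get() {
  local key=$1 line
  while IFS= read -r line; do
    case "$line" in
      "$key"=*)
        # Strip everything up to and including the first "=".
        printf '%s\n' "${line#*=}"
        return 0 ;;
    esac
  done
  return 1
}
```

With the script output captured in a variable, `printf '%s\n' "$output" | kv_get BLOCK` succeeds only when a `BLOCK=` line is present, and the same helper can fetch `RECCE_VERSION` and `PORT` for the report.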

---

## Step 2: Start MCP Server

Derive the recce plugin root and run:

```bash
RECCE_PLUGIN_ROOT="${CLAUDE_PLUGIN_ROOT}/../recce"
bash "${RECCE_PLUGIN_ROOT}/scripts/start-mcp.sh"
```

- If `STATUS=STARTED` or `STATUS=ALREADY_RUNNING` → record `PORT` and `PID`, proceed.
- If `ERROR=` → abort with error details.

Verify with health check:

```bash
bash "${RECCE_PLUGIN_ROOT}/scripts/check-mcp.sh"
```

Confirm `RUNNING=true` before proceeding.
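The proceed/abort decision for the start script's output could be sketched as follows. The function name `classify_start_output` and the `proceed`/`abort`/`unknown` labels are assumptions for illustration; only the `STATUS=` and `ERROR=` keys come from the step above.

```bash
# Sketch: classify start-mcp.sh output read from stdin.
# Prints "abort" on any ERROR= line, "proceed" on STATUS=STARTED or
# STATUS=ALREADY_RUNNING, and "unknown" otherwise. Labels are hypothetical.
classify_start_output() {
  local line status=unknown
  while IFS= read -r line; do
    case "$line" in
      ERROR=*) echo abort; return 0 ;;
      STATUS=STARTED|STATUS=ALREADY_RUNNING) status=proceed ;;
    esac
  done
  echo "$status"
}
```

Treating any `ERROR=` line as decisive, even alongside a `STATUS=` line, is a conservative choice made for this sketch.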

---

## Step 3: Inject Test Edit (Tier 1 Trigger)

1. Select the target model file (from `--model` or default staging model).
2. Read the file and record its original content.
3. Append the marker comment (`-- recce-e2e-validation`) on a new line at the end.
4. Apply the change with the Edit tool, which triggers the `track-changes.sh` PostToolUse hook.
5. Verify tracking:

```bash
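# A pipeline's exit status comes from `cut`, so `||` never reaches md5sum; pick the tool explicitly.
HASH_CMD=$(command -v md5 || command -v md5sum)
PROJECT_HASH=$(printf '%s' "$PWD" | "$HASH_CMD" | cut -c1-8)
cat "/tmp/recce-changed-${PROJECT_HASH}.txt"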
```

- File exists and contains the edited model path → **Tier 1 PASS**
- File missing → **Tier 1 FAIL** (record and continue)
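The Tier 1 verdict can be expressed as a small check. This is a sketch only: the function name `tier1_check` is hypothetical, and it assumes that a tracking file which exists but does not list the model also counts as a failure, a case the criteria above do not spell out.

```bash
# Sketch: report the Tier 1 result for a tracking file and a model path.
# Assumption: "file exists but path not listed" is treated as FAIL.
tier1_check() {
  local tracked_file=$1 model_path=$2
  # -F matches the path literally; -q suppresses output.
  if [ -f "$tracked_file" ] && grep -Fq -- "$model_path" "$tracked_file"; then
    echo "Tier 1 PASS"
  else
    echo "Tier 1 FAIL"
  fi
}
```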

---

## Step 4: dbt Run (Tier 2 Trigger)

Skip if `--skip-dbt` was specified.

Run dbt on the modified model and downstream:

```bash
dbt run -s {model_name}+
```

- dbt completes with `PASS` → record model count. The `suggest-review.sh` hook should inject a review suggestion into context. **Tier 2 PASS**.
- dbt fails → **Tier 2 FAIL** (record error, continue to cleanup)
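The `{model_name}+` selector can be derived from the edited file's path. The sketch below assumes the usual dbt convention that a model is named after its `.sql` file; the function name `dbt_selector` is hypothetical.

```bash
# Sketch: turn a model file path into a "model plus downstream" selector.
# Assumes the model name equals the file name without its .sql extension.
dbt_selector() {
  local name
  name=$(basename "$1" .sql)
  printf '%s+\n' "$name"
}
```

For example, `dbt run -s "$(dbt_selector models/staging/stg_orders.sql)"` would select `stg_orders` and everything downstream of it.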

---

## Step 5: Dispatch Review Agent

Dispatch the `recce-reviewer` agent with the tracked model context:

> "Changed models (from tracked file): {model_name}. Focus review on these models using selector: {model_name}+"

**Capture the full agent result**, including the `<usage>` block. Extract:
- `tool_uses` — number of MCP tool calls
- `total_tokens` — total token consumption
- `duration_ms` — wall-clock time

Check agent output for `## Data Review Summary`. Validate against pass criteria in `references/pass-criteria.md`:
- Concrete row count numbers (non-zero integers)
- Risk level present (LOW/MEDIUM/HIGH)
- Model names in summary
- No MCP tool errors
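Two of the checks above are simple text matches and can be sketched in shell. This covers only the summary heading and the risk level; row counts, model names, and MCP errors would need separate checks. The function name `review_summary_ok` is hypothetical.

```bash
# Sketch: succeed only if the agent output has the summary heading
# and a whole-word risk level. Partial check; name is illustrative.
review_summary_ok() {
  local output=$1
  printf '%s\n' "$output" | grep -q '^## Data Review Summary' || return 1
  # -w requires LOW/MEDIUM/HIGH to appear as whole words.
  printf '%s\n' "$output" | grep -Ewq 'LOW|MEDIUM|HIGH' || return 1
  return 0
}
```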

---

## Step 6: Cleanup

Execute in order:

1. **Revert model edit** — restore the file to its original content (remove marker comment).
2. **Stop MCP server**:
```bash
RECCE_PLUGIN_ROOT="${CLAUDE_PLUGIN_ROOT}/../recce"
bash "${RECCE_PLUGIN_ROOT}/scripts/stop-mcp.sh"
```
3. **Clean tracked files**:
```bash
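# A pipeline's exit status comes from `cut`, so `||` never reaches md5sum; pick the tool explicitly.
HASH_CMD=$(command -v md5 || command -v md5sum)
PROJECT_HASH=$(printf '%s' "$PWD" | "$HASH_CMD" | cut -c1-8)
rm -f "/tmp/recce-changed-${PROJECT_HASH}.txt"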
```
4. **Stale state check** — verify no `/tmp/recce-mcp-*.pid` or `/tmp/recce-changed-*.txt` remain.
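The stale state check in item 4 could look like the sketch below. The function name `stale_files` is hypothetical, and the directory argument exists only so the check can be exercised outside `/tmp`.

```bash
# Sketch: print any leftover PID or tracking files in a directory
# (default /tmp). Empty output means no stale state remains.
stale_files() {
  local dir=${1:-/tmp} f
  for f in "$dir"/recce-mcp-*.pid "$dir"/recce-changed-*.txt; do
    # An unmatched glob stays literal, so test for existence first.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
  return 0
}
```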

---

## Step 7: Produce Benchmark Report

Generate the report using the template in `references/pass-criteria.md`.

If `--baseline` was provided, compute deltas:
- `delta = current - baseline`
- `delta_pct = (delta / baseline) * 100`

Present negative deltas (improvements) with emphasis.
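The delta arithmetic can be sketched with `awk`, which handles the floating-point percentage. The function name `format_delta`, the `n/a` fallback for a zero baseline, and the output layout are assumptions chosen to resemble the `{±N} ({±%})` cells in the report template.

```bash
# Sketch: format "delta (delta_pct%)" for a current and a baseline value.
# Prints "n/a" when the baseline is zero to avoid dividing by zero.
format_delta() {
  awk -v current="$1" -v baseline="$2" 'BEGIN {
    if (baseline == 0) { print "n/a"; exit }
    delta = current - baseline
    printf "%+d (%+.1f%%)\n", delta, delta / baseline * 100
  }'
}
```

Using the baseline from the `--baseline` example, `format_delta 30 35` yields `-5 (-14.3%)` for a hypothetical run that needed 30 tool calls instead of 35.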

Output the full report to the user. If all pass criteria are met, end with **Verdict: PASS**. Otherwise list failures.

---

## Additional Resources

### Reference Files

- **`references/pass-criteria.md`** — Detailed pass/fail criteria per section, performance metrics extraction guide, and the benchmark report template.

### Scripts

- **`scripts/preflight.sh`** — Pre-flight environment checks (dbt project, recce version, SSE support, port availability, stale files). Outputs KEY=VALUE lines.
@@ -0,0 +1,77 @@
# E2E Pass Criteria

## Section-Level Checks

| Section | Check | Pass Condition |
|---------|-------|----------------|
| Pre-flight | dbt project detected | `DBT_PROJECT=true` |
| Pre-flight | recce installed with SSE | `SSE_SUPPORT=true` |
| Pre-flight | Artifacts exist | `TARGET_EXISTS=true` AND `TARGET_BASE_EXISTS=true` |
| MCP Startup | start-mcp.sh succeeds | `STATUS=STARTED` or `STATUS=ALREADY_RUNNING` |
| MCP Health | check-mcp.sh confirms | `RUNNING=true` |
| Tier 1 Track | Edit hook records model | File `/tmp/recce-changed-{hash}.txt` contains edited model path |
| Tier 2 Suggest | dbt run triggers suggestion | Hook injects "Consider running /recce-review" context |
| Review Agent | Summary produced | Output contains `## Data Review Summary` |
| Review Agent | Concrete row counts | At least one model shows non-zero integer in both base and current |
| Review Agent | Risk level present | Summary contains `LOW`, `MEDIUM`, or `HIGH` |
| Review Agent | Model names present | Changed model name appears in summary |
| Review Agent | No MCP errors | All MCP tool calls complete without connection/timeout errors |
| Cleanup | Model reverted | Edited file restored to original |
| Cleanup | MCP stopped | stop-mcp.sh returns `STATUS=STOPPED` |
| Stale State | No leftovers | No `/tmp/recce-mcp-*.pid` or `/tmp/recce-changed-*.txt` remaining |

## Performance Metrics to Capture

From the review agent dispatch result, extract:

| Metric | Source | Format |
|--------|--------|--------|
| `tool_uses` | Agent result `<usage>` block | Integer |
| `total_tokens` | Agent result `<usage>` block | Integer |
| `duration_ms` | Agent result `<usage>` block | Integer (convert to seconds for display) |

## Benchmark Report Template

```markdown
## MCP E2E Benchmark Report

**Date:** {YYYY-MM-DD}
**recce version:** {version}
**Project:** {dbt_project_name}
**Environment:** {adapter_type} (dual-env | single-env)
**Test model:** {model_name}

### Event Chain Results

| Step | Result | Notes |
|------|--------|-------|
| Pre-flight | {PASS/FAIL} | {details} |
| MCP Startup | {PASS/FAIL} | Port {port}, PID {pid} |
| Tier 1 Tracking | {PASS/FAIL} | |
| Tier 2 Suggestion | {PASS/FAIL} | |
| Review Agent | {PASS/FAIL} | Risk: {level} |
| Cleanup | {PASS/FAIL} | |

### Agent Performance

| Metric | Value |
|--------|-------|
| Tool calls | {N} |
| Tokens consumed | {N} |
| Wall-clock time | {N}s |

### Comparison (if baseline provided)

| Metric | Baseline | Current | Delta |
|--------|----------|---------|-------|
| Tool calls | {N} | {N} | {±N} ({±%}) |
| Tokens | {N} | {N} | {±N} ({±%}) |
| Time | {N}s | {N}s | {±N}s ({±%}) |

### Data Review Summary

{Paste the agent's ## Data Review Summary output here}

### Verdict: {PASS / FAIL}
{If FAIL: list which criteria failed}
```