Roll up self-healing data collection stack#63

Closed
giaphutran12 wants to merge 41 commits into main from codex/self-healing-stack-rollup

@giaphutran12
Collaborator

@giaphutran12 giaphutran12 commented May 23, 2026

Summary

  • consolidate the self-healing/data-collection stack into one draft PR against main
  • merge current origin/main into the rollup branch so reviewers do not have to walk the chain of 30+ stacked PRs
  • keep the old stacked PRs as review/audit history

Review protocol

This is an integration rollup/audit anchor, not a normal line-by-line PR. It is intentionally large. Review it by invariants:

  • /populate must go through the self-healing runner, not the old clear-then-agent path
  • bad candidates must be rejected and counted as benchmark failures
  • row commits must happen only after a successful tick
  • real-row commit damage must be capped at 100 rows/hour per dataset by default
  • recipe promotion must require source-backed rows and valid evidence
  • collection/Mengzhe runtime must stay adapter-shaped so its behavior can migrate into Mastra
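
The rejection and commit-after-tick invariants above can be sketched roughly as follows. This is a minimal illustration, not the runner in this PR: `Candidate`, `TickResult`, `isSourceBacked`, and `runTick` are hypothetical names, and the rule that any rejected candidate fails the whole tick is an assumption made for the example.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface Candidate {
  data: Record<string, unknown>;
  sourceUrl?: string; // evidence backing the row
}

interface TickResult {
  committed: Candidate[];
  rejected: number;
  benchmarkPassed: boolean;
}

// A candidate counts as source-backed here if it carries an http(s) URL.
function isSourceBacked(candidate: Candidate): boolean {
  return typeof candidate.sourceUrl === "string" && /^https?:\/\//.test(candidate.sourceUrl);
}

function runTick(candidates: Candidate[], commit: (rows: Candidate[]) => void): TickResult {
  const accepted = candidates.filter(isSourceBacked);
  // Every rejected candidate is counted, and (by assumption) fails the benchmark.
  const rejected = candidates.length - accepted.length;
  const benchmarkPassed = rejected === 0;
  // Rows are committed only after the tick as a whole has succeeded.
  if (benchmarkPassed && accepted.length > 0) {
    commit(accepted);
  }
  return { committed: benchmarkPassed ? accepted : [], rejected, benchmarkPassed };
}
```

Keeping the commit as the last step means a failed tick leaves the dataset untouched, which is the property the invariant is meant to check.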

What is included

  • Mastra populate runtime with source-backed row validation
  • self-healing recipe service, runner, CLI, cron-friendly dataset-id mode, and verifier
  • collection/Mengzhe runtime adapter and benchmark lane
  • process trace, Playwright readiness diagnostics, browser-action trace ingestion, and benchmark gates
  • Agent provenance diagnostics and rejected-candidate benchmark failure gate
  • commit-path row cap: default 100 rows/hour per dataset
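
One way to picture the commit-path cap is a sliding one-hour window per dataset. The sketch below is an assumption about how such a cap could work, not the code in this branch: `RowCommitCap`, `allow`, and the in-memory `Map` are hypothetical, and only the default of 100 rows/hour per dataset comes from the description above.

```typescript
// Hypothetical sketch; only the 100 rows/hour default is taken from the PR text.
const DEFAULT_ROWS_PER_HOUR = 100;
const HOUR_MS = 60 * 60 * 1000;

class RowCommitCap {
  // datasetId -> timestamps (ms) of rows committed within the last hour
  private readonly commits = new Map<string, number[]>();
  private readonly limit: number;

  constructor(limit: number = DEFAULT_ROWS_PER_HOUR) {
    this.limit = limit;
  }

  /** Returns how many of the `requested` rows may be committed at time `now`. */
  allow(datasetId: string, requested: number, now: number = Date.now()): number {
    // Drop timestamps that have aged out of the one-hour window.
    const recent = (this.commits.get(datasetId) ?? []).filter((t) => now - t < HOUR_MS);
    const granted = Math.max(0, Math.min(requested, this.limit - recent.length));
    for (let i = 0; i < granted; i++) {
      recent.push(now);
    }
    this.commits.set(datasetId, recent);
    return granted;
  }
}
```

A caller would commit only the first `granted` rows and leave the rest for a later tick. An in-memory map does not survive restarts, so a real cap would more plausibly count rows in the database.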

Evidence

Local verification:

  • npm --prefix backend test
  • npm --prefix backend run build
  • git diff --check
  • make verify-self-healing

GitHub checks currently cover secret/vulnerability/review automation only; they do not replace the local backend gates above.

Restack simulation

Disposable worktree /private/tmp/bigset-restack-sim-20260523-0912 replayed child commits on top of this PR head:

  1. d818ba38b8785e07f7ec983717502645aa1f7171 (#64)
  2. 5154b4858b1b6356c766818c09b8e6da55bb587e (#65)

Replay was conflict-free. Verification on the simulated final stack:

  • targeted tests: 23/23 pass
  • npm --prefix backend run build: pass
  • git diff --check: pass
  • make verify-self-healing: pass, including backend tests 94/94

Notes

  • No auto-merge.
  • This branch resolves the main conflicts by keeping the self-healing /populate path and not restoring the old destructive clear-then-agent workflow as the HTTP populate path.
  • This still does not generate Playwright scripts. The next real work is producer-side explicit browser_actions/agent_browser_actions; a compiler can then turn those into replayable Playwright.
  • After this lands, restack child PRs onto fresh origin/main before merging them.

Simantak Dabhade and others added 30 commits May 21, 2026 21:07
Introduces the "Clear & Populate" flow: an AI agent (Claude Sonnet 4.6
via OpenRouter) searches the web using TinyFish APIs, fetches page
content, and inserts real data into datasets row by row.

Backend:
- Mastra populate workflow (clear rows → build prompt → run agent)
- Populate agent with 7 tools: 5 database CRUD (insert, list, get,
  update, delete) + 2 web (search_web via TinyFish Search API,
  fetch_page via TinyFish Fetch API)
- All tools return structured errors so the agent can self-correct
- Data keys are sanitized to strip stray quotes/backticks from LLM output
- Fetch responses capped at 15K chars to protect agent context window
- Convex client uses anyApi to avoid cross-project imports in Docker
- POST /populate route with Clerk JWT auth
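
The key sanitizing and fetch capping described in the list above might look roughly like this. It is a sketch under assumptions: `sanitizeKey`, `sanitizeDataKeys`, and `capFetchResponse` are hypothetical names, and the exact characters stripped in the real tools may differ. Only the 15K character cap comes from the commit message.

```typescript
// Hypothetical helpers; only the 15K cap is taken from the commit message.
const MAX_FETCH_CHARS = 15000;

// Strip stray quotes/backticks (and surrounding whitespace) from a column key.
function sanitizeKey(key: string): string {
  return key.trim().replace(/^["'`]+|["'`]+$/g, "").trim();
}

// Apply sanitizeKey to every key of a row payload, leaving values untouched.
function sanitizeDataKeys(data: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(data)) {
    clean[sanitizeKey(key)] = value;
  }
  return clean;
}

// Truncate fetched page text so it cannot flood the agent's context window.
function capFetchResponse(text: string, maxChars: number = MAX_FETCH_CHARS): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}
```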

Frontend:
- "Clear & Populate" button on dataset detail page
- API client function in lib/backend.ts
- Rows appear in realtime via Convex reactive queries

Convex:
- New internal functions: datasetRows.get (query) and datasetRows.remove
  (mutation) for single-row read/delete

Infra:
- TINYFISH_API_KEY wired through docker-compose.dev.yml to backend
  and mastra services

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Enforce dataset ownership on POST /populate by querying Convex for
  the dataset and comparing ownerId to req.auth.userId before running
  the workflow (fixes authz gap)
- Remove raw row payloads from insert_row/update_row logs, log column
  count instead to avoid PII leakage
- Add 30s AbortController timeouts to both TinyFish fetch calls in
  web-tools.ts so they can't hang indefinitely
- Align PopulateResult type (rows → result) to match actual backend
  response shape
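
The timeout item above follows a common `AbortController` pattern, sketched here. `withAbortTimeout` and `fetchPage` are hypothetical names rather than the functions in web-tools.ts; the 30-second default mirrors the commit message.

```typescript
// Hypothetical helper; the real code in web-tools.ts may be structured differently.
async function withAbortTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number = 30000,
): Promise<T> {
  const controller = new AbortController();
  // Abort the in-flight operation once the deadline passes.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await run(controller.signal);
  } finally {
    // Always clear the timer so it cannot fire after completion.
    clearTimeout(timer);
  }
}

// Example use with the global fetch (available in Node 18+ and browsers).
function fetchPage(url: string): Promise<Response> {
  return withAbortTimeout((signal) => fetch(url, { signal }));
}
```

Passing the signal into `fetch` makes the request reject with an abort error at the deadline instead of hanging, so the tool can return a structured error to the agent.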

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Convex query for dataset lookup can throw on invalid IDs — wrapping
it in the existing try/catch ensures controlled 400 responses instead
of unhandled 500s.
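
A framework-free sketch of that handler shape, combined with the ownership check from the previous commit. All names here (`handlePopulate`, `getDataset`, `runWorkflow`, `HandlerResult`) are hypothetical, and the 404 and 403 status codes are assumptions. The commit messages only state that lookup failures become controlled 400 responses and that `ownerId` is compared to the authenticated user.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface Dataset {
  ownerId: string;
}

interface HandlerResult {
  status: number;
  body: Record<string, unknown>;
}

async function handlePopulate(
  datasetId: string,
  userId: string,
  getDataset: (id: string) => Promise<Dataset | null>,
  runWorkflow: (id: string) => Promise<unknown>,
): Promise<HandlerResult> {
  try {
    // The lookup sits inside the try block because an invalid ID can make it throw.
    const dataset = await getDataset(datasetId);
    if (!dataset) {
      return { status: 404, body: { error: "Dataset not found" } }; // status assumed
    }
    if (dataset.ownerId !== userId) {
      return { status: 403, body: { error: "Forbidden" } }; // status assumed
    }
    const result = await runWorkflow(datasetId);
    return { status: 200, body: { result } };
  } catch (err) {
    // Any thrown error becomes a controlled 400 instead of an unhandled 500.
    const message = err instanceof Error ? err.message : "Invalid request";
    return { status: 400, body: { error: message } };
  }
}
```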

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@giaphutran12
Collaborator Author

Closing stale draft cleanup PR; superseded by later BigSet work.
