Agent tools accept arbitrary IDs from LLM with admin privileges, no dataset scoping (#68)

## Summary

The populate agent's dataset tools (`insert_row`, `update_row`, `delete_row`) run with the Convex admin key but trust IDs provided by the LLM without verifying they belong to the dataset the workflow was authorized for.

## Details

The `/populate` endpoint in `backend/src/index.ts` correctly checks ownership:

```ts
const dataset = await convex.query(api.datasets.get, { id: parsed.data.datasetId });
if (dataset.ownerId !== req.auth.userId) {
  return reply.code(403).send({ error: "Not authorized to populate this dataset" });
}
```

But once the workflow starts, the agent tools don't enforce this boundary:

- **`insert_row`** (`backend/src/mastra/tools/dataset-tools.ts`) takes `datasetId` directly from LLM output and passes it to `internal.datasetRows.insert` using the admin key. Nothing validates that it matches the workflow's authorized dataset.
- **`update_row`** takes a `rowId` and calls `internal.datasetRows.update` without checking that the row belongs to the authorized dataset.
- **`delete_row`** takes a `rowId` and calls `internal.datasetRows.remove` (`ctx.db.delete(args.id)`) with no dataset scoping at all.

## Attack vector

The populate agent ingests web search results into its context (via `search_web` and `fetch_page` tools). A malicious website crafted to appear in search results could embed prompt injection payloads that instruct the agent to:

1. Insert rows into a different user's dataset (consuming their quota)
2. Update rows in a different dataset
3. Delete rows from a different dataset

All three succeed because the admin key bypasses all Convex auth checks and the tools don't validate dataset membership.
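
To make the failure mode concrete, here is a minimal in-memory sketch of an unscoped delete tool. It does not reproduce the real `delete_row` code; the `Row` shape, the `rows` map, `deleteRowUnscoped`, and the IDs are all hypothetical stand-ins, and the real tool would be an async call through the Convex admin client.

```typescript
// Hypothetical row shape; the real schema is not shown in this issue.
interface Row {
  id: string;
  datasetId: string;
}

// In-memory stand-in for the rows table: one row in the dataset the workflow
// was authorized for, one row in some other user's dataset.
const rows = new Map<string, Row>([
  ["row_authorized", { id: "row_authorized", datasetId: "dataset_authorized" }],
  ["row_other", { id: "row_other", datasetId: "dataset_other" }],
]);

// Unscoped tool: deletes whatever ID it is handed, the way an admin-key call
// with no membership check would.
function deleteRowUnscoped(args: { rowId: string }): boolean {
  return rows.delete(args.rowId);
}

// Arguments that injected page content could steer the model into emitting.
// Nothing in the tool knows the workflow was authorized only for "dataset_authorized".
const injectedArgs = { rowId: "row_other" };
const deleted = deleteRowUnscoped(injectedArgs);
```

In this sketch the call goes through and the other dataset's row is gone, because the only ownership check happened at the endpoint, before the agent started choosing IDs.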

## Suggested fix

Pass the authorized `datasetId` into the tool execution context (e.g., via Mastra's tool context or a closure) and validate inside each tool before any write:

- `insert_row`: assert `datasetId === authorizedDatasetId`
- `update_row`: after fetching the row, assert `row.datasetId === authorizedDatasetId`
- `delete_row`: after fetching the row, assert `row.datasetId === authorizedDatasetId`

This keeps the admin key's power scoped to the dataset the user actually owns.
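
The closure variant of these checks could look roughly like the sketch below. It is written against an in-memory store so it runs on its own; `InMemoryRowStore`, `createScopedDatasetTools`, and the argument shapes are hypothetical names, not the project's or Mastra's actual API. In the real tools the store calls would be the async `internal.datasetRows.*` mutations.

```typescript
// Hypothetical row shape and store, standing in for the Convex table and
// the internal.datasetRows mutations.
interface Row {
  id: string;
  datasetId: string;
  data: Record<string, unknown>;
}

class InMemoryRowStore {
  private rows = new Map<string, Row>();
  private nextId = 1;

  insert(datasetId: string, data: Record<string, unknown>): Row {
    const row: Row = { id: `row_${this.nextId++}`, datasetId, data };
    this.rows.set(row.id, row);
    return row;
  }
  get(id: string): Row | undefined {
    return this.rows.get(id);
  }
  update(id: string, data: Record<string, unknown>): void {
    const row = this.rows.get(id);
    if (row) row.data = { ...row.data, ...data };
  }
  remove(id: string): void {
    this.rows.delete(id);
  }
}

// Build the write tools once per workflow run, capturing the dataset ID that
// the /populate endpoint already authorized. The LLM never supplies this value.
function createScopedDatasetTools(store: InMemoryRowStore, authorizedDatasetId: string) {
  // Fetch the row and refuse to continue unless it belongs to the authorized dataset.
  const requireOwnedRow = (rowId: string): Row => {
    const row = store.get(rowId);
    if (!row || row.datasetId !== authorizedDatasetId) {
      throw new Error(`Row ${rowId} is not in the authorized dataset`);
    }
    return row;
  };

  return {
    insertRow(args: { datasetId: string; data: Record<string, unknown> }): Row {
      if (args.datasetId !== authorizedDatasetId) {
        throw new Error(`Dataset ${args.datasetId} is not the authorized dataset`);
      }
      return store.insert(authorizedDatasetId, args.data);
    },
    updateRow(args: { rowId: string; data: Record<string, unknown> }): void {
      requireOwnedRow(args.rowId);
      store.update(args.rowId, args.data);
    },
    deleteRow(args: { rowId: string }): void {
      requireOwnedRow(args.rowId);
      store.remove(args.rowId);
    },
  };
}
```

A run authorized for dataset `A` would call `createScopedDatasetTools(store, "A")` and hand only the returned functions to the agent; any `datasetId` or `rowId` that points outside `A` then throws before the store is touched. A stricter option is to drop `datasetId` from the `insert_row` input schema entirely so the model has nothing to get wrong.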

## Affected code

- `backend/src/mastra/tools/dataset-tools.ts` (all write tools)
- `frontend/convex/datasetRows.ts` (`update`, `remove` mutations — no dataset ownership check)

Introduced in #26.
