feat: defer item tagging to an external agent (phase 2) by jansitarski · Pull Request #2 · jansitarski/wardrowbe

jansitarski · 2026-06-19T17:54:44Z

Description

This is phase 2 of making the internal AI optional so the backend can run with internal generation disabled and defer work to an external agent (e.g. Claude via an MCP server). Phase 1 added the capability switches and /capabilities; this phase makes item tagging a first-class, externally-ownable surface.

Today tagging happens implicitly via the internal vision model, and there is no way to (a) leave an item untagged for something else to tag, or (b) record whether tags came from the machine or a person. This adds an explicit tagging lifecycle and a server-derived write origin:

New fields on clothing_items: tagging_status (pending|tagged), tagged_by (auto|manual), tagged_at, with native PG enum types. Existing rows are backfilled to tagged/auto so nothing changes for current data.
Worker auto-tag stamps tagged/auto on success.
POST /items gains an auto_tag flag. Every enqueue site (single create, bulk create, re-analyze) is vision-guarded: when internal vision is off (or auto_tag=false), the item is left ready + pending for an external tagger instead of queuing a no-op job.
GET /items?tagging_status=pending exposes the external tagger's work queue.
PATCH /items/{id} that fills in a still-pending item's tags marks it tagged with a server-derived origin (manual). This is gated on pending, so it is a one-way transition and never re-stamps an already-tagged item. A tags write-back also projects its attributes onto their first-class columns (pattern, material, style, season, formality, colors, primary_color), keeping the column representation in sync with the tags JSONB — parity with the internal worker, so externally-tagged items remain visible to column-based filters/scoring.
POST /items/{id}/retag resets an item to the pending queue.
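
The lifecycle described in the list above can be sketched in a few lines. This is a minimal in-memory illustration, not the PR's code: the `Item` dataclass and the `apply_tags`/`retag` helpers are hypothetical names, the enums use `(str, Enum)` rather than `StrEnum` for portability, and clearing `tagged_by`/`tagged_at` on retag is an assumption (the description only says the item returns to the pending queue).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class TaggingStatus(str, Enum):
    PENDING = "pending"
    TAGGED = "tagged"


class TaggedBy(str, Enum):
    AUTO = "auto"      # internal worker
    MANUAL = "manual"  # any authenticated API write


@dataclass
class Item:
    # Hypothetical stand-in for a clothing_items row.
    tags: dict = field(default_factory=dict)
    tagging_status: TaggingStatus = TaggingStatus.PENDING
    tagged_by: Optional[TaggedBy] = None
    tagged_at: Optional[datetime] = None


def apply_tags(item: Item, tags: dict, origin: TaggedBy) -> None:
    """Write tags; stamp the origin only while the item is still pending."""
    item.tags = tags
    if item.tagging_status is TaggingStatus.PENDING:
        # One-way transition: an already-tagged item is never re-stamped.
        item.tagging_status = TaggingStatus.TAGGED
        item.tagged_by = origin
        item.tagged_at = datetime.now(timezone.utc)


def retag(item: Item) -> None:
    """Put the item back in the pending queue (assumed to clear the stamp)."""
    item.tagging_status = TaggingStatus.PENDING
    item.tagged_by = None
    item.tagged_at = None
```

In this sketch a second `apply_tags` call on a tagged item still updates `tags` but leaves `tagged_by` and `tagged_at` untouched, which is the "never re-stamps" property the PATCH gate is meant to provide.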

Everything is additive and defaults to current behavior (internal vision on → items auto-tag exactly as before). The motivating consumers are external MCP servers that front this backend for an LLM; the design is provider-agnostic.
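
As a rough illustration of the vision guard and its default, the decision each enqueue site makes could look like the following. `EnqueueDecision` and `decide_enqueue` are hypothetical names introduced for this sketch; only the rule itself (enqueue when internal vision is on and `auto_tag` is not false, otherwise leave the item for an external tagger) comes from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnqueueDecision:
    enqueue_job: bool    # queue the internal auto-tag job
    leave_pending: bool  # leave the item ready + pending for an external tagger


def decide_enqueue(vision_enabled: bool, auto_tag: bool = True) -> EnqueueDecision:
    """Apply the guard shared by single create, bulk create, and re-analyze."""
    if vision_enabled and auto_tag:
        # Default path: behaves exactly as before this PR.
        return EnqueueDecision(enqueue_job=True, leave_pending=False)
    # Vision off or auto_tag=false: skip the no-op job entirely.
    return EnqueueDecision(enqueue_job=False, leave_pending=True)
```

Because `auto_tag` defaults to true here, a caller that never sends the flag gets the pre-existing behavior whenever internal vision is enabled.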

Related Issue

Related to Anyesh#99

Type of Change

New feature (non-breaking change that adds functionality)

Checklist

I have read the CONTRIBUTING guide
My code follows the project's coding style
I have added tests that prove my fix/feature works
New and existing tests pass locally
I have updated documentation as needed
My changes don't introduce new warnings or errors

Testing

Test Environment

Docker Compose
Local development

Tests Performed

New backend/tests/test_item_tagging.py: pending default + auto-tag worker origin; auto_tag/vision enqueue guards; pending work-queue filter; PATCH write-back origin; no body forgery of origin; no re-stamp of an already-tagged item; tags→column projection; retag reset.
Full backend suite green (335 passed).
Migration upgrade and downgrade verified; ruff check and ruff format clean.
End-to-end validated against a live instance (internal vision off) via an MCP client: create→pending, work-queue listing, tag write-back, column population, retag re-queue, and the one-way origin gate.

Additional Notes

Write origin is server-derived, never trusted from the body. tagged_by is set from the authenticated principal: the worker is auto; API writes are manual. tagging_status/tagged_by/tagged_at are intentionally absent from the writable schemas.
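
A minimal sketch of that rule, assuming a plain allowlist stands in for the writable schema. `WRITABLE_FIELDS`, `sanitize_patch`, and `origin_for_principal` are hypothetical names, and the real allowlist is certainly longer than the two fields shown.

```python
# Hypothetical allowlist standing in for the writable schema; the lifecycle
# fields (tagging_status, tagged_by, tagged_at) are deliberately absent.
WRITABLE_FIELDS = frozenset({"name", "tags"})


def sanitize_patch(body: dict) -> dict:
    """Drop every field the client is not allowed to write."""
    return {key: value for key, value in body.items() if key in WRITABLE_FIELDS}


def origin_for_principal(principal_kind: str) -> str:
    """Derive tagged_by from who is writing, never from the request body."""
    # Assumption for this sketch: the internal worker identifies as "worker".
    return "auto" if principal_kind == "worker" else "manual"
```

A body that tries to forge `"tagged_by": "auto"` loses that key in `sanitize_patch`, and the stored origin still comes from `origin_for_principal`.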
Scope note: tagged_by records auto vs manual only. An earlier iteration added a third agent origin (signed JWT actor claim + an AGENT_SYNC_KEY shared-secret mint path); it was dropped because no feature consumes write provenance and tagged_by grants no authority, so the unforgeability machinery wasn't worth the surface. The enum can gain a value later via an additive migration if a feature ever needs to trust provenance.
Additive response change: ItemResponse now includes tagging_status, tagged_by, and tagged_at. No existing field changes shape.
Part of a series exposing the API surface an external agent needs (tagging here; suggestions/pairings and outfit compose to follow). Does not add a built-in MCP server, so Anyesh/wardrowbe#99 ([Feature]: Built-in MCP server support) is referenced, not closed.

Add an explicit tagging lifecycle to clothing items so an external agent can own tagging when internal vision is disabled.

- New StrEnums TaggingStatus (pending|tagged) and TaggedBy (auto|user|agent) backed by native PG enum types.
- New columns tagging_status / tagged_by / tagged_at on clothing_items.
- Reversible migration; existing rows are backfilled to tagged/auto so they never surface in the agent's pending work-queue (behavior unchanged).

Introduce a server-derived "write actor" so resource origin can record whether a write came from a human or an external agent, without ever trusting the request body.

- Optional, signed `actor` claim on the token (absent => user).
- Actor enum + get_current_actor/CurrentActor dependency; it depends on get_current_user so authentication is always enforced before classification.
- create_access_token gains an optional actor argument.

Token minting with actor="agent" is intentionally out of scope; the backend only honors a signed claim if one is present.

…rface

Make item tagging work with internal vision disabled and give the REST API a first-class tagging read/write surface. Default behavior (vision on) is unchanged.

- Worker auto-tag stamps tagged/auto on success.
- POST /items gains an auto_tag flag; all enqueue sites (single, bulk, and re-analyze) are vision-guarded and leave items ready+pending when vision is off instead of queuing a no-op job.
- GET /items?tagging_status=pending exposes the external tagger's work queue.
- PATCH /items/{id} that fills in a still-pending item's tags marks it tagged with a server-derived origin; gated on pending so it is a one-way transition and never rewrites an existing origin.
- POST /items/{id}/retag resets an item to the pending queue.
- ItemResponse exposes tagging_status / tagged_by / tagged_at (additive).

Cover the pending default and the auto-tag worker origin, the auto_tag/vision enqueue guards, the pending work-queue filter, the PATCH write-back origin (user vs agent, no body forgery, no rewrite of an existing origin), and the retag reset.

When a PATCH writes the tags block, project its attributes (pattern, material, style, season, formality, colors, primary_color) onto their first-class columns, matching the internal worker's dual-write. Previously a tags write-back populated only the JSONB, leaving the columns empty so externally-tagged items were invisible to column-based filters/scoring. Explicit top-level fields still take precedence; `fit` remains JSONB-only (it has no column).
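
The projection rule in this commit can be sketched as a pure function over the PATCH body. `PROJECTED_COLUMNS` and `project_tags` are hypothetical names; the column list, the precedence of explicit top-level fields, and `fit` staying JSONB-only are taken from the commit message.

```python
# Columns the commit message lists as projected from the tags block.
PROJECTED_COLUMNS = (
    "pattern", "material", "style", "season",
    "formality", "colors", "primary_color",
)


def project_tags(patch: dict) -> dict:
    """Return the column updates implied by a PATCH body with a tags block."""
    tags = patch.get("tags") or {}
    updates = {}
    for column in PROJECTED_COLUMNS:
        if column in patch:
            # An explicit top-level field takes precedence over the tags block.
            updates[column] = patch[column]
        elif column in tags:
            updates[column] = tags[column]
    # "fit" is not in PROJECTED_COLUMNS, so it stays JSONB-only.
    return updates
```

For example, a body with `tags = {"pattern": "striped", "material": "cotton", "fit": "slim"}` and a top-level `material = "linen"` yields column updates for `pattern` (from the tags) and `material` (the explicit value), and nothing for `fit`.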

POST /auth/sync mints a token carrying the signed actor="agent" claim when the request presents the configured AGENT_SYNC_KEY in the X-Wardrowbe-Agent-Key header (constant-time compared). This completes the agent-attribution path: such a client's writes record tagged_by/source=agent, while identity is still established by the normal sync auth. The key is a server-held deployment secret shared only with trusted agent clients; it is never derived from the request body, so a human user cannot self-elevate. Unset (default) => every synced token stays user-scoped (unchanged).
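
For illustration only, the key check in this commit might look like the sketch below (a later commit in this PR removes the path). The function name echoes the `_resolve_sync_actor` helper mentioned in that later commit, but the signature and body are assumptions; `hmac.compare_digest` is the standard-library constant-time comparison.

```python
import hmac
from typing import Optional


def resolve_sync_actor(configured_key: Optional[str], presented_key: Optional[str]) -> str:
    """Return "agent" only when the presented header matches the server-held key."""
    if not configured_key or presented_key is None:
        # Key unset (the default) or header missing: token stays user-scoped.
        return "user"
    # Constant-time comparison avoids leaking key contents through timing.
    if hmac.compare_digest(configured_key.encode(), presented_key.encode()):
        return "agent"
    return "user"
```

The key is compared against server configuration only, so nothing in the request body can influence the result.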

tagged_by now records only auto (internal worker) vs user (any human/external client editing via the API) — the distinction that actually drives the lifecycle and "needs review" UX. The separate agent value was provenance granularity that no feature consumed, and it carried real surface for no payoff: a signed JWT actor claim, get_current_actor, and an AGENT_SYNC_KEY shared-secret minting path at /auth/sync. Since tagged_by grants no authority, that machinery protected a label nobody reads. Removes the agent enum value, the actor claim + create_access_token actor arg, get_current_actor/Actor/CurrentActor, _resolve_sync_actor + AGENT_SYNC_KEY (config and .env.example), and the related tests. External-agent writes still record user, which is accurate (the actor is the authenticated principal). The enum can regain a value later via an additive migration if a feature ever needs to trust write provenance.

tagged_by is auto | manual. The backend can't distinguish a human editing in the app from an external agent writing via the API — both are authenticated API writes — so "user" over-claimed. "manual" names what is actually known: tags supplied through the API rather than produced by the internal auto-tagger, and it matches the codebase's existing auto-vs-manual vocabulary (OutfitSource.manual).

jansitarski added 8 commits June 19, 2026 19:53

jansitarski wants to merge 8 commits into main from jansitarski/external-item-tagging
