feat: defer item tagging to an external agent (phase 2) #2
Open
jansitarski wants to merge 8 commits into
Add an explicit tagging lifecycle to clothing items so an external agent can own tagging when internal vision is disabled.
- New StrEnums TaggingStatus (pending|tagged) and TaggedBy (auto|user|agent) backed by native PG enum types.
- New columns tagging_status / tagged_by / tagged_at on clothing_items.
- Reversible migration; existing rows are backfilled to tagged/auto so they never surface in the agent's pending work-queue (behavior unchanged).
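As a rough illustration of the enums and the backfill described in this commit, here is a minimal sketch. It uses the `str, Enum` mixin as a stand-in for `StrEnum` so it also runs on older Python versions; `backfill_row` and `pending_queue` are hypothetical helpers, not names from the PR, and the real change is a database migration rather than Python code.

```python
from enum import Enum


class TaggingStatus(str, Enum):
    pending = "pending"
    tagged = "tagged"


class TaggedBy(str, Enum):
    # Values as listed in this commit; later commits in the PR narrow this set.
    auto = "auto"
    user = "user"
    agent = "agent"


def backfill_row(row: dict) -> dict:
    """Hypothetical helper: mimic the migration's backfill for a pre-existing row."""
    return {
        **row,
        "tagging_status": TaggingStatus.tagged.value,
        "tagged_by": TaggedBy.auto.value,
    }


def pending_queue(rows: list) -> list:
    """Hypothetical helper: the rows an external tagger would see as pending work."""
    return [r for r in rows if r["tagging_status"] == TaggingStatus.pending.value]


legacy_rows = [{"id": 1}, {"id": 2}]
migrated = [backfill_row(r) for r in legacy_rows]
```

Because every pre-existing row lands in `tagged`/`auto`, the pending queue stays empty for legacy data, which is the "behavior unchanged" property the commit calls out.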
Introduce a server-derived "write actor" so resource origin can record whether a write came from a human or an external agent, without ever trusting the request body.
- Optional, signed `actor` claim on the token (absent => user).
- Actor enum + get_current_actor/CurrentActor dependency; it depends on get_current_user so authentication is always enforced before classification.
- create_access_token gains an optional actor argument.

Token minting with actor="agent" is intentionally out of scope; the backend only honors a signed claim if one is present.
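The classification rule above (absent claim means user) might look roughly like the sketch below. It assumes the claims mapping has already been signature-verified by the normal authentication step; `actor_from_claims` is a hypothetical name, and falling back to `user` for an unrecognized value is an assumption of this sketch, not something the commit states.

```python
from enum import Enum
from typing import Any, Mapping


class Actor(str, Enum):
    user = "user"
    agent = "agent"


def actor_from_claims(claims: Mapping[str, Any]) -> Actor:
    """Classify the writer from verified token claims; never from the request body."""
    raw = claims.get("actor")
    if raw is None:
        # Absent claim => user, as stated in the commit message.
        return Actor.user
    try:
        return Actor(raw)
    except ValueError:
        # Assumption: an unknown value is treated as the least-privileged actor.
        return Actor.user
```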
…rface
Make item tagging work with internal vision disabled and give the REST API a
first-class tagging read/write surface. Default behavior (vision on) is
unchanged.
- Worker auto-tag stamps tagged/auto on success.
- POST /items gains an auto_tag flag; all enqueue sites (single, bulk, and
re-analyze) are vision-guarded and leave items ready+pending when vision is
off instead of queuing a no-op job.
- GET /items?tagging_status=pending exposes the external tagger's work queue.
- PATCH /items/{id} that fills in a still-pending item's tags marks it tagged
with a server-derived origin; gated on pending so it is a one-way transition
and never rewrites an existing origin.
- POST /items/{id}/retag resets an item to the pending queue.
- ItemResponse exposes tagging_status / tagged_by / tagged_at (additive).
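The lifecycle in the list above can be sketched as a small state machine. This is a simplified in-memory model, not the PR's code: `Item`, `should_enqueue_auto_tag`, `apply_tag_writeback`, and `retag` are hypothetical names, and whether a retag also clears `tagged_by`/`tagged_at` is not stated in the commit, so the sketch leaves them untouched.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Item:
    id: int
    tags: dict = field(default_factory=dict)
    tagging_status: str = "pending"
    tagged_by: Optional[str] = None
    tagged_at: Optional[datetime] = None


def should_enqueue_auto_tag(vision_enabled: bool, auto_tag: bool = True) -> bool:
    """Queue the internal tagging job only when vision is on and the caller wants it."""
    return vision_enabled and auto_tag


def apply_tag_writeback(item: Item, tags: dict, origin: str) -> Item:
    """Apply a PATCH that writes tags; stamp origin only on a still-pending item."""
    was_pending = item.tagging_status == "pending"
    item.tags = tags
    if was_pending and tags:
        item.tagging_status = "tagged"
        item.tagged_by = origin  # server-derived, never read from the body
        item.tagged_at = datetime.now(timezone.utc)
    return item


def retag(item: Item) -> Item:
    """Send the item back to the pending work queue (assumption: origin fields kept)."""
    item.tagging_status = "pending"
    return item
```

Gating the stamp on `pending` is what makes the transition one-way: a second PATCH still updates the tags but cannot change the recorded origin.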
Cover the pending default and the auto-tag worker origin, the auto_tag/vision enqueue guards, the pending work-queue filter, the PATCH write-back origin (user vs agent, no body forgery, no rewrite of an existing origin), and the retag reset.
When a PATCH writes the tags block, project its attributes (pattern, material, style, season, formality, colors, primary_color) onto their first-class columns, matching the internal worker's dual-write. Previously a tags write-back populated only the JSONB, leaving the columns empty so externally-tagged items were invisible to column-based filters/scoring. Explicit top-level fields still take precedence; `fit` remains JSONB-only (it has no column).
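A minimal sketch of that projection, assuming the PATCH body arrives as a plain dict: `column_writes` is a hypothetical helper that returns the column values to persist. Attributes inside `tags` are copied to their columns, explicit top-level fields override them, and `fit` stays only inside the JSONB.

```python
PROJECTED_COLUMNS = (
    "pattern",
    "material",
    "style",
    "season",
    "formality",
    "colors",
    "primary_color",
)


def column_writes(update: dict) -> dict:
    """Compute column values for a PATCH body (hypothetical helper)."""
    writes = {}
    tags = update.get("tags")
    if tags:
        writes["tags"] = tags
        # Project known attributes from the tags block onto first-class columns.
        for name in PROJECTED_COLUMNS:
            if name in tags:
                writes[name] = tags[name]
    # Explicit top-level fields are applied last, so they take precedence.
    for name, value in update.items():
        if name != "tags":
            writes[name] = value
    return writes


patch_body = {
    "tags": {"pattern": "plaid", "material": "wool", "fit": "slim"},
    "material": "cotton",
}
result = column_writes(patch_body)
```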
POST /auth/sync mints a token carrying the signed actor="agent" claim when the request presents the configured AGENT_SYNC_KEY in the X-Wardrowbe-Agent-Key header (constant-time compared). This completes the agent-attribution path: such a client's writes record tagged_by/source=agent, while identity is still established by the normal sync auth. The key is a server-held deployment secret shared only with trusted agent clients; it is never derived from the request body, so a human user cannot self-elevate. Unset (default) => every synced token stays user-scoped (unchanged).
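The constant-time key check could be sketched with the standard library's `hmac.compare_digest`. `resolve_sync_actor_sketch` is a hypothetical function, and treating a wrong or missing key as an ordinary user-scoped sync (rather than rejecting the request) is an assumption of this sketch; the commit only states the behavior for a matching key and for an unset `AGENT_SYNC_KEY`.

```python
import hmac
from typing import Optional


def resolve_sync_actor_sketch(
    presented_key: Optional[str], configured_key: Optional[str]
) -> str:
    """Return "agent" only when the header value matches the server-held secret."""
    if not configured_key or presented_key is None:
        # Unset key (the default) => every synced token stays user-scoped.
        return "user"
    matches = hmac.compare_digest(
        presented_key.encode("utf-8"), configured_key.encode("utf-8")
    )
    # Assumption: a mismatch falls back to user scope instead of failing the sync.
    return "agent" if matches else "user"
```

`compare_digest` avoids leaking how many leading characters matched through timing, which is the usual reason to prefer it over `==` for secrets.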
tagged_by now records only auto (internal worker) vs user (any human/external client editing via the API) — the distinction that actually drives the lifecycle and "needs review" UX. The separate agent value was provenance granularity that no feature consumed, and it carried real surface for no payoff: a signed JWT actor claim, get_current_actor, and an AGENT_SYNC_KEY shared-secret minting path at /auth/sync. Since tagged_by grants no authority, that machinery protected a label nobody reads. Removes the agent enum value, the actor claim + create_access_token actor arg, get_current_actor/Actor/CurrentActor, _resolve_sync_actor + AGENT_SYNC_KEY (config and .env.example), and the related tests. External-agent writes still record user, which is accurate (the actor is the authenticated principal). The enum can regain a value later via an additive migration if a feature ever needs to trust write provenance.
tagged_by is auto | manual. The backend can't distinguish a human editing in the app from an external agent writing via the API — both are authenticated API writes — so "user" over-claimed. "manual" names what is actually known: tags supplied through the API rather than produced by the internal auto-tagger, and it matches the codebase's existing auto-vs-manual vocabulary (OutfitSource.manual).
Description
This is phase 2 of making the internal AI optional so the backend can run with internal generation disabled and defer work to an external agent (e.g. Claude via an MCP server). Phase 1 added the capability switches and `/capabilities`; this phase makes item tagging a first-class, externally-ownable surface.

Today tagging happens implicitly via the internal vision model, and there is no way to (a) leave an item untagged for something else to tag, or (b) record whether tags came from the machine or a person. This adds an explicit tagging lifecycle and a server-derived write origin:
- New columns on `clothing_items`: `tagging_status` (`pending|tagged`), `tagged_by` (`auto|manual`), `tagged_at`, with native PG enum types. Existing rows are backfilled to `tagged`/`auto` so nothing changes for current data.
- Worker auto-tag stamps `tagged`/`auto` on success.
- `POST /items` gains an `auto_tag` flag. Every enqueue site (single create, bulk create, re-analyze) is vision-guarded: when internal vision is off (or `auto_tag=false`), the item is left `ready`+`pending` for an external tagger instead of queuing a no-op job.
- `GET /items?tagging_status=pending` exposes the external tagger's work queue.
- `PATCH /items/{id}` that fills in a still-`pending` item's tags marks it `tagged` with a server-derived origin (`manual`). This is gated on `pending`, so it is a one-way transition and never re-stamps an already-tagged item. A tags write-back also projects its attributes onto their first-class columns (`pattern`, `material`, `style`, `season`, `formality`, `colors`, `primary_color`), keeping the column representation in sync with the `tags` JSONB. This matches the internal worker, so externally-tagged items remain visible to column-based filters/scoring.
- `POST /items/{id}/retag` resets an item to the pending queue.

Everything is additive and defaults to current behavior (internal vision on → items auto-tag exactly as before). The motivating consumers are external MCP servers that front this backend for an LLM; the design is provider-agnostic.
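From the consumer's side, an external tagger would poll the pending queue and write tags back. The sketch below keeps HTTP out of the picture by injecting a `request` callable; that callable, `drain_pending`, and `tag_fn` are hypothetical names, and the assumption that the list endpoint returns a plain list of objects with an `id` field is not confirmed by the PR.

```python
from typing import Any, Callable, Optional

# request(method, path, json_body) -> decoded JSON response (hypothetical transport)
Request = Callable[[str, str, Optional[dict]], Any]


def drain_pending(request: Request, tag_fn: Callable[[dict], dict]) -> list:
    """Tag every pending item once and return the ids that were written back."""
    items = request("GET", "/items?tagging_status=pending", None)
    written = []
    for item in items:
        tags = tag_fn(item)
        # Only tags are sent; the server derives tagging_status and tagged_by itself.
        request("PATCH", f"/items/{item['id']}", {"tags": tags})
        written.append(item["id"])
    return written


# Usage with an in-memory fake transport that records the calls it receives.
calls = []


def fake_request(method: str, path: str, body: Optional[dict]) -> Any:
    calls.append((method, path, body))
    if method == "GET":
        return [{"id": 7}, {"id": 9}]
    return {"ok": True}


done = drain_pending(fake_request, lambda item: {"pattern": "solid"})
```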
Related Issue
Related to Anyesh#99
Type of Change
Checklist
Testing
Test Environment
Tests Performed
- `backend/tests/test_item_tagging.py`: pending default + auto-tag worker origin; `auto_tag`/vision enqueue guards; pending work-queue filter; PATCH write-back origin; no body forgery of origin; no re-stamp of an already-tagged item; tags→column projection; retag reset.
- Test run: 335 passed.
- `ruff check` and `ruff format` clean.

Additional Notes
- `tagged_by` is set from the authenticated principal: the worker is `auto`; API writes are `manual`. `tagging_status` / `tagged_by` / `tagged_at` are intentionally absent from the writable schemas.
- `tagged_by` records `auto` vs `manual` only. An earlier iteration added a third `agent` origin (signed JWT `actor` claim + an `AGENT_SYNC_KEY` shared-secret mint path); it was dropped because no feature consumes write provenance and `tagged_by` grants no authority, so the unforgeability machinery wasn't worth the surface. The enum can gain a value later via an additive migration if a feature ever needs to trust provenance.
- `ItemResponse` now includes `tagging_status`, `tagged_by`, and `tagged_at`. No existing field changes shape.
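To illustrate why a client cannot forge the origin, here is a small sketch of a writable-field filter. `SERVER_OWNED` and `writable_fields` are hypothetical names, and silently dropping the server-owned keys is an assumption; the actual schemas may reject or ignore them in a different way.

```python
SERVER_OWNED = frozenset({"tagging_status", "tagged_by", "tagged_at"})


def writable_fields(body: dict) -> dict:
    """Keep only fields a client may set; lifecycle fields are server-owned."""
    return {k: v for k, v in body.items() if k not in SERVER_OWNED}


# A client that tries to claim the internal worker's origin has that key dropped.
forged = {"tags": {"style": "casual"}, "tagged_by": "auto", "tagging_status": "tagged"}
accepted = writable_fields(forged)
```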