feat: make internal AI optional and add capabilities endpoint #113

Open
jansitarski wants to merge 5 commits into Anyesh:main from jansitarski:jansitarski/disable-internal-llm

Description

Wardrowbe produces item tagging, outfit suggestions, and pairings through its internal LLM, with no way to turn that off or supply those results externally. This is PR 1 of 4 that lets the backend run with internal AI disabled and defer that work to an external agent (e.g. an MCP server such as saya6k/mcp-wardrowbe or jansitarski/wardrowbe-mcp) while every endpoint keeps working. The design is provider-agnostic. Everything is additive and defaults to internal AI on, so existing deployments are unaffected.

What this PR does

  • Capability switches: AI_INTERNAL_ENABLED (master, default true) plus AI_VISION_ENABLED / AI_TEXT_ENABLED, which inherit the master when unset (a false master forces both off). Wired through .env.example, both compose files, and the k8s manifests so the switch actually reaches the containers.
  • Guards: require_internal_ai(...) raises before any client is constructed; get_ai_service() and AIService.__init__ are backstops. With AI off the app boots cleanly with no AI_BASE_URL/AI_API_KEY/models set.
  • Enforced at every consumer: recommendation/pairing services guard first (so deferral is unconditional); the tagging worker skips and leaves items ready (untagged); the worker skips AI init at startup; scheduled notifications skip cleanly; POST /outfits/suggest and POST /pairings/generate/{id} return a typed 503 ("deferred to an external agent") instead of a 500 or a hang.
  • Introspection: GET /api/v1/capabilities reports the effective state (public, no user data); GET /health/ai degrades to disabled when off. Example response with internal AI off:
{ "ai": { "vision": false, "text": false },
  "features": { "external_tagging": true, "external_suggestions": true, "external_pairings": true },
  "version": "1.0.0" }

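The flag resolution described under "Capability switches" can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper names `parse_bool` and `resolve_flags` and the set of accepted truthy strings are assumptions; only the three environment variable names and the inheritance rule come from the description.

```python
from typing import Dict, Mapping, Optional


def parse_bool(raw: Optional[str], default: bool) -> bool:
    """Parse an env-style boolean; unset or blank falls back to the default.

    The accepted truthy spellings are an assumption for this sketch.
    """
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def resolve_flags(env: Mapping[str, str]) -> Dict[str, bool]:
    """Resolve the effective vision/text flags from the three switches."""
    internal = parse_bool(env.get("AI_INTERNAL_ENABLED"), default=True)
    # Unset per-capability switches inherit the master value.
    vision = parse_bool(env.get("AI_VISION_ENABLED"), default=internal)
    text = parse_bool(env.get("AI_TEXT_ENABLED"), default=internal)
    # A false master forces both off, whatever the per-capability values say.
    return {"vision": internal and vision, "text": internal and text}


# Defaults: everything on, matching current behavior.
print(resolve_flags({}))
# Master off wins even if vision is explicitly enabled.
print(resolve_flags({"AI_INTERNAL_ENABLED": "false", "AI_VISION_ENABLED": "true"}))
# Vision off, text inherits the (default true) master.
print(resolve_flags({"AI_VISION_ENABLED": "false"}))
```

The `ai` object in the capabilities response would then simply be the dictionary this function returns.
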
Roadmap — this is PR 1 of 4

Each PR is additive, independently reviewable, and defaults to current behavior:

  1. Foundation (this PR): capability switches, guards enforced at every consumer, GET /capabilities.
  2. Item tagging: tagging_status/tagged_by/tagged_at state, an auto_tag ingest flag plus the enqueue-site guard (so vision-off uploads are never queued), a ?tagging_status=pending work queue, and write-back/retag with server-derived agent identity.
  3. Deferred resources: replace the interim 503 with well-formed deferred responses, and add agent-authoring for suggestions and pairings (source=agent).
  4. Compose: nullable outfit attribute columns (season/formality/palette/notes) and agent-authored outfit composition.

Related Issue

Related to #99 (Built-in MCP server support). This series exposes the API surface any external MCP needs; it does not add a built-in server.

Type of Change

  • New feature (non-breaking change that adds functionality)

Checklist

  • I have read the CONTRIBUTING guide
  • My code follows the project's coding style
  • I have added tests that prove my fix/feature works
  • New and existing tests pass locally
  • I have updated documentation as needed
  • My changes don't introduce new warnings or errors

Testing

Full backend suite — 300 tests, all passing against Postgres + Redis. Coverage includes the flag-resolution truth table; the guards (per-capability, the vision-off/text-on isolation case, and the __init__ backstop); the tagging worker skip and startup skip; /capabilities and /health/ai on and off; the typed 503 for suggest/pairings; and fail-fast ordering (generate_recommendation/generate_pairings defer before any location/item validation).
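To illustrate the guard isolation and fail-fast ordering these tests lock in, here is a minimal sketch. The names `AIDisabledError`, `require_internal_ai`, and `generate_recommendation` come from this PR, but everything else is assumed for illustration: the `Flags` dataclass, passing flags as an explicit argument, and the placeholder validation and return value.

```python
from dataclasses import dataclass
from typing import Optional


class AIDisabledError(RuntimeError):
    """Raised when a requested internal AI capability is switched off."""


@dataclass(frozen=True)
class Flags:
    # Hypothetical container for the effective capability flags.
    vision: bool
    text: bool


def require_internal_ai(capability: str, flags: Flags) -> None:
    """Raise before any client is built if the capability is off."""
    if capability not in ("vision", "text"):
        raise ValueError(f"unknown capability: {capability}")
    if not getattr(flags, capability):
        raise AIDisabledError(f"internal AI '{capability}' is disabled")


def generate_recommendation(flags: Flags, location: Optional[str]) -> str:
    # Guard first, so a disabled capability defers even when inputs are invalid.
    require_internal_ai("text", flags)
    if location is None:
        raise ValueError("location is required")
    return f"outfit for {location}"  # placeholder for the real model call


# Vision on, text off: the text guard fires before location validation runs.
flags = Flags(vision=True, text=False)
try:
    generate_recommendation(flags, location=None)
except AIDisabledError as exc:
    print(f"deferred: {exc}")

# The vision guard is unaffected by text being off.
require_internal_ai("vision", flags)
```

Placing the guard on the first line is what makes deferral unconditional: with the order reversed, a missing location would surface as a validation error and hide the fact that the work belongs to an external agent.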

The two background-removal tests are excluded — they require the optional rembg dependency and are unrelated to this change.

Test Environment

  • Docker Compose
  • Kubernetes
  • Local development

Tests Performed

  • Full backend suite (300 tests) green against Postgres + Redis.
  • Deployed to a k3s homelab cluster and exercised both states:
    • Internal AI on (default): tagging, suggestions, and pairings behave exactly as before.
    • Internal AI off (AI_INTERNAL_ENABLED=false): no AI client is constructed; uploads land ready (untagged); POST /outfits/suggest and POST /pairings/generate/{id} return the deferred 503; GET /api/v1/capabilities reports ai.vision/ai.text as false.
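From the external agent's side, the capabilities response is enough to decide which work to take over. The sketch below assumes the response shape shown in the description and parses it from a string rather than making an HTTP call; the function name `tasks_for_agent` is hypothetical, and the mapping (vision drives tagging, text drives suggestions and pairings) follows the consumers listed in this PR.

```python
import json
from typing import List


def tasks_for_agent(capabilities: dict) -> List[str]:
    """Return the tasks an external agent should drive itself."""
    ai = capabilities.get("ai", {})
    features = capabilities.get("features", {})
    tasks = []
    # If a key is missing, assume internal AI handles it (the default-on behavior).
    if not ai.get("vision", True) and features.get("external_tagging"):
        tasks.append("tagging")
    if not ai.get("text", True):
        if features.get("external_suggestions"):
            tasks.append("suggestions")
        if features.get("external_pairings"):
            tasks.append("pairings")
    return tasks


# Payload as returned by GET /api/v1/capabilities with internal AI off.
payload = json.loads(
    '{"ai": {"vision": false, "text": false},'
    ' "features": {"external_tagging": true, "external_suggestions": true,'
    ' "external_pairings": true},'
    ' "version": "1.0.0"}'
)
print(tasks_for_agent(payload))  # ['tagging', 'suggestions', 'pairings']
```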

Additional Notes

  • The capabilities version is hardcoded to "1.0.0" (matching the existing FastAPI app version) to avoid pulling release-please version wiring into this PR.
  • The features flags are static true: they describe the external-agent surface the series delivers; the dedicated source=agent authoring paths land in PRs 2–3.
  • require_internal_ai("vision") is implemented and tested, but vision is currently enforced through the worker's direct check; the guard is the hook PR 2 uses at the enqueue site.

…abilities

Add AI_INTERNAL_ENABLED (master, default true) plus AI_VISION_ENABLED and
AI_TEXT_ENABLED, which inherit the master when unset; effective flags resolve to
vision = internal and vision, text = internal and text, with a false master
forcing both off. Expose the effective state at GET /api/v1/capabilities so an
external agent can decide whether to drive tagging/suggestions/pairings itself,
and degrade GET /health/ai to "disabled" instead of probing endpoints when AI is
off. Defaults preserve current behavior, so existing deployments are unaffected.

Add AIDisabledError and require_internal_ai(capability), and guard both
get_ai_service() and AIService.__init__ so a client is never constructed against
absent configuration. Per-capability call sites guard with require_internal_ai;
the constructor check is the backstop. Booting with internal AI off and no
AI_BASE_URL/AI_API_KEY/models set now succeeds cleanly.

Enforce the switches everywhere the internal model is reached, so no client is
built and no provider is called when a capability is off:

- recommendation and pairing services guard text first, so deferral is
  unconditional rather than shadowed by location/weather/item validation;
- the tagging worker skips when vision is off and leaves the item ready
  (untagged, usable) instead of marking it as an error;
- the worker skips AI init and the health probe at startup when AI is off;
- the scheduled-notification worker treats AIDisabledError as a clean skip
  rather than a retried failure;
- POST /outfits/suggest and POST /pairings/generate/{id} map AIDisabledError to
  a typed 503 ("deferred to an external agent") instead of a 500 or a hang.
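
The mapping from AIDisabledError to a typed 503 can be sketched without any web framework. This is an illustration only: the wrapper `call_deferrable` and the response body fields (`error`, `detail`) are hypothetical, since the PR text specifies only the status code and the "deferred to an external agent" message.

```python
from typing import Callable, Tuple


class AIDisabledError(RuntimeError):
    """Raised when a requested internal AI capability is switched off."""


def call_deferrable(handler: Callable[[], dict]) -> Tuple[int, dict]:
    """Run a handler and translate AIDisabledError into a typed 503."""
    try:
        return 200, handler()
    except AIDisabledError:
        # A 503 tells the client the work is deferred, not that the server broke.
        body = {"error": "ai_disabled", "detail": "deferred to an external agent"}
        return 503, body


def suggest_with_ai_off() -> dict:
    # Stand-in for the suggest endpoint when the text capability is off.
    raise AIDisabledError("text")


status, body = call_deferrable(suggest_with_ai_off)
print(status, body["detail"])
```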

Cover the flag-resolution truth table; the disabled guard at get_ai_service,
AIService.__init__, and require_internal_ai (including vision-off/text-on
isolation); the tagging worker skip and startup skip; the /capabilities and
/health/ai endpoints on and off; and the typed 503 for suggest/pairings. Also
assert generate_recommendation and generate_pairings defer up front (before
location/item validation), locking the guard ordering.

The AI_INTERNAL_ENABLED / AI_VISION_ENABLED / AI_TEXT_ENABLED switches were
documented in .env.example and the README but never passed to the containers,
so the documented "disable internal AI" path was inert under both docker compose
and Kubernetes. Forward them to the backend and worker in docker-compose.yml and
docker-compose.prod.yml (defaulting to on), and add them to the k8s ConfigMap and
the backend/worker Deployments as optional refs so existing ConfigMaps without
the keys still start. Defaults preserve current behavior.