diff --git a/docs/audits/master-audit-2026-05-22.md b/docs/audits/master-audit-2026-05-22.md new file mode 100644 index 00000000..6ba6fa48 --- /dev/null +++ b/docs/audits/master-audit-2026-05-22.md @@ -0,0 +1,142 @@ +# Agora Master Audit (2026-05-22) + +## Area 1 — Code Quality Findings + +### Finding: Divergent provider naming across contracts increases runtime mismatch risk +- File: `backend/app/services/llm_runtime.py` +- Lines: 20-29, 115-120 +- Risk: high +- Problem: Runtime config aliases map `gemini -> google`, while other API/contracts expose both `gemini` and `google` conventions. +- Evidence: `_PROVIDER_ALIASES` normalizes to `google`; invalid-provider error message only lists `google`, not `gemini`. +- Recommended fix: Introduce one canonical provider enum in shared contract (`ollama|openai|gemini|custom_openai`), and normalize all route and registry layers to it. + +### Finding: Provider metadata hard-codes stale model defaults +- File: `backend/app/services/llm_provider_registry.py` +- Lines: 43, 52 +- Risk: medium +- Problem: Fallback models include legacy names (`o1-preview`, `gemini-1.5-*`) that can drift from actually supported models. +- Evidence: Static list literals are returned to UI with no freshness validation. +- Recommended fix: Move defaults into versioned config and validate with `/models` probe + health-cache. + +### Finding: API error envelope is inconsistent with modern typed error envelope goal +- File: `backend/app/utils/api_responses.py` +- Lines: 149-156 +- Risk: high +- Problem: Error shape uses `{ success:false, error:string, code?:string }`, while frontend proposals and typed handling expect nested error object. +- Evidence: `json_error()` writes a flat `error` string; no provider/retryable metadata. +- Recommended fix: Add v2 envelope (`{error:{code,message,provider,retryable,details}}`) and migrate endpoints incrementally. + +### Finding: Frontend transport layer logs full errors in interceptors +- File: `frontend/src/api/index.ts` +- Lines: 58-59, 84, 109, 129 +- Risk: medium +- Problem: `console.error` is called with potentially rich error objects from backend responses. +- Evidence: raw error objects and wrapped response bodies are printed. +- Recommended fix: Route through centralized redacted logger utility and strip provider/base-url/response payload details in production. + +### Finding: Large LLM client module mixes multiple responsibilities +- File: `backend/app/utils/llm_client.py` +- Lines: 1-1633 +- Risk: medium +- Problem: Single file handles schema flattening, provider heuristics, retries, streaming parsing, tools, and vision. +- Evidence: 1600+ LOC with many concerns and provider-specific branches. +- Recommended fix: Split into provider adapters + shared transport + schema utilities. + +## Area 2 — Simplification / Maintainability + +### Proposal: Extract model context heuristics into one shared module +- Current file/function: `backend/app/utils/llm_client.py` `_MODEL_CONTEXT_HEURISTICS`; comment references duplication in scripts. +- What can be simplified: Centralize heuristics table used by runtime and scripts. +- Why safe: Read-only lookup logic; behavior preserved by importing same constant. +- Before/after: + - Before: duplicated tuple in app and script paths. + - After: `app/llm/context_limits.py` imported by both. +- Follow-up issue: yes. + +### Proposal: Consolidate SSE reconnect constants and behavior +- Current files: `backend/app/api/logs.py`, `backend/app/api/simulation_stream.py`, `backend/app/api/settings.py`, frontend `useEventStream.ts`. +- What can be simplified: Duplicate retry/poll constants and reconnect logic. +- Why safe: Transport-only behavior; can preserve existing defaults. +- Before/after: Shared SSE config helper and explicit reconnect policy docs/tests. +- Follow-up issue: yes. + +### Proposal: Consolidate API error handling to a typed frontend helper +- Current files: `frontend/src/api/index.ts`, `frontend/src/api/envelope.ts`. +- What can be simplified: Duplicate code paths for success:false and Axios error path. +- Why safe: Same thrown `ApiError`, reduced branching. +- Before/after: `parseApiError(responseOrAxiosError)` utility. +- Follow-up issue: yes. + +## Area 3 — Frontend ↔ Backend Alignment + +### Mismatch: Provider naming (`gemini` vs `google`) leaks through UI/backend boundaries +- Frontend reference: `frontend/src/contracts/llmProfileContract.ts` provider literal includes `gemini`. +- Backend reference: `backend/app/services/llm_runtime.py` alias and normalization to `google`; `backend/app/services/llm_provider_registry.py` uses id/type `google`. +- Expected contract: One stable provider identifier across payloads and runtime routing. +- Actual frontend behavior: Can emit `gemini` in profile/runtime data. +- Actual backend behavior: Converts to `google`, but surfaces mixed naming externally. +- Fix: Canonicalize to `gemini` in API contracts and keep internal mapping only at adapter boundary. + +### Mismatch: Error envelope semantics differ between backend and frontend target +- Frontend reference: `frontend/src/api/index.ts` custom wrapper expects flat success/error envelope and builds `ApiError` heuristically. +- Backend reference: `backend/app/utils/api_responses.py` only returns flat error plus optional `code`. +- Expected contract: Consistent `ApiError` envelope with provider/retryable/details. +- Actual frontend behavior: infers timeout/network locally; retryability not authoritative. +- Actual backend behavior: lacks structured provider/retryable fields. +- Fix: Introduce backend error normalizer and generated TS type from backend schema. + +### Mismatch: Abort/cancellation semantics are inconsistent +- Frontend reference: axios client in `frontend/src/api/index.ts` lacks default AbortSignal propagation helper. +- Backend reference: SSE endpoints rely on connection close, but standard POST endpoints do not expose cancellation status semantics. +- Expected contract: Predictable abort behavior for long-running LLM-triggered routes. +- Actual frontend behavior: mixed (SSE reconnect logic exists; request cancel not standardized). +- Actual backend behavior: long-run operations mainly rely on timeout/errors. +- Fix: add cancellable request wrappers on frontend and cancellation-aware API contract for long operations. + +## Area 4 — Provider Integration Audit + +### Provider: Ollama Local +- Files inspected: `backend/app/utils/llm_client.py`, `backend/app/services/llm_runtime.py`, `backend/app/services/llm_provider_registry.py`. +- Current request flow: OpenAI-compatible client, with local profile shortcut allowing missing key and default `api_key='ollama'`. +- Current response flow: mostly `chat.completions`; some direct httpx call path for schema chat. +- Current streaming behavior: force-stream path for ollama cloud/non-local variants in chat and tools code paths. +- Current auth/config behavior: key optional for local URLs, base URL from profile/runtime/registry. +- Failure points: mixed transport paths (OpenAI SDK + raw httpx) produce inconsistent error mapping; model context heuristics may be stale. +- Recommended fixes: dedicated Ollama adapter, normalized errors, consolidated streaming parser, explicit local/cloud mode. + +### Provider: OpenAI +- Files inspected: `backend/app/utils/llm_client.py`, `backend/app/services/llm_runtime.py`. +- Current request flow: OpenAI SDK `chat.completions.create` with retry wrapper. +- Current response flow: post-processing for text/tool calls and strict schema fallback logic. +- Current streaming behavior: conditional stream in some branches. +- Current auth/config behavior: resolved via runtime config, active provider store, or env fallback. +- Failure points: key-source/base-url precedence is complex and hard to reason about; strict-schema fallbacks are embedded and not unit-separated. +- Recommended fixes: deterministic precedence matrix + per-provider config validator + isolated schema transformer unit tests. + +### Provider: Google Gemini +- Files inspected: `backend/app/services/llm_runtime.py`, `backend/app/services/llm_provider_registry.py`, `backend/app/utils/llm_client.py`. +- Current request flow: routed through OpenAI-compatible Gemini endpoint. +- Current response flow: shares OpenAI parsing branch. +- Current streaming behavior: inherited from shared chat completion path. +- Current auth/config behavior: prefix-check for `AIzaSy` when provider is `google`. +- Failure points: nomenclature mismatch (`gemini` vs `google`) + OpenAI-compat behavior hidden behind generic path can mask provider-specific errors. +- Recommended fixes: explicit Gemini adapter wrapper with provider-specific error codes and test fixtures. + +## Unified Provider Abstraction (proposed) +1. Add `backend/app/services/llm/providers/base.py` interface with `complete`, `stream`, `validate_config`. +2. Add adapters: `ollama_provider.py`, `openai_provider.py`, `gemini_provider.py`. +3. Keep one normalized request/response model in `backend/app/contracts/llm_provider_runtime_contract.py`. +4. Centralize error normalization (provider, code, retryable, status, details). +5. Keep retry policy in one layer (transport), not in each adapter. +6. Add integration tests with mocked provider responses for non-streaming, streaming, timeout, 401, and malformed chunk. + +## Prioritized Implementation Plan +1. **Critical:** unify provider naming + error envelope contract (high risk reduction). +2. **High:** extract provider adapters from `llm_client.py` and freeze precedence rules for key/base_url/model. +3. **High:** frontend/backend cancellation + retry semantics alignment. +4. **Medium:** de-duplicate context-limit heuristics and SSE reconnect config. +5. **Medium:** model-catalog freshness checks and stale fallback model cleanup. +6. **Medium:** expand regression tests around provider-specific failures. + +## Suggested follow-up issue set (created as local issue drafts) +See `docs/issues/` files prefixed `2026-05-22-`. diff --git a/docs/issues/2026-05-22-error-envelope-v2.md b/docs/issues/2026-05-22-error-envelope-v2.md new file mode 100644 index 00000000..50aab642 --- /dev/null +++ b/docs/issues/2026-05-22-error-envelope-v2.md @@ -0,0 +1,21 @@ +# [Jules][bug][backend][frontend] Introduce normalized API error envelope v2 and migrate key routes + +Master issue: Master audit 2026-05-22. + +## Problem +Backend emits flat error envelopes; frontend infers retryability/timeouts heuristically. + +## Evidence +- `backend/app/utils/api_responses.py` lines 149-156 +- `frontend/src/api/index.ts` lines 71-131 + +## Scope +- Add `{ error: { code, message, provider?, retryable?, details? } }` envelope +- Keep temporary compatibility for old clients +- Update frontend parser to trust backend error metadata + +## Acceptance Criteria +- [ ] Backend helper emits v2 envelope for migrated routes +- [ ] Frontend consumes v2 and preserves UX messaging +- [ ] Timeout/retryable/provider fields set for provider errors +- [ ] API contract docs updated diff --git a/docs/issues/2026-05-22-llm-client-split.md b/docs/issues/2026-05-22-llm-client-split.md new file mode 100644 index 00000000..cfdbf49e --- /dev/null +++ b/docs/issues/2026-05-22-llm-client-split.md @@ -0,0 +1,20 @@ +# [Jules][refactor][provider][backend][testing] Split monolithic `llm_client.py` into provider adapters + shared transport + +Master issue: Master audit 2026-05-22. + +## Problem +`llm_client.py` mixes schema transforms, retries, provider logic, streaming parsing, and tool-call handling. + +## Evidence +- `backend/app/utils/llm_client.py` lines 1-1633 + +## Scope +- Introduce adapter interface and concrete providers (Ollama/OpenAI/Gemini) +- Move retry and error normalization to shared transport layer +- Keep behavior parity via tests + +## Acceptance Criteria +- [ ] `llm_client.py` responsibilities reduced substantially +- [ ] Provider adapters have dedicated unit tests +- [ ] Streaming and non-streaming paths validated for each provider +- [ ] No regression in existing LLM client tests diff --git a/docs/issues/2026-05-22-provider-naming-unification.md b/docs/issues/2026-05-22-provider-naming-unification.md new file mode 100644 index 00000000..e551796f --- /dev/null +++ b/docs/issues/2026-05-22-provider-naming-unification.md @@ -0,0 +1,22 @@ +# [Jules][provider][backend][frontend] Unify provider naming (`gemini` vs `google`) across contracts and runtime + +Master issue: Master audit 2026-05-22. + +## Problem +Provider naming is inconsistent between frontend and backend runtime/provider registry. + +## Evidence +- `backend/app/services/llm_runtime.py` lines 20-29, 115-120 +- `backend/app/services/llm_provider_registry.py` lines 46-49 +- `frontend/src/contracts/llmProfileContract.ts` provider literal set + +## Scope +- Canonical provider enum in shared contract +- Backward-compatible input aliasing +- API responses standardized to canonical names + +## Acceptance Criteria +- [ ] One canonical provider name for Gemini across API request/response payloads +- [ ] Legacy aliases still accepted for one release window +- [ ] Frontend Zod contracts updated and passing +- [ ] Backend contract tests added/updated diff --git a/docs/issues/2026-05-22-sse-retry-config-centralization.md b/docs/issues/2026-05-22-sse-retry-config-centralization.md new file mode 100644 index 00000000..8f2c6a3c --- /dev/null +++ b/docs/issues/2026-05-22-sse-retry-config-centralization.md @@ -0,0 +1,22 @@ +# [Jules][refactor][frontend][backend] Centralize SSE reconnect/retry configuration and behavior + +Master issue: Master audit 2026-05-22. + +## Problem +Retry/poll/reconnect behavior is duplicated across backend SSE endpoints and frontend stream composables. + +## Evidence +- `backend/app/api/logs.py` (`_SSE_RETRY_MS` and stream loop) +- `backend/app/api/simulation_stream.py` (`_SSE_RETRY_MS`, `_POLL_INTERVAL`) +- `backend/app/api/settings.py` (`_SSE_RETRY_MS`) +- `frontend/src/composables/useEventStream.ts` lines 22-25, 97-112 + +## Scope +- Shared backend SSE config module +- Frontend stream policy module +- Contract tests for reconnect semantics + +## Acceptance Criteria +- [ ] One backend source of truth for SSE retry constants +- [ ] Frontend reconnect policy is configurable and documented +- [ ] SSE behavior tests cover reconnect + stale ticket flows