From f1bffb6c365a90378d6f1837bda516ee6ee1fea6 Mon Sep 17 00:00:00 2001
From: Alan Yang <79916645+alan5543@users.noreply.github.com>
Date: Mon, 18 May 2026 15:28:31 -0400
Subject: [PATCH 1/2] fix(web): dismissable stale sync-failure banner (#189)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* chore(bootstrap): add votee EE overlay + private-repo CodeQL workaround

Bootstraps votee/beever-atlas-ee on top of the OSS fork (Beever-AI/beever-atlas main @ 947b17d). The full 575-commit OSS history is preserved as the base of this repo; this single commit layers on the enterprise IP that lived only in votee's previous fork.

## Votee-only paths added (additive overlay)

.claude/                                      OpenSpec slash-commands + skills
.github/workflows/deploy.yml                  AWS EC2 production deploy
.github/workflows/trigger-docs-rebuild.yml    docs-site dispatch
docs/Beever_Atlas_Feature_Spec.docx           feature spec
docs/qa/                                      QA + tool-audit notes
docs/v1-archive/                              v1 architecture archive
docs/v2/                                      v2 architecture docs
openspec/                                     7 change proposals (m1, m2, RES-177, multi-workspace, messages-tab, OSS CLA, ingestion)
scripts/deploy/                               AWS EC2 bootstrap/provision

## CodeQL workflow patch

.github/workflows/codeql.yml                  add `upload: never`

votee/beever-atlas-ee is a PRIVATE repo without GitHub Advanced Security, so the OSS-default SARIF upload fails with "Code Security must be enabled for this repository" and blocks CI. The CodeQL queries still run cleanly; only the upload is skipped. Remove `upload: never` if/when GHAS is purchased for this repo.

## Votee paths intentionally DROPPED (superseded by OSS)

bot/.eslintrc.json                            -> bot/eslint.config.js (flat)
web/.../graph/GraphTab.tsx                    -> GraphCanvas + GraphFilters
web/.../settings/AgentModelRow.tsx            -> AgentModelsTab.tsx
web/.../settings/AgentModelSettings.tsx       -> AgentModelsTab.tsx
web/src/hooks/useAgentModels.ts               -> AgentModelsTab.tsx

These had been refactored upstream in OSS (LiteLLM endpoint-catalog work + graph component split).
Keeping votee's older variants would re-introduce diverged code paths.

## Origin / upstream relationship

upstream   https://github.com/Beever-AI/beever-atlas   (OSS, public)
origin     https://github.com/votee/beever-atlas-ee    (this repo, private)

To sync future OSS changes:

    git fetch upstream
    git merge upstream/main   # resolve conflicts in overlay paths if any
    git push origin main

## AWS deployment

The existing AWS EC2 instance at 18-118-108-191.nip.io runs the OLD votee/beever-atlas. A parallel deployment of this -ee repo will be stood up; once validated, the old deployment is retired.

Constraint: votee/beever-atlas-ee is private and has no GHAS
Constraint: must preserve OSS commit history for upstream-merge workflow
Rejected: hand-built merge of OSS into votee/beever-atlas | unrelated histories, produced PR #81 (closed) — was unmaintainable for ongoing sync
Confidence: high
Scope-risk: narrow — single bootstrap commit on fresh repo
Directive: future OSS syncs use `git merge upstream/main`, NOT hand-built commit-tree; if conflicts touch overlay paths, prefer keeping the votee overlay version unless OSS has actively superseded the file

Co-Authored-By: Claude Opus 4.7 (1M context)

* chore(deploy): silence auto-deploy + parameterize NAME for parallel EE deploy

Two fixes that unblock the next step (provisioning a NEW EC2 instance for the EE deployment, side-by-side with the existing votee/beever-atlas one):

## .github/workflows/deploy.yml — disable push trigger

The deploy job fired on every push and failed at "Setup SSH" because the EC2_SSH_KEY + EC2_HOST repo secrets aren't set yet on the fresh ee repo. Restrict to workflow_dispatch only until those secrets are configured; restore `push: branches: [main]` once the new EC2 is up and secrets land.

## scripts/deploy/*.sh — NAME-overridable

Hardcoded `beever-atlas` as the AWS resource prefix would collide with the existing votee/beever-atlas deployment (same keypair + security group names in the same AWS account).
Parameterized via:

    NAME="${NAME:-beever-atlas}"   # default keeps legacy behaviour
    KEY_NAME="${NAME}-key"
    SG_NAME="${NAME}-sg"

So the EE side-by-side deploy is:

    NAME=beever-atlas-ee bash scripts/deploy/deploy.sh

The old votee deploy keeps working as before (default NAME unchanged). Server-side path `/opt/beever-atlas-v2` left as-is — there's only one app per EC2 instance, so no collision.

Constraint: must not break the legacy votee/beever-atlas deploy
Confidence: high
Scope-risk: narrow — env-var override with backwards-compatible default
Directive: when retiring votee/beever-atlas, also run `NAME=beever-atlas bash scripts/deploy/destroy.sh` to clean up the legacy AWS resources

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(deploy): use Docker Hub mirror for Weaviate (cr.weaviate.io is offline)

The Weaviate-hosted container registry cr.weaviate.io has been unreachable from us-east-2 (and elsewhere) since at least 2026-05-14, blocking the initial EE deployment. The image content is identical on Docker Hub at `semitechnologies/weaviate:1.28.0` — switching the registry prefix unblocks the deploy.

The original SHA256 digest (`58b576d3...`) was pinned to the cr.weaviate.io manifest. Docker Hub serves a different manifest digest for the same content, so the pin is dropped for now. Restore the pinned cr.weaviate.io form once that registry is back up.
Constraint: cr.weaviate.io DNS resolves but all 3 IPs (54.244.195.224, 34.213.189.139, 52.33.86.107) return "connection refused" on :443
Rejected: wait for upstream registry | indefinite outage, blocks EE bring-up
Rejected: copy the image to a private ECR | overkill for an internal demo
Confidence: high — same image, same tag, just different registry
Scope-risk: narrow — single image, only affects the Weaviate service
Directive: when cr.weaviate.io is back, restore the original `cr.weaviate.io/semitechnologies/weaviate:1.28.0@sha256:...` line to preserve digest-pin defense-in-depth

Co-Authored-By: Claude Opus 4.7 (1M context)

* ci(deploy): re-enable push trigger now that EE EC2 + secrets are live

The new EE EC2 instance (3.134.230.101 at https://3-134-230-101.nip.io) is provisioned, the docker-compose stack is healthy, and the EC2_HOST + EC2_SSH_KEY secrets are configured on the votee/beever-atlas-ee repo. Restoring `on: push: branches: [main]` so subsequent pushes deploy automatically. This commit itself exercises the pipeline end-to-end.

Confidence: high — manual deploy already verified ALL_HEALTHY
Scope-risk: narrow — single trigger restore

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(deploy): restore cr.weaviate.io digest pin for Supply Chain CI

The earlier `157c389` ("use Docker Hub mirror for Weaviate") dropped the SHA digest pin while routing around a cr.weaviate.io outage on 2026-05-14. That triggered the CI / Supply Chain (digest pinning) job to fail every push: it rejects any `image:` reference not pinned via `@sha256:`.

cr.weaviate.io is back online as of 2026-05-16, and a probe of the Docker Hub `semitechnologies/weaviate:1.28.0` multi-arch manifest shows it shares the exact same digest the cr.weaviate.io image was originally pinned to (`sha256:58b576d3...`). So restoring the OSS-aligned line is strictly safe — same image, same digest, just a registry that the supply-chain check accepts.
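
The digest-pinning rule described above can be illustrated with a minimal sketch. The actual Supply Chain job is not shown in this message, so `unpinned_images` and the sample compose text are hypothetical, and the digest is a dummy placeholder rather than the real Weaviate digest.

```python
import re

# Hypothetical stand-in for the CI rule: every `image:` line must carry `@sha256:`.
IMAGE_LINE = re.compile(r"^\s*image:\s*(\S+)")


def unpinned_images(compose_text: str) -> list:
    """Return every `image:` reference that lacks an `@sha256:` digest."""
    unpinned = []
    for line in compose_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match and "@sha256:" not in match.group(1):
            unpinned.append(match.group(1))
    return unpinned


# Dummy 64-character digest, used only so the sample has one pinned reference.
PINNED = "cr.weaviate.io/semitechnologies/weaviate:1.28.0@sha256:" + "0" * 64

sample = f"""
services:
  weaviate:
    image: {PINNED}
  mirror:
    image: semitechnologies/weaviate:1.28.0
"""

print(unpinned_images(sample))
```

Under this sketch the tag-only Docker Hub reference is the one reported, which matches the failure mode the commit describes.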
Constraint: Supply Chain job requires every `image:` to carry an `@sha256:` digest
Confidence: high — verified the digest matches across both registries
Scope-risk: narrow — single line in docker-compose.yml

Co-Authored-By: Claude Opus 4.7 (1M context)

* test(web): de-flake AgentModelsTab toast assertion on slow CI runners

The "clicking a preset card calls applyPreset and shows the diff toast" test was failing on ee CI with:

    TestingLibraryElementError: Unable to find an element with the text: /Applied 'Gemini balanced' — 1 updated/

Root cause: useToast auto-dismisses info toasts after INFO_TTL_MS=2500ms. On slow CI runners the test's initial render + fetch resolution can push the first waitFor poll past the 2500ms window, so the toast has already self-dismissed when the assertion runs.

Fix: query by role="status" (ToastViewport wraps each toast in an element with that role), then regex-match textContent. This is more robust:

- Doesn't depend on textContent being a single text node
- Re-checks each poll so it tolerates the brief render → dismiss flicker
- Survives whitespace / em-dash formatting drift
- 50ms interval ensures we catch the toast inside its 2500ms TTL window

No runtime / component changes. Test-only fix.

Constraint: don't bump INFO_TTL_MS or the on-screen toast lingers longer for real users
Confidence: high — the role + textContent pattern is the testing-library recommended workaround for "text broken up by multiple elements"
Scope-risk: narrow — single test assertion swap

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(web): route chat-image proxy through /api/files/proxy instead of /api/media/proxy

Chat answer images (Mattermost / Slack-bot-gated files) were rendering as broken images in the Ask tab. Clicking through landed on Mattermost's 401 page (`api.context.session_expired.app_error`).

Root cause: `mediaProxyPathFor()` returned `/api/media/proxy?url=...`, which is the signed-loader-token endpoint. On this deployment `LOADER_TOKEN_SECRET` is empty, so the signed-token validator falls back to a path that doesn't resolve the platform_connection's bot credential. Backend returns `502 Upstream returned 401`. The image element in `MarkdownImage` then errors and the link wrapper opens the raw Mattermost URL — which the browser has no Mattermost cookie for, so it 401s a second time.

Backend has a second, working proxy endpoint at `/api/files/proxy` which runs through the `BEEVER_LOADER_RAW_KEY_FALLBACK=true` raw-key path. That endpoint is verified working (HTTP 200, returns file bytes) and is already used by the wiki view (`filesProxyPathFor`).

Switch `mediaProxyPathFor` to route to `/api/files/proxy` so chat-side callers (MarkdownImage, SourceCard, InlineMedia, proxiedMediaUrl) reuse the proven endpoint. Wiki-side callers (`filesProxyPathFor`) unchanged.
Verified by direct probe on the live EE deployment:

    GET /api/files/proxy?url= -> 200, 38 MB MP4 body
    GET /api/media/proxy?url= -> 502, "Upstream returned 401"

EE-side patch only — upstream OSS still emits `/api/media/proxy`. Once the signed-token credential resolver is wired through the Mattermost adapter, this can revert to the original endpoint.

Constraint: don't touch the backend — the working endpoint already exists, just route the frontend to it
Confidence: high — direct curl probe of both endpoints proves the swap
Scope-risk: narrow — single helper function + matching unit test, no rendering logic changed

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(bot): self-heal from chat-adapter-mattermost leak via scheduled recycle + restart policy (RES-286) (#185) (#21)

* fix(bot): self-heal from chat-adapter-mattermost leak via scheduled recycle + restart policy (RES-286)

The Mattermost connection on the live EE deployment kept "going down" because the bot container was OOM-killed and never restarted. Two compounding issues:

1. `chat-adapter-mattermost@1.1.2` leaks ~37 MB/h via its long-lived WebSocket handler closures and an unbounded `mattermostUserCache` in bridge.ts. After ~19 h the bot's RSS crosses the host's free-memory headroom (700 MiB on a t4g.medium) and the kernel OOM-killer selects it as the highest-RSS process.
2. The bot service had `restart: no` and no `mem_limit`, so the kill was silent and required manual `docker start` every time.

Structural fix (this commit):

- `bot/src/chat-manager.ts` — new `scheduleAdapterRecycle(intervalMs)` / `stopAdapterRecycle()` methods. The timer calls `rebuild()` every 6 h to drop accumulated adapter state. Re-entry is guarded via `transitioning`, and the existing `WebhookBuffer` covers the ~1 s rebuild window so callers see no degradation. Six unit tests cover happy path, no-adapter early-return, transitioning guard, disable (interval ≤ 0), idempotent re-schedule, and stop.
- `bot/src/bridge.ts` — export `clearMattermostUserCache()` and hook it into the existing `onRebuild` listener alongside `clearBridgeCache()`. The module-level Map at bridge.ts:1585 had no eviction path; it's now cleared on every recycle and on adapter re-registration.
- `bot/src/index.ts` — wire `chatManager.scheduleAdapterRecycle()` from the `ADAPTER_RECYCLE_INTERVAL_MS` env (default 6 h, 0 disables). Enrich `/health` with `memory: process.memoryUsage()`, `uptime_seconds`, and return 503 while transitioning so the Docker healthcheck reflects real liveness. Compress startup retry delays from `[1,2,4,8,16]s` (31 s worst case) to `[0.5,1,2,4,4]s` (11.5 s) so restart blast radius is shorter.

Safety net (compose):

- `docker-compose.yml` — `restart: unless-stopped`, `mem_limit: 768m`, `memswap_limit: 768m`, `NODE_OPTIONS=--max-old-space-size=512` (so V8 GCs aggressively before the cgroup line, leaving room for graceful shutdown), and `start_period: 45s` on the existing healthcheck to accommodate the startup retry window. Even if the leak ever exceeds 2× expectation (~440 MB peak inside the 6 h cycle), the bot self-restarts in seconds.

Feature gap (the QA-reported `tech-studio` doesn't appear):

- `web/src/components/settings/ManageChannelsDialog.tsx` — Refresh button in the dialog header wired to the existing `useConnectionChannels.refetch`. With this change and a live bot, a user adding the bot to a new MM channel sees it surface within seconds without operator intervention.

Bot tests: 173 / 173 pass. Web tests: 531 / 531 pass.
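
The memory and retry figures quoted above can be re-derived with a short worked calculation. This is only a sanity check on the numbers in the message: the constant names are illustrative, and the sums count leaked growth alone because the message does not state the bot's baseline RSS.

```python
LEAK_MB_PER_HOUR = 37   # leak rate quoted for chat-adapter-mattermost@1.1.2
HOURS_TO_OOM = 19       # approximate time to the OOM kill before the fix
RECYCLE_HOURS = 6       # default recycle interval
MEM_LIMIT_MB = 768      # compose mem_limit

growth_before_oom = LEAK_MB_PER_HOUR * HOURS_TO_OOM   # roughly the 700 MiB headroom
growth_per_cycle = LEAK_MB_PER_HOUR * RECYCLE_HOURS   # leaked growth in one 6 h cycle
worst_case_cycle = 2 * growth_per_cycle               # the "2x expectation" case

old_delays_s = [1, 2, 4, 8, 16]
new_delays_s = [0.5, 1, 2, 4, 4]

print(growth_before_oom, growth_per_cycle, worst_case_cycle)
print(sum(old_delays_s), sum(new_delays_s))
print(worst_case_cycle < MEM_LIMIT_MB)
```

The doubled per-cycle growth stays below the 768 MB limit, which is the margin the compose safety net relies on.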
Constraint: t4g.medium has only 4 GiB RAM, no swap, and 6 hot containers
Constraint: chat-adapter-mattermost@1.1.2 is the latest published version (npm versions confirms) — no upstream upgrade available
Rejected: Forking chat-adapter-mattermost to patch the leak | high maintenance drag for a pilot; scheduled recycle gets us 95% of the value
Rejected: Bot WS subscription to `channel_member_joined` events | the adapter doesn't surface those events publicly; requires forking
Rejected: `restart: on-failure` | won't restart after clean SIGTERM during deploys
Rejected: REST enumeration of non-member channels | needs `list_team_channels` permission on the customer's MM bot; defer until requested
Directive: Do NOT bump `--max-old-space-size` above 512 unless `mem_limit` moves in lockstep — cgroup SIGKILL pre-empts V8 GC otherwise
Directive: If `chat-adapter-mattermost` ever ships >1.1.2, re-evaluate whether the scheduled recycle is still needed
Confidence: high
Scope-risk: moderate
Not-tested: 24h drift test against a real Mattermost workspace (requires the live EE deployment; verified via OOM math + unit tests only)

* fix(bot): broaden cross-platform leak protection + address review feedback (RES-286)

Round-2 changes after OMC code-reviewer + security-reviewer passes against the initial RES-286 fix. All three reviewer findings addressed.

**Cross-platform leak protection (audit found 4 more module-level caches):**

- `bot/src/bridge.ts` — `clearUserProfileCache()` exported. Same shape as `clearMattermostUserCache` but covers the cross-platform user-profile Map at line 339. Wired into the existing `onRebuild` listener.
- `bot/src/bridge.ts` — `pruneStaleTeamsConversations(maxAgeMs)` / `pruneStaleTelegramChats(maxAgeMs)` exported. These two registries are the ONLY source of truth for `listChannels()` on those platforms (populated from inbound webhooks; no list API exists), so we age out entries older than 30 days on every recycle rather than wholesale-clearing.
  Logs the prune count when non-zero.
- `bot/src/bridge.ts` — `onRebuild` listener now: clearBridgeCache + clearMattermostUserCache + clearUserProfileCache + pruneStaleTeams + pruneStaleTelegram. Comment explains why some clear and others prune.

**Reviewer feedback addressed:**

- `bot/src/chat-manager.ts` — circuit breaker on `scheduleAdapterRecycle`. After `RECYCLE_FAILURE_LIMIT` (3) consecutive failures the timer halts and logs a structured error pointing to investigation. A successful rebuild resets the counter, so flaky-but-recovering states don't trip it. Addresses code-reviewer MEDIUM #1.
- `bot/src/index.ts` — `ADAPTER_RECYCLE_INTERVAL_MS` now has a 60-s floor for positive values; `=== 0` still disables. Prevents a misconfigured env from thrashing the websocket. Addresses security-reviewer LOW #2.
- `bot/src/index.ts` — `/health` endpoint annotated with a SECURITY block documenting that it's bound to 127.0.0.1 and exposes process metrics; flags the gate to add if the port is ever exposed publicly. Addresses security-reviewer LOW #3.
- `src/beever_atlas/api/connections.py` — `@limiter.limit("20/minute")` + `request: Request` on `list_connection_channels`. Prevents a runaway Refresh-button client from exhausting the bot token's upstream rate limit on Mattermost/Slack APIs. Addresses security-reviewer MEDIUM #1.

**Tests:**

- `bot/src/chat-manager.test.ts` — 4 new tests covering: swallow-rebuild-errors-without-stopping, circuit breaker trips at RECYCLE_FAILURE_LIMIT, counter resets on success (so flaky rebuilds don't trip).
- `bot/src/bridge.caches.test.ts` — new file covering prune semantics for Teams + Telegram registries (empty registry returns 0, fresh entries not pruned within window, empty buckets removed after prune), and idempotency of the cache clears.

Bot: 183/183 tests pass (up from 173). Python: 86/86 connection tests pass.
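
The circuit-breaker behaviour described above (halt after three consecutive failures, reset on any success) can be sketched as follows. The real implementation lives in `bot/src/chat-manager.ts` and is not shown in this message; `RecycleBreaker` and `flaky_rebuild` are hypothetical names, and Python is used here only for illustration.

```python
from typing import Callable

RECYCLE_FAILURE_LIMIT = 3  # limit quoted in the message


class RecycleBreaker:
    """Hypothetical helper that counts consecutive rebuild failures."""

    def __init__(self, rebuild: Callable[[], None], limit: int = RECYCLE_FAILURE_LIMIT):
        self._rebuild = rebuild
        self._limit = limit
        self.consecutive_failures = 0
        self.halted = False

    def tick(self) -> None:
        """Run one scheduled recycle attempt; errors are swallowed, not re-raised."""
        if self.halted:
            return
        try:
            self._rebuild()
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self._limit:
                # The described code would also stop the timer and log here.
                self.halted = True
        else:
            # Any success clears the streak, so flaky-but-recovering runs never trip.
            self.consecutive_failures = 0


# Simulated outcomes: two failures, one success, then three failures in a row.
outcomes = iter([False, False, True, False, False, False])


def flaky_rebuild() -> None:
    if not next(outcomes):
        raise RuntimeError("rebuild failed")


breaker = RecycleBreaker(flaky_rebuild)
for _ in range(6):
    breaker.tick()
print(breaker.halted, breaker.consecutive_failures)
```

In this run the first two failures do not trip the breaker because the success in between resets the counter; only the final streak of three halts it.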
Constraint: Teams + Telegram `listChannels()` rely on registry entries populated by webhooks — wholesale clearing would empty the sidebar until each conversation posts again
Rejected: Wholesale-clear Teams/Telegram registries on recycle | breaks `listChannels` until users re-engage; prune is the correct shape
Rejected: Make `/health` auth-gated now | bot port is 127.0.0.1 only, so info disclosure is theoretical; SECURITY comment is the right cost/benefit
Rejected: Stricter rate limit than 20/minute | manual Refresh + page-load spikes could legitimately hit 5-10/minute per user; 20 leaves margin without being a DoS lever
Directive: `RECYCLE_FAILURE_LIMIT` should stay at 3 — set higher and the log noise the breaker exists to prevent comes back; set lower and a single network blip can disable recycle for the whole bot lifetime
Confidence: high
Scope-risk: narrow
Not-tested: long-running 24h drift with Teams + Telegram traffic (the prune path is exercised only on a real recycle every 6 h)

---------

Co-authored-by: Claude Opus 4.7 (1M context)

* fix(ux+observability): TTL ring buffer, orphan platform fallback, Mermaid cancellation guard (#186) (#22)

Three small, independent QA-reported bugs bundled into one PR. Each was investigated by a separate OMC agent and the consolidated plan was stress-tested by an OMC critic before any code was written (the critic caught a missed sibling fix-site and an underspecified cancellation pattern, both addressed below).

**RES-284 — Agent Models tab "many errors" after sync**

The bug: stale `ok=false` entries from a previous failure burst stayed in the in-process LiteLLM ring buffer (`deque(maxlen=50)`) until 50 newer calls evicted them or the process restarted — sometimes painting the Agent Models tab red for hours after the underlying cause was fixed.

Fix:

- `src/beever_atlas/services/llm_call_log.py` — entries are now stored as `(time.time(), call)` tuples so `snapshot()` can age them out.
  Default TTL is 30 min (matches typical "investigate, fix, verify" loop). Pass a negative value to disable filtering for operator debugging.
- `src/beever_atlas/api/llm_debug.py` — the debug endpoint exposes `?max_age_seconds=N` so operators can inspect older entries on demand.
- Server-only filter (no client-side double-filter) per the critic's feedback — using two different time sources (Python `time.time()` vs JS `Date.parse`) for the same TTL would be a maintenance trap.
- 4 new tests in `tests/services/test_llm_call_log.py`: default-TTL keeps recent, TTL filters old, boundary (just inside vs just outside), negative TTL disables filtering.

**RES-287/4a — "Ungrouped (Discord)" mislabel on Mattermost workspace**

The bug: orphan channels (no `connection_id`, e.g. CSV-imported or pre-connection-model legacy) used to fall back to `platform="discord"` server-side. The FE sidebar then rendered the Discord icon next to "Ungrouped" on a Mattermost workspace.

Fix:

- `src/beever_atlas/api/channels.py:514, 667` — `or "discord"` → `or "unknown"`. `PlatformIcon` already falls back to the neutral `MessageSquare` icon for unknown platforms.
- `src/beever_atlas/agents/tools/_citation_decorator.py:188` (caught by the critic — missed by the original investigation) — same fix for the sibling site that derived `slack:channel:ts:fact_id` native-identity strings for orphan channel-message items. The permalink resolver already returns `None` for unknown platforms, so no broken Slack URLs get constructed.
- 4 new tests in `tests/test_orphan_platform_fallback.py` covering the detector contract (returns None on arbitrary strings, still works for legit Slack/Discord shapes) and both fallback sites.

**RES-287/4b — Stacked "Syntax error in text" tiles on wiki page**

The bug: when LLM-generated wiki content contained a malformed mermaid block, the page rendered multiple identical "Diagram could not be rendered" fallback tiles.

Root cause: React StrictMode double-invokes `useEffect` in dev.
The wiki `MermaidBlock` had NO cleanup function, so both mount cycles raced two concurrent `mermaid.render()` coroutines against the singleton mermaid instance, the second of which produced an error SVG → `setError(...)` → fallback tile per block.

Fix:

- `web/src/components/wiki/MermaidBlock.tsx` — adopt the canonical cancellation pattern from the sibling `channel/MermaidBlock.tsx` (lines 129-184): `let cancelled` flag, `setTimeout` debounce, `clearTimeout` cleanup, `mermaid.parse()` validation before `mermaid.render()`, and every `setSvg`/`setError` call guarded by `if (!cancelled)`. The critic specifically called out matching this pattern verbatim instead of an incomplete snippet.
- 5 new vitest tests in `web/src/components/wiki/__tests__/MermaidBlock.test.tsx`: happy path, single-fallback-only on invalid chart, two blocks produce two fallbacks (not four — regression test for the stacking symptom), StrictMode double-mount produces single fallback, mermaid v11 error-SVG fallback.

**LLM prompt changes:** none needed. The wiki prompts in `src/beever_atlas/wiki/prompts.py` already restrict mermaid to `graph TD`/`flowchart`; the architect agent verified no `sequenceDiagram`/`erDiagram`/`gantt`/`pie` is ever requested.

**Tests:** Python 58 / 58 in affected areas. Web 536 / 536 across all 60 vitest files. TypeScript clean. Lint clean (warnings pre-existing).
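
The RES-284 change above (timestamped entries in a bounded deque, filtered by age in `snapshot()`) can be sketched roughly as below. This is not the code from `llm_call_log.py`: the `CallLog` class, the explicit `now` parameter, and the inclusive boundary are assumptions made so the example is deterministic.

```python
import time
from collections import deque
from typing import Any, List, Optional

DEFAULT_TTL_SECONDS = 30 * 60  # 30 min default quoted in the message


class CallLog:
    """Hypothetical stand-in for the in-process ring buffer."""

    def __init__(self, maxlen: int = 50) -> None:
        # Each entry is a (timestamp, call) tuple; the deque evicts the oldest at maxlen.
        self._entries: deque = deque(maxlen=maxlen)

    def record(self, call: Any, now: Optional[float] = None) -> None:
        stamp = time.time() if now is None else now
        self._entries.append((stamp, call))

    def snapshot(self, max_age_seconds: float = DEFAULT_TTL_SECONDS,
                 now: Optional[float] = None) -> List[Any]:
        if max_age_seconds < 0:
            # A negative TTL disables filtering, as described for operator debugging.
            return [call for _, call in self._entries]
        current = time.time() if now is None else now
        # Inclusive boundary is an assumption; the message does not specify it.
        return [call for stamp, call in self._entries
                if current - stamp <= max_age_seconds]


log = CallLog()
log.record("old-failure", now=0.0)
log.record("recent-ok", now=1700.0)

print(log.snapshot(now=1801.0))                      # old entry is 1801 s old, past the TTL
print(log.snapshot(max_age_seconds=-1, now=1801.0))  # filtering disabled
```

Filtering at read time keeps the stored buffer intact, so an operator can still pass a larger or negative age to inspect entries the default view hides.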
Constraint: Teams and Telegram listChannels() rely on registry entries populated by webhooks; the ring-buffer TTL pattern doesn't apply there (those are not failure logs)
Rejected: Client-side TTL filter in useRecentLLMCalls.ts | server filter is sufficient; two time sources would diverge under clock skew
Rejected: Wholesale clearing the user-profile cache on every recycle | already wired in via RES-286 — no new code needed
Rejected: LLM prompt constraints on diagram types | prompts already restrict to graph TD/flowchart per audit
Directive: When extending /api/settings/debug/recent-llm-calls, keep the default TTL at 30 min — operator-debugging needs are served by ?max_age_seconds, not by changing the default
Directive: If a future mermaid version exposes a true async-cancellation API, refactor MermaidBlock to use it instead of the cancelled-flag pattern
Confidence: high
Scope-risk: narrow

Co-authored-by: Claude Opus 4.7 (1M context)

* fix(web): channel-sync state isolation + top-nav gate during sync (RES-285) (#187) (#23)

Two bundled UX bugs the QA hit while syncing Mattermost channels. Designed via a full ralplan consensus loop (Planner → Architect → Critic); 2 iterations to APPROVE — the substantive shift in iter 2 was flipping Bug A from a local useEffect reset to a route-level remount key, which fixes FIVE state-leak vectors instead of one.

## Bug A — Cross-channel state leak

Starting a sync on #marketing and switching to #tech-beever-atlas left the previous channel's progress bar visible (and worse, could permanently freeze on the new channel via the lastFingerprintRef dedup guard).

Root cause: the route had no `key` prop, so React Router reused the same ChannelWorkspace instance across :id changes. ALL useState cells in it — syncState, channel, cooldownRemaining, refreshing, loadingChannel — persisted across channel nav.

Fix: new `web/src/routes/ChannelWorkspaceRoute.tsx` — thin wrapper that reads :id via useParams and mounts `ChannelWorkspace` with that id as its React `key`.
The React key change forces an atomic unmount + remount of the entire subtree, so every state cell resets. `App.tsx:88` now routes to the wrapper.

This is structurally correct: channel-scoped state IS keyed to the channel, not the component instance. Fixes the visible progress-bar leak AND four sibling leaks for free (cooldown countdown carry-over, stale channel.name flash on switch, etc.).

## Bug B — Top-nav not gated during sync

Top-nav tabs (Home, Channels, Ask, Activity, Settings) remained clickable while a sync was running, even though leaving mid-sync drops MM ws events. Channel-list switching in the sidebar should stay enabled (channels are isolated now per Bug A); only the top-nav needs the gate.

Fix: new `web/src/contexts/SyncStatusContext.tsx` mirroring the existing AskSessionsContext.tsx pattern. Splits state into TWO useState cells — `isSyncRunning: boolean` and `channelId: string|null` — so React's Object.is bail-out keeps subscribers from re-rendering when publish values are equal (a single-object setState would defeat this; the architect+critic specifically demanded the split).

Publisher in ChannelWorkspace.tsx: a useEffect keyed on `syncState.state` (string, NOT the whole syncState object — prevents per-poll thrash) publishes the narrowed boolean, plus a cleanup that resets the gate to false on unmount.

Subscriber in Sidebar.tsx gates the 4 top-nav NavLinks (Home EXCLUDED — universal escape hatch, intentional invariant from the Home-trap design decision). Triple-defense gate: aria-disabled (a11y), tabIndex={-1} (keyboard tab skip), onClick preventDefault (click + Enter no-op). Tooltip names the syncing channel. The mobile-sheet onClose handler at Sidebar.tsx:149 is preserved via merged onClick.

Gate fires ONLY on `state === "syncing"`. NOT on error (terminal — user needs Settings to recover; gating Settings would trap them) or idle/completed.
## Tests

- 5 SyncStatusContext tests (default value, throws-outside-provider, AC6 render-count discipline, setter stability, channelId carries)
- 3 ChannelWorkspaceRoute tests (renders for current :id, unmount + remount on :id change via useNavigate, returns null without :id)
- 544 / 544 web tests pass (was 536). TypeScript clean. 0 lint errors.

The single most important regression test is the AC6 guard: publishing an already-equal boolean to the context does NOT re-render subscribers. If a future refactor wraps isSyncRunning+channelId back into a single object setState, that test fails noisily — preventing accidentally re-introducing the publisher thrash this design avoids.

Constraint: ChannelWorkspace state is keyed to component instance, not channel — wrong without the route key
Constraint: Sidebar is a sibling subtree from ChannelWorkspace — useSync hook cannot be subscribed twice
Constraint: Zustand and TanStack Query are not in the web dep graph
Rejected: useEffect synchronous reset in useSync.ts | symptom fix; patches one of five leaking state cells, leaves four sibling leaks (cooldown, channel.name, etc.) in place
Rejected: Single-object useState({isSyncRunning, channelId}) | breaks AC6 — fresh object literal every publish defeats Object.is bail-out, consumers re-render on identical publishes
Rejected: Gate on pipelineActive (any non-idle state) | traps user away from Settings during error state with no recovery path
Rejected: Gate Home too | universal escape hatch principle; user must always be able to reach the dashboard
Rejected: Zustand store for sync state | not in dep graph; the React Context pattern is already used by AskSessionsContext
Directive: If a future state cell is added to ChannelWorkspace, it automatically resets on channel switch — no per-cell reset code needed (this is the "structural cause fix" property)
Directive: Publisher useEffect dep array MUST be primitives only.
DO NOT add syncState (the whole object) — would thrash on every poll tick
Confidence: high
Scope-risk: narrow
Not-tested: modifier-click (Cmd/Ctrl+click) on a gated NavLink opens in a new tab and bypasses the gate visually — accepted as new-tab boots a fresh app context with no state leak

Co-authored-by: Claude Opus 4.7 (1M context)

* fix(web): RES-285 follow-ups — sidebar indicator, collapsed monitor, wiki almost-ready, mermaid orphan reaper (#188)

* fix(web): RES-285 follow-ups — sidebar sync indicator, collapsed monitor default, wiki "almost ready", mermaid orphan reaper

Five small UX/correctness follow-ups discovered during QA of RES-285 (PR #187). All five share the SyncStatusContext + ChannelWorkspaceRoute infrastructure that PR landed, so they ship as a single bundle.

## 1. Sidebar sync indicator (WorkspaceGroup.tsx)

Subscribers to `useSyncStatus()` now paint a pulsing primary-color dot (replacing the wiki-state icon) + bold channel name + "Syncing now…" tooltip on whichever row matches `syncingChannelId`. Closes the "top-nav greyed out but I don't know WHICH channel" gap.

## 2. Sync monitor collapsed by default (ChannelWorkspace.tsx)

`monitorCollapsed` initial state flipped from `false` to `true`. The existing `localStorage` key now treats anything-other-than-explicit-"false" as collapsed — new users get the compact view, anyone who previously clicked Expand keeps that preference. SyncProgressV2's existing Expand button is the affordance.

## 3. Wiki tab "Wiki will start shortly" state (WikiTab.tsx)

New empty-state branch when (a) sync + extraction are done (hasMemories=true), (b) `overview_wiki.state === "pending"`, (c) `wiki_maintenance.done === 0`. Previously the user saw "No Wiki Yet" + Generate button even though the AutoOverviewSubscriber was about to fire — misleading. Now: "Wiki will start shortly — auto-overview is queued. You can click Generate to start it now."
Narrowed on `pending` specifically (NOT undefined) so legacy/feature-flag-off backends correctly fall through to the original CTA.

## 4. Publisher widening (ChannelWorkspace.tsx)

RES-285's publisher only fired on `syncState.state === "syncing"`, but `useSync.ts:300-304` shows the backend can return `state: "idle"` with phases `in_flight` (the "warming up" window after dispatch). My narrow check missed that window, so the top-nav gate AND the new sidebar indicator never lit up. Widened to fire on `state === "syncing" || anyPhaseInFlight`. Still excludes `error` per the ralplan decision.

## 5. Mermaid orphan-DOM reaper (wiki/MermaidBlock.tsx + channel/MermaidBlock.tsx)

RES-287/4b's cancellation guard handled the React state side of the StrictMode race, but missed that mermaid v11 leaves a temp div in `document.body` after parse failures. Those orphan divs render the bomb-emoji "Syntax error in text" SVG at the bottom of the page — visible OUTSIDE any React boundary. Reaper tracks every id we ask mermaid to render and removes matching elements (by id + by a textual `Syntax error in text` sweep across direct body children) after every render attempt + on unmount. Applied to both MermaidBlock implementations for consistency.

## Why all in one PR

- All five share the SyncStatusContext or RES-287 surface PR #187/186 created.
- Total production diff: ~120 LOC; all additive.
- None of them are independent bug reports — they're refinements caught during QA hands-on of the already-merged fixes.

Tests: 11 / 11 MermaidBlock vitest tests still pass. TypeScript clean. The existing SyncStatusContext + ChannelWorkspaceRoute tests already guard the publisher/subscriber/key-remount paths; the widening at #4 only relaxes the publisher's `true` condition (still narrows strictly to "sync is actually running"), so the AC6 render-count discipline is preserved.
Constraint: useSync may return `state: "idle"` with phases in_flight (verified at useSync.ts:300-304) Constraint: Mermaid v11 does NOT reliably clean up temp body divs after parse failures; cancellation guard alone isn't enough — DOM-level cleanup is required Rejected: Single-string `state === "syncing"` for publisher | misses the in_flight-only window where sync is actively running Rejected: Auto-trigger wiki generation after extract | backend AutoOverviewSubscriber already does this when the feature flag is on; FE just needs to communicate "queued" state Rejected: Remove the user's Expand preference on monitorCollapsed flip | breaks anyone who previously clicked Expand Directive: If a future mermaid version exposes a `cleanup(id)` or accepts an explicit container, switch to that and remove the textual body sweep Directive: When adding new state cells to ChannelWorkspace that should reset on channel switch, no code is needed — the route remount key handles it (RES-285 invariant) Confidence: high Scope-risk: narrow Co-Authored-By: Claude Opus 4.7 (1M context) * fix(web): support concurrent channel syncs + global poller for sidebar indicator User-observed gap after the initial RES-285 follow-ups landed: the sidebar indicator only lit up the syncing channel WHILE the user was on its own page. Navigate away and the indicator vanished — defeating the point of an at-a-glance "what's syncing?" signal. The architectural fix is to move the sync-status state from per-channel publisher to a global tracker that the Provider itself maintains, so the signal survives the channel's own ChannelWorkspace unmounting. Also forward-fixes a missed architectural requirement: the system must support MULTIPLE concurrent syncs across channels. The previous `channelId: string | null` design assumed at most one at a time; swapped for `syncingChannels: Set<string>` so the FE no longer constrains backend concurrency.
## Changes `web/src/contexts/SyncStatusContext.tsx`: - State shape: `syncingChannels: ReadonlySet<string>` (was a single id). Consumers derive `isSyncRunning = size > 0` and per-channel checks via `.has()`. - Public API: `claim(channelId)` and `release(channelId)` — both idempotent, both referentially stable via `useCallback([])`. Set identity is preserved when claim/release is a no-op so consumers don't re-render unnecessarily (AC6 discipline preserved). - New: background poller `useEffect` that polls `/api/channels/{id}/sync/status` for every tracked id every 5s. Releases ids when the backend reports no active sync. Stops entirely when the set is empty. Survives ChannelWorkspace mount/unmount across navigation. `web/src/pages/ChannelWorkspace.tsx`: - Publisher protocol: `claim(id)` when sync is running here, `release(id)` otherwise. No unmount cleanup — the Provider's poller is the authoritative release path. Channels' publishers never touch each other's slots, supporting concurrent syncs. `web/src/components/layout/Sidebar.tsx`: - Derives `isSyncRunning = syncingChannels.size > 0`. - Tooltip now reflects count when multiple syncs are active. `web/src/components/channel/WorkspaceGroup.tsx`: - Row indicator: `syncingChannels.has(ch.channel_id)` instead of an equality check against a single id — so every concurrent-sync row lights up, not just one. `web/src/contexts/__tests__/SyncStatusContext.test.tsx`: - Full rewrite for the new API. 7 tests: default empty set; throws outside provider; claim/release lifecycle; multi-channel support (3 concurrent claims, partial release); idempotent claim (no re-render); idempotent release; setter referential stability across renders. Tests: 546 / 546 web tests pass (was 544; 2 net new tests on the SyncStatusContext file). TypeScript clean.
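The idempotent claim/release behaviour described above, where a no-op returns the same Set instance so consumers do not re-render, can be sketched with two pure helpers. The names `claimChannel` and `releaseChannel` are hypothetical; the real Provider presumably wraps equivalent logic in a state updater, which this sketch does not show.

```typescript
// Illustrative helpers, not the actual SyncStatusContext code.
// Each returns the SAME Set instance when nothing changes, which is what
// lets a React state setter bail out without re-rendering consumers.
function claimChannel(
  current: ReadonlySet<string>,
  channelId: string,
): ReadonlySet<string> {
  if (current.has(channelId)) return current; // already tracked: no-op
  const next = new Set(current);
  next.add(channelId);
  return next;
}

function releaseChannel(
  current: ReadonlySet<string>,
  channelId: string,
): ReadonlySet<string> {
  if (!current.has(channelId)) return current; // not tracked: no-op
  const next = new Set(current);
  next.delete(channelId);
  return next;
}

// Usage: two channels sync concurrently, then one finishes.
let tracked: ReadonlySet<string> = new Set<string>();
tracked = claimChannel(tracked, "C1");
tracked = claimChannel(tracked, "C2");
tracked = releaseChannel(tracked, "C1");
const isSyncRunning = tracked.size > 0; // derived from the Set, not stored separately
```

Deriving `isSyncRunning` from the Set rather than holding it in a second state cell follows the directive that the Set stays the single source of truth.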
Constraint: ChannelWorkspace's publisher unmounts on channel navigation, so it cannot be the authoritative source of "sync ended" Constraint: Backend may relax single-sync constraint in the future; FE state model must not assume one-at-a-time Rejected: Keep `channelId: string | null` and add a "shadow" persisted field for nav survival | doubles state + creates a sync problem between the two cells; Set-based is cleaner Rejected: Single-channel mode toggled by feature flag | speculative generality, adds branching with no current benefit Directive: When adding new state cells to SyncStatusContext, derive from the Set rather than adding parallel state; the Set is the canonical source of truth Confidence: high Scope-risk: narrow Not-tested: the poller's behaviour against a real 404 / channel-deleted response — accepted as a follow-up Co-Authored-By: Claude Opus 4.7 (1M context) * revert(web): drop top-nav gate during sync — sidebar indicator is enough Product decision: gating top-nav links on sync was paternalistic. Locking the user out of Settings / Activity / Channels during a background sync prevents legitimate parallel work and offers no real safety benefit — the bot keeps syncing regardless of which page the user is on. The sidebar row indicator (pulsing dot + bold name on every syncing channel) gives the user the awareness signal they need. They can choose to navigate back to the syncing channel when they want progress detail; the indicator points the way.
What stays: - SyncStatusContext (still the source of truth for which channels are syncing; the indicator depends on it) - ChannelWorkspace publisher (claim/release) - Provider's background poller (releases stale ids) - WorkspaceGroup pulsing-dot indicator What's removed from Sidebar.tsx: - `useSyncStatus()` destructuring + `isSyncRunning` derivation - `gateTooltipText` computation - The `gated` flag in the NavLink map - `aria-disabled`, `tabIndex={-1}`, `onClick preventDefault` - The "Sync in progress — wait for completion" tooltip - The disabled visual styling - The Tooltip wrapper for gated rows when sidebar is expanded Tests: 546 / 546 web tests still pass. TypeScript clean. No SyncStatusContext API change — only the Sidebar consumer goes back to its pre-RES-285 simple form. Constraint: User-reported requirement — "we don't need to lock other tabs" Rejected: Soft-gate (visual warning without click prevention) | adds UX inconsistency for no clear benefit; the row indicator already conveys the same info more visibly Directive: Don't reintroduce top-nav gating without explicit product buy-in; the row indicator is the canonical awareness UX Confidence: high Scope-risk: narrow Co-Authored-By: Claude Opus 4.7 (1M context) --------- Co-authored-by: Claude Opus 4.7 (1M context) * fix(web): dismissable stale sync-failure banner Backend returns the last failure on /sync/status until a newer sync succeeds. For channels where the user doesn't want to retry — e.g. the failure is a stale artifact from the RES-286 bot-outage era — the red banner sits visible forever. User-reported: "It's the old record, how to dismiss it?" 
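The per-channel dismissal this commit introduces (its storage key and signature format are spelled out in the list that follows) can be sketched roughly as below. This is a sketch under assumptions: the helper names and the `KeyValueStore` interface are hypothetical, chosen so the logic runs outside a browser; in the app the store would presumably be `window.localStorage`.

```typescript
// Minimal storage surface; window.localStorage satisfies it structurally.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Key format taken from the commit message.
const dismissalKey = (channelId: string): string =>
  `beever.sync-failure-dismissed.${channelId}`;

// Signature = job id + first 200 chars of the message, so a new job id or
// different copy yields a different signature and the banner reappears.
function failureSignature(jobId: string, message: string): string {
  return `${jobId}|${message.slice(0, 200)}`;
}

function dismissFailure(
  store: KeyValueStore,
  channelId: string,
  signature: string,
): void {
  try {
    store.setItem(dismissalKey(channelId), signature);
  } catch {
    // e.g. quota exceeded: the caller can fall back to in-memory state
  }
}

function isFailureDismissed(
  store: KeyValueStore,
  channelId: string,
  signature: string,
): boolean {
  try {
    return store.getItem(dismissalKey(channelId)) === signature;
  } catch {
    return false; // storage unavailable: treat as not dismissed
  }
}
```

Comparing the stored value against the full current signature, rather than checking only that a key exists, is what keeps a stale dismissal from hiding a genuinely new failure.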
Adds a per-channel dismiss UX: - X button on the failure banner (NOT the cooldown banner — cooldown is time-bounded informational state) - Dismissal stored in localStorage keyed by channel id: `beever.sync-failure-dismissed.{channel_id}` = "{job_id}|{message}" - Signature uses job_id + first 200 chars of message so a NEW failure (different job_id or different copy) brings the banner back. Same failure (same job_id) stays hidden. - Re-hydrates from storage on channel switch (state survives the ChannelWorkspace remount-key cycle from RES-285). Tests: 546 / 546 web tests pass. TypeScript clean. Constraint: Cooldown messages must remain visible (time-bounded info, not noise) Rejected: Backend auto-clear of stale failure | broader change, also gates on successful sync only — doesn't help channels the user is intentionally NOT resyncing Rejected: Global "dismiss all" toggle | leakier UX; per-channel + per-signature is what the user actually needs Directive: When the failure copy changes (new job_id or new message), the dismiss does NOT carry — a fresh banner appears. 
This is intentional; don't broaden the signature to "any failure on this channel" or stale dismissals will hide real new problems Confidence: high Scope-risk: narrow Not-tested: localStorage quota-exceeded (silently degrades to in- memory dismissal — acceptable) Co-Authored-By: Claude Opus 4.7 (1M context) --------- Co-authored-by: Claude Opus 4.7 (1M context) --- .claude/commands/opsx/apply.md | 152 ++ .claude/commands/opsx/archive.md | 157 ++ .claude/commands/opsx/explore.md | 173 ++ .claude/commands/opsx/ff.md | 97 + .claude/commands/opsx/propose.md | 106 + .claude/skills/openspec-apply-change/SKILL.md | 156 ++ .../skills/openspec-archive-change/SKILL.md | 114 + .claude/skills/openspec-explore/SKILL.md | 288 +++ .claude/skills/openspec-ff-change/SKILL.md | 101 + .claude/skills/openspec-propose/SKILL.md | 110 + .github/workflows/codeql.yml | 13 +- .github/workflows/deploy.yml | 38 + .github/workflows/trigger-docs-rebuild.yml | 18 + docs/Beever_Atlas_Feature_Spec.docx | Bin 0 -> 35075 bytes docs/qa/tool-audit-2026-Q2.md | 407 ++++ docs/v1-archive/ARCHITECTURE_OVERVIEW.md | 788 ++++++ .../ARCHITECTURE_OVERVIEW_V2_MONOLITH.md | 1288 ++++++++++ docs/v1-archive/PROJECT_ANALYSIS.md | 977 ++++++++ .../v1-archive/RETRIEVAL_IMPROVEMENT_IDEAS.md | 711 ++++++ .../v1-archive/TECHNICAL_PROPOSAL_MONOLITH.md | 2105 +++++++++++++++++ docs/v2/01-architecture-overview.md | 229 ++ docs/v2/02-semantic-memory.md | 273 +++ docs/v2/03-graph-memory.md | 328 +++ docs/v2/04-query-router.md | 270 +++ docs/v2/05-ingestion-pipeline.md | 436 ++++ docs/v2/06-wiki-generation.md | 861 +++++++ docs/v2/07-deployment.md | 247 ++ docs/v2/08-resilience.md | 204 ++ docs/v2/09-observability.md | 125 + docs/v2/10-access-control.md | 91 + docs/v2/11-frontend-design.md | 709 ++++++ docs/v2/12-api-design.md | 492 ++++ docs/v2/13-adk-integration.md | 274 +++ docs/v2/README.md | 53 + docs/v2/current-architecture-diagram.md | 432 ++++ docs/v2/decisions.md | 64 + docs/v2/memory-architecture.md | 273 +++ 
docs/v2/reference-papers.md | 122 + docs/v2/weakness-resolution-map.md | 586 +++++ .../.openspec.yaml | 2 + .../ingestion-pipeline-hardening/design.md | 123 + .../ingestion-pipeline-hardening/proposal.md | 35 + .../specs/coreference-resolution/spec.md | 34 + .../specs/cross-batch-thread-context/spec.md | 30 + .../specs/multimodal-expansion/spec.md | 60 + .../specs/semantic-entity-dedup/spec.md | 41 + .../specs/semantic-search/spec.md | 30 + .../specs/soft-orphan-handling/spec.md | 41 + .../specs/temporal-fact-lifecycle/spec.md | 34 + .../ingestion-pipeline-hardening/tasks.md | 77 + .../m1-skeleton-health-pulse/.openspec.yaml | 2 + .../m1-skeleton-health-pulse/design.md | 62 + .../m1-skeleton-health-pulse/proposal.md | 36 + .../specs/adk-foundation/spec.md | 38 + .../specs/bot-placeholder/spec.md | 30 + .../specs/frontend-shell/spec.md | 77 + .../specs/health-endpoint/spec.md | 45 + .../specs/memories-browser/spec.md | 62 + .../specs/project-scaffold/spec.md | 59 + .../changes/m1-skeleton-health-pulse/tasks.md | 61 + .../m2-chatbot-echo-query/.openspec.yaml | 2 + .../changes/m2-chatbot-echo-query/design.md | 80 + .../changes/m2-chatbot-echo-query/proposal.md | 32 + .../specs/adk-echo-agent/spec.md | 33 + .../specs/ask-endpoint/spec.md | 48 + .../specs/channel-workspace/spec.md | 59 + .../specs/chat-bot/spec.md | 59 + .../specs/normalized-message/spec.md | 106 + .../changes/m2-chatbot-echo-query/tasks.md | 65 + .../messages-tab-enhancement/.openspec.yaml | 2 + .../messages-tab-enhancement/design.md | 55 + .../messages-tab-enhancement/proposal.md | 31 + .../message-display-enhancements/spec.md | 59 + .../specs/message-filtering/spec.md | 44 + .../specs/message-pagination/spec.md | 45 + .../changes/messages-tab-enhancement/tasks.md | 55 + .../.openspec.yaml | 2 + .../multi-workspace-connections/design.md | 98 + .../multi-workspace-connections/proposal.md | 31 + .../specs/multi-connection-backend/spec.md | 99 + .../specs/multi-connection-bot/spec.md | 124 + 
.../specs/multi-connection-frontend/spec.md | 61 + .../multi-workspace-connections/tasks.md | 86 + .../.openspec.yaml | 2 + .../oss-cla-copyright-assignment/design.md | 117 + .../oss-cla-copyright-assignment/proposal.md | 57 + .../specs/copyright-posture/spec.md | 80 + .../oss-cla-copyright-assignment/tasks.md | 18 + .../.openspec.yaml | 2 + .../res-177-p0-quality-hardening/design.md | 203 ++ .../res-177-p0-quality-hardening/proposal.md | 115 + .../specs/backend-test-baseline/spec.md | 66 + .../specs/bot-bridge-decomposition/spec.md | 79 + .../specs/bot-dependency-pinning/spec.md | 33 + .../specs/ci-quality-gates/spec.md | 61 + .../specs/container-supply-chain/spec.md | 41 + .../specs/docs-env-hygiene/spec.md | 68 + .../specs/web-test-harness/spec.md | 38 + .../res-177-p0-quality-hardening/tasks.md | 106 + openspec/config.yaml | 1 + scripts/deploy/.gitignore | 1 + scripts/deploy/README.md | 53 + scripts/deploy/bootstrap.sh | 58 + scripts/deploy/deploy.sh | 112 + scripts/deploy/destroy.sh | 41 + scripts/deploy/provision.sh | 133 ++ scripts/deploy/ssh.sh | 9 + scripts/deploy/start.sh | 10 + scripts/deploy/stop.sh | 10 + .../__tests__/AgentModelsTab.test.tsx | 16 +- web/src/pages/ChannelWorkspace.tsx | 73 +- 111 files changed, 17713 insertions(+), 13 deletions(-) create mode 100644 .claude/commands/opsx/apply.md create mode 100644 .claude/commands/opsx/archive.md create mode 100644 .claude/commands/opsx/explore.md create mode 100644 .claude/commands/opsx/ff.md create mode 100644 .claude/commands/opsx/propose.md create mode 100644 .claude/skills/openspec-apply-change/SKILL.md create mode 100644 .claude/skills/openspec-archive-change/SKILL.md create mode 100644 .claude/skills/openspec-explore/SKILL.md create mode 100644 .claude/skills/openspec-ff-change/SKILL.md create mode 100644 .claude/skills/openspec-propose/SKILL.md create mode 100644 .github/workflows/deploy.yml create mode 100644 .github/workflows/trigger-docs-rebuild.yml create mode 100644 
docs/Beever_Atlas_Feature_Spec.docx create mode 100644 docs/qa/tool-audit-2026-Q2.md create mode 100644 docs/v1-archive/ARCHITECTURE_OVERVIEW.md create mode 100644 docs/v1-archive/ARCHITECTURE_OVERVIEW_V2_MONOLITH.md create mode 100644 docs/v1-archive/PROJECT_ANALYSIS.md create mode 100644 docs/v1-archive/RETRIEVAL_IMPROVEMENT_IDEAS.md create mode 100644 docs/v1-archive/TECHNICAL_PROPOSAL_MONOLITH.md create mode 100644 docs/v2/01-architecture-overview.md create mode 100644 docs/v2/02-semantic-memory.md create mode 100644 docs/v2/03-graph-memory.md create mode 100644 docs/v2/04-query-router.md create mode 100644 docs/v2/05-ingestion-pipeline.md create mode 100644 docs/v2/06-wiki-generation.md create mode 100644 docs/v2/07-deployment.md create mode 100644 docs/v2/08-resilience.md create mode 100644 docs/v2/09-observability.md create mode 100644 docs/v2/10-access-control.md create mode 100644 docs/v2/11-frontend-design.md create mode 100644 docs/v2/12-api-design.md create mode 100644 docs/v2/13-adk-integration.md create mode 100644 docs/v2/README.md create mode 100644 docs/v2/current-architecture-diagram.md create mode 100644 docs/v2/decisions.md create mode 100644 docs/v2/memory-architecture.md create mode 100644 docs/v2/reference-papers.md create mode 100644 docs/v2/weakness-resolution-map.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/.openspec.yaml create mode 100644 openspec/changes/ingestion-pipeline-hardening/design.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/proposal.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/specs/coreference-resolution/spec.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/specs/cross-batch-thread-context/spec.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/specs/multimodal-expansion/spec.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/specs/semantic-entity-dedup/spec.md create mode 100644 
openspec/changes/ingestion-pipeline-hardening/specs/semantic-search/spec.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/specs/soft-orphan-handling/spec.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/specs/temporal-fact-lifecycle/spec.md create mode 100644 openspec/changes/ingestion-pipeline-hardening/tasks.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/.openspec.yaml create mode 100644 openspec/changes/m1-skeleton-health-pulse/design.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/proposal.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/specs/adk-foundation/spec.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/specs/bot-placeholder/spec.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/specs/frontend-shell/spec.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/specs/health-endpoint/spec.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/specs/memories-browser/spec.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/specs/project-scaffold/spec.md create mode 100644 openspec/changes/m1-skeleton-health-pulse/tasks.md create mode 100644 openspec/changes/m2-chatbot-echo-query/.openspec.yaml create mode 100644 openspec/changes/m2-chatbot-echo-query/design.md create mode 100644 openspec/changes/m2-chatbot-echo-query/proposal.md create mode 100644 openspec/changes/m2-chatbot-echo-query/specs/adk-echo-agent/spec.md create mode 100644 openspec/changes/m2-chatbot-echo-query/specs/ask-endpoint/spec.md create mode 100644 openspec/changes/m2-chatbot-echo-query/specs/channel-workspace/spec.md create mode 100644 openspec/changes/m2-chatbot-echo-query/specs/chat-bot/spec.md create mode 100644 openspec/changes/m2-chatbot-echo-query/specs/normalized-message/spec.md create mode 100644 openspec/changes/m2-chatbot-echo-query/tasks.md create mode 100644 openspec/changes/messages-tab-enhancement/.openspec.yaml create mode 100644 
openspec/changes/messages-tab-enhancement/design.md create mode 100644 openspec/changes/messages-tab-enhancement/proposal.md create mode 100644 openspec/changes/messages-tab-enhancement/specs/message-display-enhancements/spec.md create mode 100644 openspec/changes/messages-tab-enhancement/specs/message-filtering/spec.md create mode 100644 openspec/changes/messages-tab-enhancement/specs/message-pagination/spec.md create mode 100644 openspec/changes/messages-tab-enhancement/tasks.md create mode 100644 openspec/changes/multi-workspace-connections/.openspec.yaml create mode 100644 openspec/changes/multi-workspace-connections/design.md create mode 100644 openspec/changes/multi-workspace-connections/proposal.md create mode 100644 openspec/changes/multi-workspace-connections/specs/multi-connection-backend/spec.md create mode 100644 openspec/changes/multi-workspace-connections/specs/multi-connection-bot/spec.md create mode 100644 openspec/changes/multi-workspace-connections/specs/multi-connection-frontend/spec.md create mode 100644 openspec/changes/multi-workspace-connections/tasks.md create mode 100644 openspec/changes/oss-cla-copyright-assignment/.openspec.yaml create mode 100644 openspec/changes/oss-cla-copyright-assignment/design.md create mode 100644 openspec/changes/oss-cla-copyright-assignment/proposal.md create mode 100644 openspec/changes/oss-cla-copyright-assignment/specs/copyright-posture/spec.md create mode 100644 openspec/changes/oss-cla-copyright-assignment/tasks.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/.openspec.yaml create mode 100644 openspec/changes/res-177-p0-quality-hardening/design.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/proposal.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/specs/backend-test-baseline/spec.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/specs/bot-bridge-decomposition/spec.md create mode 100644 
openspec/changes/res-177-p0-quality-hardening/specs/bot-dependency-pinning/spec.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/specs/ci-quality-gates/spec.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/specs/container-supply-chain/spec.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/specs/docs-env-hygiene/spec.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/specs/web-test-harness/spec.md create mode 100644 openspec/changes/res-177-p0-quality-hardening/tasks.md create mode 100644 openspec/config.yaml create mode 100644 scripts/deploy/.gitignore create mode 100644 scripts/deploy/README.md create mode 100755 scripts/deploy/bootstrap.sh create mode 100755 scripts/deploy/deploy.sh create mode 100755 scripts/deploy/destroy.sh create mode 100755 scripts/deploy/provision.sh create mode 100755 scripts/deploy/ssh.sh create mode 100755 scripts/deploy/start.sh create mode 100755 scripts/deploy/stop.sh diff --git a/.claude/commands/opsx/apply.md b/.claude/commands/opsx/apply.md new file mode 100644 index 00000000..bf23721d --- /dev/null +++ b/.claude/commands/opsx/apply.md @@ -0,0 +1,152 @@ +--- +name: "OPSX: Apply" +description: Implement tasks from an OpenSpec change (Experimental) +category: Workflow +tags: [workflow, artifacts, experimental] +--- + +Implement tasks from an OpenSpec change. + +**Input**: Optionally specify a change name (e.g., `/opsx:apply add-auth`). If omitted, check if it can be inferred from conversation context. If vague or ambiguous you MUST prompt for available changes. + +**Steps** + +1. **Select the change** + + If a name is provided, use it. 
Otherwise: + - Infer from conversation context if the user mentioned a change + - Auto-select if only one active change exists + - If ambiguous, run `openspec list --json` to get available changes and use the **AskUserQuestion tool** to let the user select + + Always announce: "Using change: " and how to override (e.g., `/opsx:apply `). + +2. **Check status to understand the schema** + ```bash + openspec status --change "" --json + ``` + Parse the JSON to understand: + - `schemaName`: The workflow being used (e.g., "spec-driven") + - Which artifact contains the tasks (typically "tasks" for spec-driven, check status for others) + +3. **Get apply instructions** + + ```bash + openspec instructions apply --change "" --json + ``` + + This returns: + - Context file paths (varies by schema) + - Progress (total, complete, remaining) + - Task list with status + - Dynamic instruction based on current state + + **Handle states:** + - If `state: "blocked"` (missing artifacts): show message, suggest using `/opsx:continue` + - If `state: "all_done"`: congratulate, suggest archive + - Otherwise: proceed to implementation + +4. **Read context files** + + Read the files listed in `contextFiles` from the apply instructions output. + The files depend on the schema being used: + - **spec-driven**: proposal, specs, design, tasks + - Other schemas: follow the contextFiles from CLI output + +5. **Show current progress** + + Display: + - Schema being used + - Progress: "N/M tasks complete" + - Remaining tasks overview + - Dynamic instruction from CLI + +6. 
**Implement tasks (loop until done or blocked)** + + For each pending task: + - Show which task is being worked on + - Make the code changes required + - Keep changes minimal and focused + - Mark task complete in the tasks file: `- [ ]` → `- [x]` + - Continue to next task + + **Pause if:** + - Task is unclear → ask for clarification + - Implementation reveals a design issue → suggest updating artifacts + - Error or blocker encountered → report and wait for guidance + - User interrupts + +7. **On completion or pause, show status** + + Display: + - Tasks completed this session + - Overall progress: "N/M tasks complete" + - If all done: suggest archive + - If paused: explain why and wait for guidance + +**Output During Implementation** + +``` +## Implementing: (schema: ) + +Working on task 3/7: +[...implementation happening...] +✓ Task complete + +Working on task 4/7: +[...implementation happening...] +✓ Task complete +``` + +**Output On Completion** + +``` +## Implementation Complete + +**Change:** +**Schema:** +**Progress:** 7/7 tasks complete ✓ + +### Completed This Session +- [x] Task 1 +- [x] Task 2 +... + +All tasks complete! You can archive this change with `/opsx:archive`. +``` + +**Output On Pause (Issue Encountered)** + +``` +## Implementation Paused + +**Change:** +**Schema:** +**Progress:** 4/7 tasks complete + +### Issue Encountered + + +**Options:** +1.
{isMember && ( <> - {displayFailureMessage && ( + {displayFailureMessage && !failureDismissed && (
- {isCoolingDown ? displayFailureMessage : `Sync failed: ${displayFailureMessage}`} + + {isCoolingDown ? displayFailureMessage : `Sync failed: ${displayFailureMessage}`} + + {/* Dismiss is only offered for the failure banner — cooldown + timers are time-bounded informational state and shouldn't + hide. Per-channel localStorage means a NEW failure (new + job_id) brings the banner back. */} + {!isCoolingDown && failureSignature && ( + + )}
)} {syncCompletedWithNoNew && ( From e8ea324e0600257f337eb6409819c430642db3d8 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 18 May 2026 20:16:36 +0000 Subject: [PATCH 2/2] chore(deps): bump brace-expansion from 5.0.5 to 5.0.6 in /web Bumps [brace-expansion](https://github.com/juliangruber/brace-expansion) from 5.0.5 to 5.0.6. - [Release notes](https://github.com/juliangruber/brace-expansion/releases) - [Commits](https://github.com/juliangruber/brace-expansion/compare/v5.0.5...v5.0.6) --- updated-dependencies: - dependency-name: brace-expansion dependency-version: 5.0.6 dependency-type: indirect ... Signed-off-by: dependabot[bot] --- web/package-lock.json | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/web/package-lock.json b/web/package-lock.json index 5f701788..20077110 100644 --- a/web/package-lock.json +++ b/web/package-lock.json @@ -3470,9 +3470,9 @@ } }, "node_modules/brace-expansion": { - "version": "5.0.5", - "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-5.0.5.tgz", - "integrity": "sha512-VZznLgtwhn+Mact9tfiwx64fA9erHH/MCXEUfB/0bX/6Fz6ny5EGTXYltMocqg4xFAQZtnO3DHWWXi8RiuN7cQ==", + "version": "5.0.6", + "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-5.0.6.tgz", + "integrity": "sha512-kLpxurY4Z4r9sgMsyG0Z9uzsBlgiU/EFKhj/h91/8yHu0edo7XuixOIH3VcJ8kkxs6/jPzoI6U9Vj3WqbMQ94g==", "license": "MIT", "dependencies": { "balanced-match": "^4.0.2"