
feat(teams): Graph-based ingestion (no @mention required)#206

Merged
alan5543 merged 6 commits into main from feat/teams-graph-ingestion
Jun 1, 2026

Conversation

@alan5543
Member

Summary

  • Rewires the Teams bridge to be a read-only ingestion adapter on top of Microsoft Graph, matching Slack/Discord/Mattermost — channels and message history are fetched via the platform API without needing the bot to be @mentioned.
  • The root unblocker was a missing await newBot.initialize() in ChatManager.rebuild — the Chat SDK was deferring adapter init until the first inbound webhook, so every bridge-driven Graph read returned empty/403 until someone @mentioned the bot. Everything else stacks on top: Graph channel enumeration, channel-id encoding fixes, channelContext write-back, HTML-entity decode, system/deleted message filter, since/before filtering, backward Graph pagination, MSAL token pre-warm, resilient thread.post, sanitized error logging.
  • Connection must use appType=SingleTenant (MultiTenant triggers MSAL missing_tenant_id_error for client_credentials). The bot's AAD app needs Channel.ReadBasic.All Graph application permission with Global Admin consent (Application Administrator role cannot consent Microsoft Graph app permissions — known Microsoft limit).

Test plan

  • tsc --noEmit clean
  • Bridge unit tests 7/7 pass
  • GET /api/connections/{teams-conn}/channels returns enumerated channels immediately after docker compose restart bot — no @mention required
  • GET /bridge/connections/{conn}/channels/{tech-discussion}/messages returns 4 distinct user messages, clean text (no &nbsp;)
  • GET /bridge/connections/{conn}/channels/{beever-atlas-test}/messages returns 0 (system/install events filtered)
  • Channel ids passed in percent-encoded form (Teams' :/@) work on legacy, platform-prefixed, and connection-scoped bridge routes
  • since=<future ISO> → 0 messages, since=<past ISO> → all, before=<past> → 0
  • GET …/count agrees with len(messages) for both Teams channels (shared isUserMessage predicate)
  • Per-fetch latency at the Microsoft Graph floor (~1.5s warm, no extra Graph calls in the bridge)
  • Slack/Discord/Mattermost regression: connection counts, channel counts, sample fetches unchanged
  • console.error/warn in the 6 in-scope call sites use safeErrMsg(e) — no raw error objects (which could carry MSAL secrets or Bot-Connector tokens) reach stdout
  • Live @mention reply round-trip — intentionally not tested: Teams is fetch-only by product design; the resilient handler ensures a failed reply doesn't break webhook recording
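
The clean-text check in the test plan depends on the entity decode in normalizeMessage. The following is a minimal sketch of the ordering idea only, assuming a hypothetical `decodeEntities` helper and a small entity table (the real helper and its coverage are not shown in this PR). Decoding `&amp;` last means a double-encoded `&amp;nbsp;` ends up as the literal text `&nbsp;` instead of being decoded a second time.

```typescript
// Hypothetical helper: the name and the entity table are assumptions for illustration.
// Mapping "&nbsp;" to a plain space (rather than U+00A0) is also an assumption.
const NAMED_ENTITIES: ReadonlyArray<readonly [string, string]> = [
  ["&nbsp;", " "],
  ["&lt;", "<"],
  ["&gt;", ">"],
  ["&quot;", '"'],
  ["&#39;", "'"],
];

function decodeEntities(text: string): string {
  let out = text;
  for (const [entity, char] of NAMED_ENTITIES) {
    out = out.split(entity).join(char);
  }
  // "&amp;" goes last: "&amp;nbsp;" becomes the literal "&nbsp;" and is not decoded again.
  return out.split("&amp;").join("&");
}
```

Decoding `&amp;` first would turn `&amp;nbsp;` into `&nbsp;` and then into a space, which is the double-decode the ordering avoids.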

🤖 Generated with Claude Code

alan5543 and others added 6 commits May 30, 2026 00:33
Teams was @mention-gated and frequently broken. This rewires the Teams
bridge to be a read-only ingestion adapter on top of Microsoft Graph,
matching Slack/Discord/Mattermost where channels and history are
fetched via the platform API without needing the bot to be @mentioned.

Bridge core
- Eager `await newBot.initialize()` in ChatManager.rebuild — root
  unblocker; the SDK was deferring adapter init until the first inbound
  webhook, leaving bridge-driven Graph reads broken.
- TeamsBridge.listChannels enumerates via Graph `/teams/{aadGroupId}/channels`
  with a Redis SCAN cold-start that self-heals (guard only sets on
  success; re-scans when the connection has no known teams).
- Authoritative `{teamId, channelId}` write-back into the adapter's
  `teams:channelContext:*` Redis cache after enumeration — fixes a real
  channel-id-poisoning bug where two channels returned identical messages.
- fetchPage decoupled from `serviceUrl` for channel reads (Graph uses
  team-id/channel-id from getChannelContext, not Bot-Connector serviceUrl).
- getChannel encodes the threadId before fetchChannelInfo (silent
  ValidationError was the cause of the UI's "Channel Not Connected" gate).
- decodeChannelSegment applied at every bridge route capturing a channel-id
  — Teams ids contain `:`/`@`; safe no-op for other platforms.
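
A minimal sketch of what a route-level segment decoder could look like, assuming it is a thin wrapper over `decodeURIComponent`. The body below is hypothetical; only the name `decodeChannelSegment` comes from this commit.

```typescript
// Hypothetical body: the real decodeChannelSegment is not shown in this PR.
function decodeChannelSegment(segment: string): string {
  try {
    // Percent-encoded Teams ids (":" as %3A, "@" as %40) are restored here.
    return decodeURIComponent(segment);
  } catch {
    // Malformed escape such as a stray "%": fall back to the raw segment.
    return segment;
  }
}
```

Ids without percent escapes pass through unchanged, which is why the same call would be a no-op on the other platforms' routes.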

Message quality
- HTML-entity decode in normalizeMessage (`&nbsp;` etc.) with correct
  order so double-encoded entities stay inert.
- System / deleted / event messages filtered via shared isUserMessage
  predicate (messageType !== "message" OR deletedDateTime).
- getMessageCount uses the same predicate so counts stay consistent
  with getMessages.
- `since` / `before` timestamp filtering honoured.
- Backward Graph pagination — was paging full history then slicing.
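
The shared predicate and the timestamp window can be sketched as follows. Field names follow the Microsoft Graph chatMessage resource; `filterMessages`, `countMessages`, and the inclusive `since` / exclusive `before` bounds are assumptions for illustration, not the bridge's actual code.

```typescript
// Subset of the Graph chatMessage fields used by the filter.
interface GraphMessage {
  id: string;
  messageType: string;
  createdDateTime: string; // ISO 8601
  deletedDateTime?: string | null;
}

interface TimeRange {
  since?: string;
  before?: string;
}

// Keep only live user messages: drop system/event types and deleted messages.
function isUserMessage(m: GraphMessage): boolean {
  return m.messageType === "message" && !m.deletedDateTime;
}

// Assumed bounds: since is inclusive, before is exclusive.
function filterMessages(messages: GraphMessage[], range: TimeRange = {}): GraphMessage[] {
  const since = range.since ? Date.parse(range.since) : -Infinity;
  const before = range.before ? Date.parse(range.before) : Infinity;
  return messages.filter((m) => {
    if (!isUserMessage(m)) return false;
    const t = Date.parse(m.createdDateTime);
    return t >= since && t < before;
  });
}

// Counting goes through the same filter, so count and list cannot disagree.
function countMessages(messages: GraphMessage[], range: TimeRange = {}): number {
  return filterMessages(messages, range).length;
}
```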

Auth + robustness
- Connection must use appType=SingleTenant; MultiTenant breaks MSAL
  client_credentials with `missing_tenant_id_error`.
- Token pre-warm at startup eliminates the ~1.5–2.5s first-request cold
  MSAL acquisition.
- Resilient thread.post in registerHandlers — a failed reply must not
  throw out of the webhook handler (Teams is fetch-only here; outbound
  reply is defensive only).
- safeErrMsg(e) helper used at the 6 in-scope console.error/warn sites
  so raw error objects (which can carry MSAL secrets or short-lived Bot
  Connector tokens) never reach stdout.

Constraint: Microsoft Graph rejects `$select` on /teams/{id}/channels/{id}/messages (HTTP 400 "Query option 'Select' is not allowed") — documented in code so the next engineer doesn't reattempt the optimisation
Constraint: Channel.ReadBasic.All Graph application permission requires Global Administrator consent; Application Administrator role cannot consent Microsoft Graph app permissions
Rejected: $select to trim payload | Graph endpoint does not support it (HTTP 400)
Rejected: cross-connection backfill from shared Redis channelContext keyspace | multi-tenant isolation risk — discovered teamIds stay scoped to the connection that owns them
Directive: Teams here is fetch-only by product design; the bot does not need to reply to @mentions. Do not couple new logic to webhook-driven ingestion — Graph is the source of truth
Confidence: high
Scope-risk: moderate (bridge.ts +345 lines; Slack/Discord/Mattermost paths verified untouched)
Not-tested: live @mention reply round-trip (intentional — Teams is fetch-only; the resilient handler only ensures a failed post doesn't break webhook recording)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Set is mutated via .add() but never reassigned, so eslint's
prefer-const rule rightly flagged the `let` declaration as an error
under the bot's CI lint gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…gle-adk)

starlette is pulled transitively via fastapi / google-adk / sse-starlette / mcp.
google-adk (even latest 2.1.0) pins `starlette<1.0.0`, and PYSEC-2026-161 is
fixed in starlette 1.0.1 — so the fix is unreachable until google-adk relaxes
its pin. Mirrors the existing PYSEC-2025-183 (pyjwt) handling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CI blocker was a pre-existing flaky web test, unrelated to the Teams
changes: "two MermaidBlocks each produce at most one fallback" asserted
toHaveLength(2) immediately after a waitFor that only waited for >= 1, so
under CI load the second block's async parse→retry→fallback cycle hadn't
settled (got 1, expected 2). Wait for exactly 2 instead, which also encodes
the no-stacking invariant; the dedicated StrictMode test stays the canonical
guard against a single block stacking a second tile.

Also addresses two security-review findings on this PR's new Graph code:
- Validate teamIds read from the shared Redis channelContext cache as AAD
  group GUIDs before they reach graph.call(...{ "team-id" }), so a poisoned
  cache entry can't inject an arbitrary value into a Graph API path.
- Use safeErrorMessage() (message-only, no stack/raw object) for the
  channels.list failure log, consistent with the M6 sanitization elsewhere —
  String(err).slice(0,200) could surface an MSAL/Graph error payload.
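
The GUID check on cache-derived teamIds can be sketched like this, assuming the canonical 8-4-4-4-12 hex shape. `isAadGuid` and `trustedTeamIds` are hypothetical names, not the helpers added in this commit.

```typescript
// Assumed pattern: canonical 8-4-4-4-12 hex GUID, case-insensitive.
const GUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isAadGuid(value: unknown): value is string {
  return typeof value === "string" && GUID_RE.test(value);
}

// Filter values read from the untrusted cache before they are used in a Graph path.
function trustedTeamIds(fromCache: unknown[]): string[] {
  return fromCache.filter(isAadGuid);
}
```

Anything that is not a string of exactly that shape is dropped, so a poisoned entry such as a path fragment never reaches the `team-id` parameter.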

Constraint: GUID validation applied only to the untrusted Redis-scan path; the
  aadGroupId source is an authenticated Bot Framework activity and stays as-is
Rejected: Validate teamIds at the Graph-call site | would also gate the trusted
  Bot Framework path for no benefit
Confidence: high
Scope-risk: narrow
Not-tested: CI re-run of the deflaked test under real load (only local x3)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…og sites

Follow-up to the security review on this PR. Two improvements:

- Collapse the three duplicate error-sanitizers (safeErrMsg in index.ts and
  chat-manager.ts, plus safeErrorMessage in http-utils) down to the single
  http-utils.safeErrorMessage. It is strictly better than the local copies:
  it extracts .message only, collapses whitespace/newlines, and caps at 200
  chars — the local ones raw-sliced to 500 and could pass multi-line output
  (and a stack-bearing String(err)) straight through.

- Apply safeErrorMessage at every remaining console.error/warn that logged a
  raw err/error object across bridge.ts (18 sites), index.ts (6), and
  chat-manager.ts (3). Previously only the 6 "in-scope" Teams sites were
  sanitized; the rest could still surface an MSAL/Graph/Bot-Connector error
  payload (or a stack) into container logs and aggregators.

Structured fields are preserved where they carried signal: the connection-route
warn still prefers `(err as any)?.data?.error`, now wrapped in safeErrorMessage
so the fallback can't leak. The webhook %s format strings stay static literals
(CodeQL js/tainted-format-string) with the sanitized value passed as an arg.
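
A sketch of a sanitizer with the three properties described above (message only, whitespace collapsed, capped at 200 characters). This mirrors the description, not the actual http-utils source; returning a fixed placeholder for non-Error, non-string values is an assumption, and `warnSanitized` is a hypothetical call-site wrapper.

```typescript
const MAX_ERR_LEN = 200;

// Sketch only: extracts a message, never a stack or the raw object.
function safeErrorMessage(err: unknown): string {
  const raw =
    err instanceof Error
      ? err.message
      : typeof err === "string"
        ? err
        : "unknown error"; // assumption: arbitrary objects are never stringified
  return raw.replace(/\s+/g, " ").trim().slice(0, MAX_ERR_LEN);
}

// Hypothetical call site: the format string is a static literal,
// and the sanitized value is passed as an argument.
function warnSanitized(context: string, err: unknown): void {
  console.warn("%s: %s", context, safeErrorMessage(err));
}
```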

Constraint: keep the static %s format strings (CodeQL tainted-format-string fix)
Rejected: per-file local helper | drifts — the 500 vs 200 split already happened
Confidence: high
Scope-risk: narrow
Not-tested: live MSAL error payload shape (logging path only, no behavior change)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…le token

Three Teams-only latency wins (no other platform's code path is touched):

1. Cache the slow Graph channel enumeration. TeamsBridge.listChannels split so
   the live webhook registry is still merged FRESH every call, while the
   expensive part — one ~1–1.5s Graph round-trip per installed team, plus the
   Redis cold-start scan — is wrapped in a 60s TTL cache with in-flight dedup
   and stale-on-error fallback. Mirrors the existing DiscordBridge.channelCache
   pattern. Cached only on a non-empty success, so a team discovered later (via
   webhook) is never masked by a cached empty result.

2. Parallelize per-team enumeration. The per-team `GET /teams/{id}/channels`
   calls now run concurrently (Promise.all) instead of sequentially; each
   team's failure is isolated to [] so one bad team can't fail the set.
   teamIds is bounded by install count (a handful), well under Graph throttle.

3. Re-warmable MSAL token. Extracted the one-shot startup pre-warm into an
   exported warmTeamsGraphToken(adapter) and re-fire it on every adapter
   rebuild via onRebuildComplete. A rebuild (connection change or the 6h
   recycle) creates fresh adapters with an empty token cache — exactly when the
   next fetch would otherwise pay the ~1.5–2.5s cold-acquire. The token is
   shared across ALL Graph reads, so this also speeds the first message fetch.
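
The caching behaviour in item 1 (TTL, in-flight dedup, stale-on-error, cache only on non-empty success) can be sketched as a small generic class. `EnumerationCache` and its injectable clock are hypothetical; the real TeamsBridge and DiscordBridge.channelCache code is not shown here.

```typescript
// Hypothetical cache: names and the injectable clock are assumptions for illustration.
class EnumerationCache<T> {
  private value: T[] | undefined;
  private expiresAt = 0;
  private inFlight: Promise<T[]> | undefined;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(load: () => Promise<T[]>): Promise<T[]> {
    if (this.value !== undefined && this.now() < this.expiresAt) {
      return Promise.resolve(this.value); // fresh hit
    }
    if (this.inFlight) return this.inFlight; // in-flight dedup
    const pending = this.refresh(load);
    this.inFlight = pending;
    const clear = (): void => {
      if (this.inFlight === pending) this.inFlight = undefined;
    };
    pending.then(clear, clear);
    return pending;
  }

  private async refresh(load: () => Promise<T[]>): Promise<T[]> {
    try {
      const fresh = await load();
      if (fresh.length > 0) {
        // Cache only a non-empty success so a later discovery is not masked.
        this.value = fresh;
        this.expiresAt = this.now() + this.ttlMs;
      }
      return fresh;
    } catch (err) {
      if (this.value !== undefined) return this.value; // stale-on-error
      throw err;
    }
  }
}
```

In this sketch the live webhook registry would be merged outside the cache on every call, which matches the constraint that only the Graph enumeration is cached.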

Verified: tsc clean, eslint 0 errors, 187/187 bot tests (4 new for
warmTeamsGraphToken covering the happy path, no-graph no-op, async-reject
swallow, and sync-throw guard). Diff is confined to TeamsBridge + the Teams
token-warm wiring; Slack/Discord/Mattermost/Telegram code is unchanged.
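
The four tested behaviours of the warm-up (happy path, no-graph no-op, async-reject swallow, sync-throw guard) suggest a shape like the one below. The adapter interface and its `graph.getToken` member are hypothetical stand-ins; the real Chat SDK adapter type and token call are not shown in this PR.

```typescript
// Hypothetical adapter shape, for illustration only.
interface WarmableAdapter {
  graph?: { getToken: () => Promise<string> };
}

// Never throws and never rejects: warming is best-effort.
function warmTeamsGraphToken(adapter: WarmableAdapter): Promise<void> {
  try {
    if (!adapter.graph) return Promise.resolve(); // no Graph client: nothing to warm
    return adapter.graph
      .getToken()
      .then(() => undefined)
      .catch(() => undefined); // asynchronous rejection is swallowed
  } catch {
    return Promise.resolve(); // synchronous throw is swallowed too
  }
}
```

Because the function cannot fail, it can be fired from a rebuild hook without affecting the rebuild itself.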

Constraint: registry must stay live — only the Graph enumeration is cached
Constraint: warmTeamsGraphToken must never throw (guards both async + sync)
Rejected: cache the whole merged listChannels result | would delay webhook-
  discovered DMs/channels by up to the TTL
Rejected: bound Promise.all concurrency | install count is tiny; adds no value
Confidence: high
Scope-risk: narrow
Not-tested: live Graph throttling under many installed teams; real MSAL warm timing

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
alan5543 added a commit that referenced this pull request Jun 1, 2026
…UID checks

The wizard's Validate step previously only constructed the Teams adapter
(format check) — a typo'd App ID or wrong secret passed "validation" and
only failed later when channel enumeration silently returned []. Now:

- bridge.ts handleValidateAdapter actually mints a Graph token via MSAL
  and classifies AADSTS / unauthorized_client / invalid_client errors as
  credential failures; non-auth probe failures (403 consent, network)
  soft-accept with a pointer to Channel.ReadBasic.All admin consent.
- ConnectionWizard validates App ID / Tenant ID against the AAD GUID
  shape client-side, renders inline errors, and gates the Validate
  button until they pass.
- Teams is no longer treated as webhook-only: the Channels step renders
  the real Graph-enumerated channel list (depends on #206).
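
The credential-versus-soft-accept split can be sketched as a small classifier. The three markers come from the description above; matching them against the error text, the result type, and the function name are assumptions for illustration.

```typescript
type ProbeResult =
  | { status: "invalid_credentials" }
  | { status: "accepted_with_warning"; hint: string };

// Markers named in the commit message; matching on the error text is an assumption.
const CREDENTIAL_MARKERS: readonly RegExp[] = [
  /AADSTS\d+/i,
  /unauthorized_client/i,
  /invalid_client/i,
];

// Classify a failed token probe: auth errors block validation, everything else soft-accepts.
function classifyProbeFailure(errorText: string): ProbeResult {
  if (CREDENTIAL_MARKERS.some((re) => re.test(errorText))) {
    return { status: "invalid_credentials" };
  }
  return {
    status: "accepted_with_warning",
    hint: "Check that Channel.ReadBasic.All has admin consent.",
  };
}
```

Treating a 403 or a network failure as a soft accept keeps validation usable before consent is granted or the endpoint is live.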

Verified live: a real Teams connection was created through this exact
flow today (channel discovery + message history sync working).

Constraint: validation must not require the messaging endpoint to be live yet
Rejected: server-side-only GUID validation | users deserve inline feedback before a round-trip
Confidence: high
Scope-risk: narrow
Not-tested: MultiTenant validation path against a real multi-org token

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alan5543 alan5543 merged commit dce79d9 into main Jun 1, 2026
9 checks passed
@alan5543 alan5543 deleted the feat/teams-graph-ingestion branch June 1, 2026 04:05
alan5543 added a commit that referenced this pull request Jun 1, 2026
The Connect Microsoft Teams wizard's instructions, credential form,
and post-validation panel previously omitted everything that turns a
real-world Teams setup into a working Beever Atlas connection:

  • The App Type field placeholder was "MultiTenant", which produces
    an MSAL `missing_tenant_id_error` against any tenant-scoped Azure
    Bot. SingleTenant is the supported path, but nothing in the UI
    said so.
  • No mention of the Microsoft Graph `Channel.ReadBasic.All`
    application permission required for the Graph-based channel
    enumeration introduced in #206 — and no warning that ONLY a
    Global Administrator can consent it (Application Administrator
    and Cloud Application Administrator both return
    `Authorization_RequestDenied`).
  • No mention of the messaging endpoint format — users were left
    guessing whether it's `/api/messages`, `/api/teams`, or
    `/api/webhooks/teams`. The bot listens on `/api/teams` (see
    `bot/src/index.ts:422`).
  • No mention of ngrok for local dev, despite the bot needing a
    publicly reachable HTTPS endpoint to receive Bot Framework
    activities.
  • No mention of installing the Teams app package
    (`bot/teams-app/beever-atlas-teams.zip`) — without that step
    the bot never appears in the team.

This change:

  1. Rewrites `TEAMS_INSTRUCTIONS` (8 numbered steps with sub-bullets
     for non-obvious details, ngrok callout, admin-consent gotcha,
     and the .zip install path).
  2. Extends `CredentialField` with optional `enum`, `default`, and
     `hint` so `app_type` renders as a `<select>` (SingleTenant /
     MultiTenant) defaulting to SingleTenant and the
     `app_tenant_id` field carries a "why required" hint. No
     regression for the other four platforms — none of them use the
     new fields.
  3. Replaces the generic Teams branch of `StepWebhookMode` with a
     dedicated `TeamsWebhookMode` panel covering the three concrete
     post-validation steps (endpoint URL, Graph permission, Teams
     app package).
  4. Fixes `bot/README.md` env table to include
     `TEAMS_APP_TENANT_ID` and explain SingleTenant vs MultiTenant.
  5. Fixes `docs/content/getting-started/teams-setup.mdx` env var
     names (the doc previously referenced `TEAMS_TENANT_ID` /
     `TEAMS_CLIENT_ID` / `TEAMS_CLIENT_SECRET`, which are not read
     by the bot) and replaces the historical Graph permissions
     table with the actually-required `Channel.ReadBasic.All`
     (RSC permissions live in the manifest now).
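
For item 2, a sketch of how the optional fields might drive rendering. Only the `enum`, `default`, and `hint` names and the `app_type` values come from this commit message; `key`, `label`, `renderKind`, and `initialValue` are hypothetical.

```typescript
// Illustrative shape: key and label are assumed, the optional fields are from the commit.
interface CredentialField {
  key: string;
  label: string;
  enum?: readonly string[];
  default?: string;
  hint?: string;
}

const APP_TYPE_FIELD: CredentialField = {
  key: "app_type",
  label: "App Type",
  enum: ["SingleTenant", "MultiTenant"],
  default: "SingleTenant",
};

// A field with enum values renders as a <select>; everything else stays an <input>.
function renderKind(field: CredentialField): "select" | "input" {
  return field.enum && field.enum.length > 0 ? "select" : "input";
}

function initialValue(field: CredentialField): string {
  return field.default ?? "";
}
```

Fields that omit all three optional properties behave exactly as before, which is consistent with the no-regression claim for the other platforms.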

Verified:
- `npx tsc --noEmit` clean
- `npx eslint` on `ConnectionWizard.tsx` clean
- Verifier agent confirmed: credential field key round-trip
  (snake → camel) intact, no regression for slack/discord/
  telegram/mattermost CREDENTIAL_FIELDS entries, `<select>` state
  persists across step navigation.

Stacked on top of #206 because the wizard text describes the Graph
channel enumeration introduced there (`TeamsBridge.listChannels`'s
call to `teams.channels.list`). Retarget to `main` once #206 lands.

Confidence: high
Scope-risk: narrow — UI text + one optional schema extension; no
  backend or bot code paths touched.
Directive: keep the `app_type` enum values in lockstep with the
  Chat SDK adapter's `createTeamsAdapter` accepted values. The
  adapter treats anything other than literal "MultiTenant" as
  SingleTenant — so a typo here doesn't fail loudly, it fails
  silently as a missing-tenant-id MSAL error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
alan5543 added a commit that referenced this pull request Jun 1, 2026
* docs(teams): accurate setup wizard + env + Graph permissions

The Connect Microsoft Teams wizard's instructions, credential form,
and post-validation panel previously omitted everything that turns a
real-world Teams setup into a working Beever Atlas connection:

  • The App Type field placeholder was "MultiTenant", which produces
    an MSAL `missing_tenant_id_error` against any tenant-scoped Azure
    Bot. SingleTenant is the supported path, but nothing in the UI
    said so.
  • No mention of the Microsoft Graph `Channel.ReadBasic.All`
    application permission required for the Graph-based channel
    enumeration introduced in #206 — and no warning that ONLY a
    Global Administrator can consent it (Application Administrator
    and Cloud Application Administrator both return
    `Authorization_RequestDenied`).
  • No mention of the messaging endpoint format — users were left
    guessing whether it's `/api/messages`, `/api/teams`, or
    `/api/webhooks/teams`. The bot listens on `/api/teams` (see
    `bot/src/index.ts:422`).
  • No mention of ngrok for local dev, despite the bot needing a
    publicly reachable HTTPS endpoint to receive Bot Framework
    activities.
  • No mention of installing the Teams app package
    (`bot/teams-app/beever-atlas-teams.zip`) — without that step
    the bot never appears in the team.

This change:

  1. Rewrites `TEAMS_INSTRUCTIONS` (8 numbered steps with sub-bullets
     for non-obvious details, ngrok callout, admin-consent gotcha,
     and the .zip install path).
  2. Extends `CredentialField` with optional `enum`, `default`, and
     `hint` so `app_type` renders as a `<select>` (SingleTenant /
     MultiTenant) defaulting to SingleTenant and the
     `app_tenant_id` field carries a "why required" hint. No
     regression for the other four platforms — none of them use the
     new fields.
  3. Replaces the generic Teams branch of `StepWebhookMode` with a
     dedicated `TeamsWebhookMode` panel covering the three concrete
     post-validation steps (endpoint URL, Graph permission, Teams
     app package).
  4. Fixes `bot/README.md` env table to include
     `TEAMS_APP_TENANT_ID` and explain SingleTenant vs MultiTenant.
  5. Fixes `docs/content/getting-started/teams-setup.mdx` env var
     names (the doc previously referenced `TEAMS_TENANT_ID` /
     `TEAMS_CLIENT_ID` / `TEAMS_CLIENT_SECRET`, which are not read
     by the bot) and replaces the historical Graph permissions
     table with the actually-required `Channel.ReadBasic.All`
     (RSC permissions live in the manifest now).

Verified:
- `npx tsc --noEmit` clean
- `npx eslint` on `ConnectionWizard.tsx` clean
- Verifier agent confirmed: credential field key round-trip
  (snake → camel) intact, no regression for slack/discord/
  telegram/mattermost CREDENTIAL_FIELDS entries, `<select>` state
  persists across step navigation.

Stacked on top of #206 because the wizard text describes the Graph
channel enumeration introduced there (`TeamsBridge.listChannels`'s
call to `teams.channels.list`). Retarget to `main` once #206 lands.

Confidence: high
Scope-risk: narrow — UI text + one optional schema extension; no
  backend or bot code paths touched.
Directive: keep the `app_type` enum values in lockstep with the
  Chat SDK adapter's `createTeamsAdapter` accepted values. The
  adapter treats anything other than literal "MultiTenant" as
  SingleTenant — so a typo here doesn't fail loudly, it fails
  silently as a missing-tenant-id MSAL error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(teams-wizard): trim setup steps and webhook panel for scannability

First pass turned every step into a paragraph with parenthetical
rationale and warning callouts; the result rendered as a vertical
wall of prose, monospace sub-bullets full of long sentences, and a
3-card post-validation panel that re-stated info from the setup
list. User feedback: "UX is bad."

This trim:

  • Reduces setup from 8 long steps to 6 single-line imperatives.
    Each step says WHAT to click; rationale ("required so MSAL
    client_credentials can mint the Graph token") is removed —
    users don't need that to follow the click path.
  • Reserves the `details` slot (monospace) for things that should
    actually be monospaced: an enum value, a permission name.
    Prose details made the layout feel like a code listing.
  • Moves the messaging endpoint, ngrok instructions, and Teams
    app `.zip` install OUT of setup and INTO the post-validation
    `TeamsWebhookMode` panel — they happen AFTER credentials
    validate, so putting them in the upfront list both bloated
    setup and skipped the natural workflow break.
  • Drops the redundant Channel.ReadBasic.All card from
    TeamsWebhookMode — it's already in setup step 6, and
    repeating it implied "do this again" rather than "review."
  • TeamsWebhookMode is now 2 cards instead of 3, with shorter
    body copy.

Verified:
- `npx tsc --noEmit` on `web/` clean
- Web image rebuilt and deployed; localhost:3000 returns HTTP 200
- The instruction list now scrolls minimally above the Display
  Name input

Confidence: high
Scope-risk: narrow — UI text only.
Directive: the `details` array on a setup-step instruction renders
  in monospace. Use it for code-shaped values (paths, enum values,
  command snippets), NEVER for prose explanations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(teams-wizard): add messaging-endpoint step + clarify both app types supported

User-reported gaps after the first trim:
  • The setup steps explained creds/Azure but never told users
    HOW to get a public webhook URL (ngrok) or WHERE to enter the
    bot URL — they were left guessing whether we needed it as a
    wizard field. We don't (bot listens on a fixed path), but the
    setup list must say so.
  • The App Type hint read as "we only support SingleTenant" when
    in fact the select offers both and the SDK accepts either.

Changes:
  • Setup step 2 (new): "Expose this bridge over HTTPS, then set
    the Bot's Messaging endpoint to your URL + /api/teams" with
    mono details for `ngrok http 3001` and the URL pattern. This
    surfaces what was previously buried in the post-validation
    panel.
  • Setup step 1 detail updated to "SingleTenant (recommended) or
    MultiTenant" so the choice is visible upfront.
  • App Type hint rewritten to lead with "Both modes are
    supported" — no longer reads as a restriction.
  • TeamsWebhookMode collapses from 2 cards to 1: just the .zip
    app install. The endpoint card moved to the setup list above;
    keeping it here would have been redundant.

Verified:
- `npx tsc --noEmit` on `web/` clean
- Web image rebuilt and deployed

Confidence: high
Scope-risk: narrow — wizard text only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(teams-wizard): mint real Graph token on Validate + client-side GUID checks

The wizard's Validate step previously only constructed the Teams adapter
(format check) — a typo'd App ID or wrong secret passed "validation" and
only failed later when channel enumeration silently returned []. Now:

- bridge.ts handleValidateAdapter actually mints a Graph token via MSAL
  and classifies AADSTS / unauthorized_client / invalid_client errors as
  credential failures; non-auth probe failures (403 consent, network)
  soft-accept with a pointer to Channel.ReadBasic.All admin consent.
- ConnectionWizard validates App ID / Tenant ID against the AAD GUID
  shape client-side, renders inline errors, and gates the Validate
  button until they pass.
- Teams is no longer treated as webhook-only: the Channels step renders
  the real Graph-enumerated channel list (depends on #206).

Verified live: a real Teams connection was created through this exact
flow today (channel discovery + message history sync working).

Constraint: validation must not require the messaging endpoint to be live yet
Rejected: server-side-only GUID validation | users deserve inline feedback before a round-trip
Confidence: high
Scope-risk: narrow
Not-tested: MultiTenant validation path against a real multi-org token

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(teams-app): point manifest at the real bot app id + align package filename

The loose manifest.json referenced botId eefc03cb-… which does not exist
in any Azure tenant (stale id from an early registration attempt), and
was missing webApplicationInfo + RSC permissions — a package built from
it could install but never read channel history. The actual working
package (built via teams CLI) used the correct id but generic
"Developer/example.com" branding.

Merge the two: correct id (fb24e83f-…), manifest schema 1.25, RSC perms
(ChannelMessage.Read.Group, ChatMessage.Read.Chat), personal tabs, and
Beever AI branding. Bump to 1.0.3 (dev portal auto-bumped to the same).

build-package.mjs now writes beever-atlas-teams.zip (the name actually
used/uploaded everywhere) instead of beever-atlas-bot.zip; .gitignore
updated to match so the build artifact stays untracked.

Constraint: dev-portal catalog already at version 1.0.2 — local manifest must be ≥1.0.3
Rejected: tracking the built zip in git | reproducible artifact, build script exists for that
Confidence: high
Scope-risk: narrow
Directive: botId must equal the AAD app id fb24e83f-… — never regenerate it independently

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
alan5543 added a commit that referenced this pull request Jun 1, 2026
Two CI fixes surfaced by the first full-suite run on this branch (it only
got smoke tests while stacked on #206):

- ruff format: connections.py had a 3-line wrap ruff collapses to 1
- CodeQL js/tainted-format-string (HIGH): the persist-failure warn passed
  an interpolated template string as console.warn's first arg alongside a
  second arg — Node treats arg[0] as a printf format string when more args
  follow, so a connectionId containing %s/%j would hijack substitution.
  Switched to the constant-format-string + %s args pattern already used
  elsewhere in this file (e.g. connection route error logging).
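
A minimal sketch of the constant-format-string pattern. `warnPersistFailure`, the injectable `warn` parameter, and the literal text are hypothetical; the point is only that the first argument is a fixed literal and every dynamic value travels as a later argument.

```typescript
type WarnFn = (format: string, ...args: unknown[]) => void;

// Hypothetical helper; the format literal is illustrative.
function warnPersistFailure(
  connectionId: string,
  reason: string,
  warn: WarnFn = console.warn,
): void {
  // Risky form: warn(`persist failed for ${connectionId}`, reason)
  // A "%s" inside connectionId would then consume `reason` as its substitution.
  warn("[teams] persist failed for %s: %s", connectionId, reason);
}
```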

Confidence: high
Scope-risk: narrow

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
alan5543 added a commit that referenced this pull request Jun 1, 2026
…rms (#211)

* feat(teams): persist aadGroupId to Mongo for parity with other platforms

Other platforms (Slack/Discord/Mattermost) bootstrap channel listings
from a bot token stored in Mongo — identity survives bot restart and
Redis loss without an inbound webhook. Teams was the outlier: there is
no app-only Microsoft Graph endpoint to enumerate "teams this app is
installed in", so the team's AAD group id was observed from Bot
Framework activities and cached ONLY in the chat-adapter's Redis
(`chat-sdk:cache:teams:channelContext:*`). A Redis container restart or
30-day cache TTL erased that identity and the Teams workspace vanished
from the sidebar until the next inbound webhook reseeded it.

This change makes Teams self-bootstrap exactly like the others:

  1. Pydantic model: new `teams_known_team_ids: list[str]` on
     `PlatformConnection` (default `[]`). No DB migration — legacy docs
     decode through the default.

  2. Store: `add_teams_known_team_id` upserts via `$addToSet` so
     concurrent writes from multiple webhook deliveries can't double-
     insert without holding a lock.

  3. Internal API: the existing
     `GET /api/internal/connections/credentials` now returns
     `teams_known_team_ids` so the bot's startup loader sees them. A new
     `POST /api/internal/connections/{id}/teams-known-team-ids` accepts
     a validated AAD group GUID and persists it. Both behind the
     existing `require_bridge` gate.

  4. Bot bridge: `recordTeamsConversation` now fires a fire-and-forget
     POST whenever a NEW aadGroupId arrives for a connection (dedup
     in-memory before queueing). Backend dedups again via `$addToSet`.

  5. Bot startup: `syncConnectionsFromBackend` calls a new exported
     `seedTeamsKnownTeamIds(connId, ids)` after registering each Teams
     adapter, hydrating the in-memory `teamsKnownTeamIds` Map straight
     from Mongo. `seedTeamsKnownTeamIds` also flips
     `teamsColdStartScanned` for that connection so the Redis scan path
     can't race the hydrated state.
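
Steps 4 and 5 can be sketched together. The Map and flag names follow the text above; modelling `teamsColdStartScanned` as a Set, the `recordTeam` name, the injected `persist` callback, and leaving the flag unset on an empty seed are all assumptions for illustration.

```typescript
const GUID_SHAPE = /^[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}$/i;

const teamsKnownTeamIds = new Map<string, Set<string>>();
const teamsColdStartScanned = new Set<string>(); // assumed to be a Set keyed by connection id

// Startup: hydrate the in-memory state from ids persisted in Mongo.
function seedTeamsKnownTeamIds(connectionId: string, ids: string[]): void {
  if (ids.length === 0) return; // nothing persisted: leave the Redis scan enabled (assumption)
  const known = teamsKnownTeamIds.get(connectionId) ?? new Set<string>();
  for (const id of ids) known.add(id);
  teamsKnownTeamIds.set(connectionId, known);
  teamsColdStartScanned.add(connectionId); // hydrated state wins over the Redis scan
}

type Persist = (connectionId: string, teamId: string) => Promise<void>;

// Webhook path: returns true only when a new, well-formed id was queued for persistence.
function recordTeam(connectionId: string, aadGroupId: string | undefined, persist: Persist): boolean {
  if (!aadGroupId || !GUID_SHAPE.test(aadGroupId)) return false;
  const known = teamsKnownTeamIds.get(connectionId) ?? new Set<string>();
  teamsKnownTeamIds.set(connectionId, known);
  if (known.has(aadGroupId)) return false; // in-memory dedup
  known.add(aadGroupId);
  try {
    // Fire-and-forget: not awaited, failures swallowed so webhook handling is never blocked.
    void persist(connectionId, aadGroupId).catch(() => undefined);
  } catch {
    // A synchronous throw from persist is swallowed as well.
  }
  return true;
}
```

The backend would still dedup with `$addToSet`, so a repeated POST after a bot restart stays harmless.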

Result: a `docker compose down -v && up -d` (or any Redis/bot restart)
returns the Teams workspace + channels to the sidebar on first
listChannels — no webhook required. Existing connections benefit on
their FIRST inbound activity after deploy: the write-through fires,
Mongo gets the team-id, every subsequent restart hydrates cleanly.

Verified:
- 9/9 new bot unit tests (bridge.teams-persistence.test.ts): seed
  hydrates / no-op on empty / per-connection scoped / idempotent;
  write-through POSTs once / dedups same id / re-fires different id /
  rejects malformed GUID / no-op when aadGroupId absent.
- 88/88 bot bridge tests pass (was 79).
- 14/14 platform_store tests pass: default `[]`, round-trip preserves
  ids, legacy docs decode, `$addToSet` operator + `updated_at` touch
  asserted, returns None on missing connection.
- bot lint: 0 errors. tsc --noEmit clean.
- Python ruff clean on all three changed files.

Constraint: Bot Framework hands the team identity via webhooks only;
  no app-only Graph endpoint for "list installed teams" exists.
Constraint: The fire-and-forget POST must not block webhook
  processing — 5s timeout + caught errors + dedup.
Rejected: Reading the chat-adapter Redis cache as primary source |
  cache TTL (30d) and restart-fragility was the original bug.
Rejected: Migrating PlatformConnection schema | Pydantic default `[]`
  covers legacy rows, no down-revision needed.
Confidence: high
Scope-risk: narrow — additive (model field, one store method, one new
  endpoint, one bridge export, one startup seed call).
Directive: `seedTeamsKnownTeamIds` MUST also flip
  `teamsColdStartScanned[connectionId]` so a stale Redis cache can't
  race the hydrated Map state. Do not separate these without re-
  testing the restart drill.
Not-tested: Live `docker compose down -v && up -d` drill — requires
  rebuilt bot + backend images on a live stack; deferred to merge
  validation by the operator. The unit test surface covers the
  helpers; the smoke test plan is in the PR body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(teams): cold-start scan also writes through to Mongo

The cold-start Redis scan in `resolveTeamIds` pre-populates the
in-memory `teamsKnownTeamIds` Map when an existing connection has
channelContext entries cached from a prior run. After that, every
subsequent webhook short-circuits the write-through in
`recordTeamsConversation` (because its dedup checks the in-memory
state), so an existing connection upgraded to this PR would never
seed `teams_known_team_ids` in Mongo and bot-restart-survives-Redis-
wipe never engages.

Fix: when the cold-start scan adds a NEW id to the Map, also fire the
fire-and-forget write-through. Backend `$addToSet` keeps it
idempotent. The new-connection path is unchanged — first webhook
still seeds Mongo via the existing `recordTeamsConversation` branch.

Verified:
- 88/88 bridge tests still pass (no behavior change for new connections).
- tsc --noEmit clean.

Confidence: high
Scope-risk: narrow — single block inside the existing scan loop.
Directive: keep the in-memory `teamSet.add(tId)` AFTER the `wasNew`
  check; reversing the order silently never persists.
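
The ordering rule in the directive, shown as a minimal sketch. `mergeScannedTeamIds` and the synchronous `persist` callback are hypothetical; `teamSet`, `tId`, and `wasNew` are the names used above.

```typescript
// Hypothetical scan-merge helper illustrating the required order of operations.
function mergeScannedTeamIds(
  teamSet: Set<string>,
  scanned: string[],
  persist: (teamId: string) => void,
): void {
  for (const tId of scanned) {
    const wasNew = !teamSet.has(tId); // must be read before the add below
    teamSet.add(tId);
    if (wasNew) persist(tId);
  }
}
```

If the `add` ran first, `wasNew` would always be false and nothing would ever be written through.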

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): ruff format connections.py + constant format string for CodeQL

Two CI fixes surfaced by the first full-suite run on this branch (it only
got smoke tests while stacked on #206):

- ruff format: connections.py had a 3-line wrap ruff collapses to 1
- CodeQL js/tainted-format-string (HIGH): the persist-failure warn passed
  an interpolated template string as console.warn's first arg alongside a
  second arg — Node treats arg[0] as a printf format string when more args
  follow, so a connectionId containing %s/%j would hijack substitution.
  Switched to the constant-format-string + %s args pattern already used
  elsewhere in this file (e.g. connection route error logging).

Confidence: high
Scope-risk: narrow

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>