Skip to content

feat(channels): add Bitrix24 messenger channel integration#1057

Closed
tech-synity wants to merge 35 commits into
nextlevelbuilder:devfrom
tech-synity:feat/bitrix24-channel
Closed

feat(channels): add Bitrix24 messenger channel integration#1057
tech-synity wants to merge 35 commits into
nextlevelbuilder:devfrom
tech-synity:feat/bitrix24-channel

Conversation

@tech-synity
Copy link
Copy Markdown

Summary

Integrates Bitrix24 as a first-class messenger channel for GoClaw, on par with Telegram / Discord / Zalo / Feishu / WhatsApp. Built and verified end-to-end against a live Bitrix24 cloud portal (33 commits, 81 files, +13.6k LOC).

Bot registers as a Bitrix24 `imbot` via OAuth Local Application, receives chat events through webhook, and replies via Bitrix REST. Supports DM, group chats, CRM Deal/Contact entity chats, and Tasks chats. Per-user MCP credential isolation lets multi-user portals share a single bot without cross-user credential leakage.

Why

Bitrix24 is a heavily used CRM/messenger platform in the SYNITY partner network. Adding it as a native channel unlocks GoClaw deployment for those partners' agents (sales-ops, support, internal Q&A) without an external bridge. Tested live on `tamgiac.bitrix24.com` in DM, group, Deal-bound and Tasks-bound chats over multiple weeks of dogfooding (full runbook + retro at `plans/bitrix24-mcp-refactor/reports/`).

Scope

  • New channel package: `internal/channels/bitrix24/` — channel lifecycle, webhook router, dedup, register, send, MCP provisioning
  • New store: `bitrix_portals` (PostgreSQL + SQLite, AES-GCM encrypted credentials/state)
  • New migration: `000058_bitrix_portals` (`RequiredSchemaVersion` 57 → 58, SQLite `SchemaVersion` 26 → 27)
  • New CLI: `bitrix-portal create | list` for tenant admins to provision portals
  • Agent layer: `resolveActorUserID` helper — uses webhook `SenderID` for per-user MCP credential lookup in group chats (channel-agnostic; future Telegram/Slack per-user OAuth gets it for free)
  • Gateway: forwards `CHAT_ENTITY_TYPE` + `CHAT_ENTITY_ID` into bus metadata; agent system prompt receives Bitrix entity binding hint when present
  • Markdown → BBCode renderer for replies; `MENTIONED_LIST` parsing with `MESSAGE_ORIGINAL` fallback for old portals
  • Web UI: bot type + MCP config fields exposed for bitrix24 channel
  • Documentation: setup runbook, retrospective, MCP integration plan in `plans/bitrix24-mcp-refactor/`

Architecture

  • Portal vs Channel: `bitrix_portals` row (per-tenant OAuth cred + token state) is referenced by N `channel_instances` rows (each = 1 imbot in 1 portal). Bootstrapped lazily — gateway fail-soft logs and continues if portal table not yet migrated.
  • Tenant isolation: MCP server lookups scoped by `store.WithTenantID(ctx, c.TenantID())` (matches Discord/Zalo/Feishu pattern).
  • Per-user MCP creds: `SenderID` (Bitrix `FROM_USER_ID`) drives credential routing in group chats; `UserID` fallback for DM / synthetic senders. No cross-user leakage.
  • Webhook signature: verified per Bitrix24 imbot REST contract; `app_token` bootstrapped from first `ONAPPINSTALL` event.
  • Idempotent register: persisted bot id; `BITRIX24_FORCE_REREGISTER=1` env bypasses cache when `public_url` rotates (tunnel / new host).

Test plan

  • `go build ./...` (PG) — passes (`cmd/pkg-helper` Linux-only break is pre-existing on Windows host, unrelated)
  • `go build -tags sqliteonly ./...` — passes
  • `go vet ./internal/channels/bitrix24/...` — clean
  • Unit tests: `go test ./internal/channels/bitrix24/...` — 6 new tests + existing pass (~2s)
    • `TestResolveActorUserID` — group/DM/synthetic sender routing
    • `TestBuildAddressMention` — BBCode mention rendering, self-mention guard
    • `TestParseEvent_*_ChatEntity` — entity metadata extraction
    • `TestHandleMessage_ChatEntityForwardedAsMetadata` — bus metadata propagation
    • `TestIsGroupMessageType` — CHAT_TYPE=X (Tasks) classified as group
  • Live verification on `tamgiac.bitrix24.com`:
    • DM with bot in Messenger
    • Group chat @-mention
    • CRM Deal-bound chat
    • Tasks-bound chat (`CHAT_TYPE=X`)
    • Multi-user group chat with per-user MCP credentials
    • Public URL rotation (`BITRIX24_FORCE_REREGISTER=1` round-trip)
  • Maintainer to run integration tests on CI (require pgvector pg18) — local skipped (no Docker DB on dev host)

Migration coordination

The migration `000058_bitrix_portals` is renumbered from local `000056` after rebasing onto current `origin/dev` (which now has `000056_vault_chat_id` and `000057_heartbeat_provider_fk_set_null`). `RequiredSchemaVersion` bumped 57 → 58. `bootstrapPortals` swallows "relation does not exist" with warn-log, so gateways without the migration applied still boot cleanly — Bitrix24 channels just won't function until `migrate up` runs.

Known follow-ups (non-blocking)

Spotted during pre-PR self-review; happy to address in this PR if requested, otherwise will land as a follow-up:

  1. Prompt-injection hardening on `bitrix_chat_entity_type/id` before system-prompt interpolation (`cmd/gateway_consumer_normal.go:292-300`). Threat is low (signed webhook → trusted portal) but cheap defense-in-depth: cap length + reject newlines.
  2. Numeric-ID validation in `buildAddressMention` (`internal/channels/bitrix24/send.go:166`) — `strconv.Atoi(userID)` to harden against future BBCode-injection paths.
  3. Comment on `internal/channels/bitrix24/register.go:54` to document that `BITRIX24_FORCE_REREGISTER` is read per-`Start()` (intentional — operator can rotate without process restart, unlike `BITRIX24_LOG_RAW_EVENT` which is package-init).

Security notes

  • All portal credentials and OAuth state encrypted at rest via `internal/crypto/aes.go` (AES-256-GCM); empty key falls back to plaintext with warn-log per existing crypto contract.
  • Webhook signature verified before any side effect.
  • No PII / message bodies in logs; structured `slog` with channel/portal/bot_id only.
  • Tenant scope enforced on all DB writes (`tenant_id` FK, `UNIQUE (tenant_id, name)`).

Documentation

  • Setup runbook: `plans/bitrix24-mcp-refactor/reports/setup-runbook.md`
  • Retrospective: `plans/bitrix24-mcp-refactor/reports/retrospective.md`
  • Rev5 MCP integration plan: `plan/goclaw-mcp-integration.md`
  • Final PR review: `plans/bitrix24-mcp-refactor/reports/pr-review/10-FINAL.md`

tech-synity and others added 30 commits April 28, 2026 16:32
Ships the native Bitrix24 channel — OAuth2 portal lifecycle, shared
webhook router (/bitrix24/install + /bitrix24/events), imbot.register
with idempotent recovery, message send with chunking + rate-limit retry,
mention handling in group chats, and per-portal/per-bot metadata on the
bitrix_portals row.

Wiring:
- cmd/gateway.go: factory register + BootstrapPortals warms router on start
- internal/store/*: BitrixPortalStore interface + PG/SQLite impls
- migrations/000056 + SQLite schema v25: bitrix_portals table
- internal/upgrade/version.go: RequiredSchemaVersion 55 -> 56

Security / hardening applied during code review:
- /bitrix24/install body capped via http.MaxBytesReader (DoS pre-auth)
- /bitrix24/events body capped; application_token verified constant-time
- fetchAvatarBase64 rechecks scheme on every redirect (SSRF defense)
- Portal refresh loop ctx detached with context.WithoutCancel so a
  request-scoped Start ctx cannot silently kill the token refresh
- Mention regex rewritten to (?s).*? so nested BBCode in display names
  strips correctly instead of leaking raw tags into the agent prompt
- chunkText boundary accepts offset 0 (invariant belongs here, not caller)
- dispatcher goroutine ctx detached so in-flight DB/LLM calls survive
  past handler return

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- ONIMBOTDELETE path: document that the second MarkStopped call
  intentionally overrides the generic Stopped summary set by Stop
  with an operator-visible reason. Previous comment read as though
  the line were redundant.
- stripMention: remove redundant TrimSpace of the regex replacement
  result. Trim responsibility now lives with the caller
  (handleMessage), which already TrimSpaces uniformly for both DM
  and group paths. No behavioural change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The bare pattern `pkg-helper` on line 6 matched any file or directory
named pkg-helper anywhere in the tree — including the tracked source
directory `cmd/pkg-helper/`. Git tolerated this because already-tracked
files override ignore rules, but uploaders that apply .gitignore patterns
naively to the filesystem (notably `railway up`) stripped the source
directory from the build context, breaking `go build ./cmd/pkg-helper`
in Docker with `stat /src/cmd/pkg-helper: directory not found`.

Anchor both binary-artifact patterns (`/openclaw-go`, `/pkg-helper`) to
the repo root so they only match the compiled output, not the source.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previously, builds without the `embedui` tag left "/" unhandled,
so http.ServeMux returned a bare 404 when an operator opened the
deployed URL in a browser. That's needlessly opaque — it looks
like the service is broken even though the gateway is healthy.

Register a small JSON index handler for "/" in the no-UI branch.
It returns service status + protocol version + a list of real
endpoints for an exact "/" match, and a JSON 404 for any other
unmatched path (http.ServeMux routes "/" as a catch-all).

Embedui builds are unchanged — the webui.Handler() still owns "/"
and takes precedence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The admin UI hid Bitrix24 from the "Add Channel" dropdown even though
the backend factory (`internal/channels/bitrix24`) was wired into
gateway startup. Operators had no way to create a Bitrix24 channel
instance from the UI, so Phase 03 was effectively dark.

- CHANNEL_TYPES: prepend `{ value: "bitrix24", label: "Bitrix24" }`.
- credentialsSchema["bitrix24"] = []. Bitrix24 has no per-bot secrets
  — portal-level OAuth lives on `bitrix_portals` and is authorized
  once via `/bitrix24/install`.
- configSchema["bitrix24"]: portal / bot_code / bot_name (required),
  public_url (required — imbot.register demands absolute callback
  URLs), plus dm_policy, group_policy, require_mention,
  reaction_level, text_chunk_limit, media_max_mb, allow_from,
  group_allow_from, history_limit, streaming, block_reply. Defaults
  match `applyConfigDefaults` in the Go factory so first-time saves
  don't drift from the backend's zero-config behavior.

i18n: no new fieldConfig keys added — ChannelFields falls back to the
inline `label`/`help` via `t(..., { defaultValue })` so English
strings render without catalog entries. Translations can land later
without blocking the feature.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 03 wired the Bitrix24 OAuth install handler at `/bitrix24/install`
but shipped no RPC/UI path for creating the `bitrix_portals` row that
handler looks up. Operators currently have to shell into Postgres to
register a portal, which blocks the UI-driven setup flow that the
channel picker now offers.

This adds a direct-DB CLI command mirroring the `cmd/migrate.go`
pattern (resolveDSN → sql.Open(pgx, dsn)):

- `bitrix-portal create --tenant-id --name --domain --client-id
  --client-secret` writes one row via `PGBitrixPortalStore.Create`, so
  credentials are AES-256-GCM encrypted under `GOCLAW_ENCRYPTION_KEY`
  just like runtime writes. When the key is unset we log a visible
  WARNING — the pg store falls back to plaintext and we do not want
  operators to discover that silently.

- `bitrix-portal list [--tenant-id]` prints id/tenant/name/domain.
  Credentials are deliberately never printed; rows whose creds failed
  to decrypt surface as `(creds:empty)` so a corrupt row is obvious.

- `normalizeBitrixDomain` strips scheme + trailing slash so
  `https://tamgiac.bitrix24.com/` and `tamgiac.bitrix24.com` both land
  as the bare host Bitrix24's OAuth callback compares against.

After `create`, the printed hint guides the operator to the install
URL: `https://<public_url>/bitrix24/install?state=<tenant>:<name>`,
which populates the row's `state` column with the OAuth token.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…cake parity)

Creating a Bitrix24 channel_instance failed with "invalid channel_type"
even though Phase 03 wired the factory and the UI added Bitrix24 to the
picker. The write path hits two parallel whitelist switches — one for
WS RPC in internal/gateway/methods and one for HTTP in internal/http —
and both missed bitrix24. The WS switch also silently missed facebook
and pancake, so anything driven through the WS API was rejecting
channels the HTTP API already accepted.

- Add bitrix24 to both switches.
- Add facebook and pancake to the WS switch so the two functions stop
  drifting. CHANNEL_TYPES in the UI constants file was already offering
  both.
- Annotate each switch with a keep-in-sync comment pointing at its twin
  and at ui/web/src/constants/channels.ts. The previous drift was silent
  (untyped string list across three separate files); the comment at
  least makes the next channel addition noisy enough to catch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 03 only handled the OAuth2 marketplace app install flow (where
Bitrix24 redirects to /bitrix24/install with ?code=...&state=... and the
app calls oauth.bitrix.info/oauth/token to trade the code for tokens).
A Local Application registered via Developer resources > Local
application uses a different handoff: Bitrix24 POSTs AUTH_ID,
REFRESH_ID, AUTH_EXPIRES, DOMAIN, member_id, and application_token
directly into the handler URL, with no code to exchange — the tokens
are already minted.

Without this, admins using the Local App path (the easiest way to
test without publishing to Bitrix24.Market) hit `400 missing code or
state` and could never finish install, leaving the portal row with
an empty state column and imbot.register failing on first channel
start.

- webhook.go handleInstall now sniffs the form body: AUTH_ID +
  REFRESH_ID present -> Local App branch; code + state present ->
  OAuth branch. Either missing gets a 400 with a message naming both
  field sets so the operator sees which flow the payload matched.
- handleInstallLocalApp resolves the portal by DOMAIN (Local App has
  no state param to carry tenant:name). Builds a TokenResponse from
  form fields and hands it to Portal.InstallFromTokens.
- portal.go InstallFromTokens mirrors Exchange but skips the
  ExchangeAuthCode HTTP call. Rejects a half-install (either token
  missing) up front so we never persist a wedged row. Detaches the
  context for persistence for the same reason Exchange does — once
  Bitrix has handed us tokens, a canceled handler context must not
  prevent writing them to disk.
- webhook.go strconv import for AUTH_EXPIRES parsing (falls through to
  Portal.applyTokenResponse's defaultTokenTTL clamp on unparseable).

Handler path config in the Local App UI still needs to be split:
`Your handler path` -> /bitrix24/events (event callbacks after install)
`Initial installation path` -> /bitrix24/install (install handshake)
Before this commit, pointing both at /bitrix24/install would have
failed on the install POST anyway; now the install succeeds but
events need the /events route to reach handleEvent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Both happy paths were silent, so when operators asked "did the reinstall
POST arrive?" or "did Bitrix POST anything to /bitrix24/events when I
chatted?" the logs could not answer. This hurt diagnostics badly: a
mis-configured handler URL on the Bitrix side is indistinguishable from
a working one until somebody checks the Bitrix app admin.

- handleInstallLocalApp: emit INFO after persistState succeeds, with
  tenant/portal/domain/member_id/expires_in. This confirms the Local App
  install flow ran end-to-end (first-install and reinstall both).
- handleEvent: emit INFO at entry with remote addr, Content-Length, and
  User-Agent — even before ParseEvent, so malformed bodies still show.
  Tagged with a note that this can drop to Debug once wire-up is stable.

No behavior change. Both lines are slog.Info, no PII beyond member_id
(already persisted in portal state) and the remote IP (Bitrix24's).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Root cause of "handler URL bound but no events ever POST": after a
fresh imbot.register the EVENT_MESSAGE_ADD / ONIMBOTJOINCHAT / ONIMBOTDELETE
URLs showed up correctly in event.get, yet chatting the bot produced no
bitrix24 event: inbound log line. Proved via probe:

  app.info -> { "INSTALLED": false, ... }

Bitrix24 silently suppresses imbot events while INSTALLED=false, even
though the handler URLs are otherwise set up. The signal to flip INSTALLED
to true is `BX24.installFinish()` called from the install handler's HTML
response — without it the app is stuck in "install in progress" forever.

Our installSuccessHTML was pure static HTML: a title, a paragraph, and a
window.close() timeout. No BX24 SDK, no installFinish. Both install flows
served it (OAuth and Local App), so both flows left INSTALLED=false.

- Load //api.bitrix24.com/api/v1/ so the BX24 JS SDK is available in the
  install popup iframe.
- In the body script, BX24.init(() => BX24.installFinish()). Wrap in
  try/catch so a direct-URL smoke test (outside the Bitrix iframe) still
  falls through to window.close() instead of leaving the tab orphaned.
- Long comment on the constant explaining why this matters — the next
  engineer to edit this HTML must not strip the installFinish call.

Scheme-relative src is intentional — the install popup runs inside the
Bitrix24 iframe which is always https, so //api.bitrix24.com inherits the
correct protocol without hard-coding it.

After this ships, a reinstall (click REINSTALL in Local App admin) will
mark INSTALLED=true and Bitrix starts delivering events.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Working tree kept surfacing `goclaw.exe` / `goclaw-local.exe` (Windows
dev builds) and ad-hoc `tmp-reset-bot/` probe scripts as untracked.
A single `git add .` would commit them — both are useless outside the
author's machine and the debug probes import `internal/` packages from
outside `cmd/` (ugly but legal), so they should never ride into Git.

Add to .gitignore (anchored like the existing `/openclaw-go` entry, so
matching doesn't accidentally touch source dirs — same concern that
broke Docker builds in 2de99e26):

  /goclaw.exe
  /goclaw-local.exe
  /tmp-reset-bot/
  /tmp-probe*/

Also add .railwayignore file — new; Railway copies the working tree
verbatim and already ignored tests/node_modules/etc. Add `tmp-*` glob
so ad-hoc probe dirs don't bloat image layers on `railway up`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Routine dependency refresh. v2.12 pulls in a new indirect dep
(git.sr.ht/~jackmordaunt/go-toast/v2 v2.0.3 — used by Wails'
notification path) and otherwise contains only maintenance fixes
from upstream.

Verified: `go build -tags sqliteonly ./internal/... ./cmd/` green —
desktop edition's Wails bindings still compile. No goclaw code calls
the notification API directly, so the new indirect is latent unless
Wails decides to pull it into the main path later.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… in strict mode

OpenAI strict mode rejected `use_skill` calls with:
  invalid_function_parameters: 'additionalProperties' is required to be
  supplied and to be false
  at path ('properties', 'params', 'type', '0')

Root cause: applyStrictMode early-returned when a node was
`{"type":"object"}` with no nested `properties` key, leaving
additionalProperties unset. Then makeNullable widened the type to
`["object","null"]`, and the strict validator checks each branch
independently — the "object" variant was missing additionalProperties
so the whole schema failed validation.

Fix: when typ=="object", set additionalProperties:false even if the
node has no inner properties. This covers the common "bag of
free-form params" shape tool authors write as:

  params: { type: "object", description: "..." }

Trade-off worth documenting: strict mode + bare object ⇒ OpenAI will
reject any call that actually populates those params. If a tool truly
needs a free-form dict of arbitrary keys, strict mode is the wrong
fit — either declare the known keys under `properties`, or opt that
tool out of strict. Left a comment on the branch spelling this out.

Regression test added: TestApplyStrictMode_BareObjectProperty asserts
both `additionalProperties:false` AND the `["object","null"]` type
union survive the transform.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bitrix24 chat (imbot.message.add) renders a restricted BBCode subset,
not Markdown. LLM replies emitted `**bold**`, `*italic*`, `[link](url)`,
triple-backtick fences, etc. — all of which showed up as literal
characters in the Bitrix bubble, making bot replies look unpolished
and hard to scan.

- format.go: new markdownToBitrixBBCode + helpers covering:
  * bold (**x** __x__ <b> <strong> → [b]x[/b])
  * italic (*x* _x_ <i> <em> → [i]x[/i]) with intra-word underscore guard
  * strikethrough (~~x~~ <s> <del> → [s]x[/s])
  * inline code (`x` <code> → [code]x[/code])
  * fenced code (```lang\n…``` → [code]\n…\n[/code], lang hint dropped)
  * links ([t](url) <a href="url">t</a> → [url=url]t[/url])
  * headers (#, ##, … → [b]…[/b], Bitrix has no header tag)
  * lists (-, *, + → • ; ordered 1. passes through)
  * blockquote (> x → [quote]x[/quote]) grouping consecutive lines
  * tables (GitHub-style | h | sep | body → [code] monospace block)
  * horizontal rule (---, ***, ___ → unicode ──── divider)

  Placeholder scheme uses \x00…\x00 framing so nested extraction
  (fenced > table > inline) can't collide. Input NULs are stripped on
  entry (would otherwise corrupt restoration).

- format_test.go: 25 table-driven cases covering each transform plus
  3 regression cases (italic adjacent pairs needing stability loop,
  single-line fenced losing `code` as phantom lang hint, NUL-input
  collision with placeholder framing), plus a mixed-feature Vietnamese
  smoke test (#TESTS).

- send.go: call markdownToBitrixBBCode on `msg.Text` before chunking.
  Conversion BEFORE chunking so chunker operates on final wire shape —
  splitting half-converted Markdown would leak a lone `**` to the
  client. Caveat: chunker is still tag-agnostic, so a BBCode pair
  straddling the 4000-rune boundary can still split across chunks.
  Documented in the callsite; fix when it becomes visible in practice.

- send.go: move client/botID re-check INSIDE sendChunk instead of
  holding the result of Send's liveness check across the chunk loop.
  Shutdown during a multi-chunk reply now surfaces as a clear
  "channel lost during send" error on the chunk that raced, rather
  than silently calling a nil client.

Idempotent on already-BBCode input — the conversion regexes key off
Markdown markers that don't appear in [b]/[i]/[code]/[url=…] syntax.
Locked by TestMarkdownToBitrixBBCode_IdempotentOnBBCode.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bitrix24 Local Application install sends AUTH_ID / REFRESH_ID /
member_id / DOMAIN but OMITS application_token. That token first
appears in the event stream (ONAPPINSTALL / ONIMBOTMESSAGEADD / …).
Before this change, every event was rejected with "portal not
installed" because `want := portal.AppToken()` was empty, and the
only recovery was a manual DB patch.

- portal.BootstrapAppToken: accept event's app_token ONLY when all of
  (a) no stored app_token yet (never overwrite a good value),
  (b) stored MemberID is non-empty AND matches the event's member_id.
  Idempotent on re-call. Refuses to seed MemberID from the event body
  — if install didn't populate it, something upstream is wrong and
  we must not let the first (potentially spoofed) event pin the
  portal's identity. Legacy rows without MemberID require reinstall.

- handleEvent: when portal.AppToken() is empty and the event carries
  a non-empty app_token, call BootstrapAppToken; on success fall
  through to the normal secureEqual() check. Failure is logged as
  security.bitrix24_apptoken_bootstrap_failed and returns 401.
  Existing classic path (no stored + no event → 401) preserved.

- router_test.go: 3 new tests covering the TOFU trust boundary:
  * BootstrapAppToken_200 — happy path, second event auths normally.
  * BootstrapAppToken_MemberIDMismatch_401 — spoofed event with wrong
    member_id must not poison the stored token.
  * BootstrapAppToken_NoStoredMemberID_401 — MemberID-less legacy row
    refuses auto-heal; reinstall is the only safe recovery.
  Existing PortalNotInstalled_401 updated to pass empty event token
  so it keeps testing the no-seed-material path.

Opt-in raw event dump:
- webhook.go: BITRIX24_LOG_RAW_EVENT=1 env at process start arms a
  sorted, credential-redacted dump of every parsed form body. Off by
  default — the dump leaks user message text and is noisy — but
  indispensable for one-shot captures when debugging new Bitrix24
  payload shapes.
- isRedactedEventKey matches on the LEAF key name (stripping bracket
  path), so the SAME secret duplicated under `auth[access_token]`,
  `data[BOT][id][access_token]`, and `data[BOT][id][AUTH][access_token]`
  is caught at every nesting. Covers access_token / refresh_token /
  application_token / client_id / client_secret / AUTH_ID / REFRESH_ID.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reverts the custom bitrix_mcp_user_mapping store + ensureMCPCredentials
gatekeeper added in c675dd6a/e5f2fe34. Audit revealed the integration
duplicates partner-owned MCP infra: MCPServerStore.GetUserCredentials
already covers per-user credentials, GrantChecker.IsAllowed already gates
access with cache + event-bus invalidation, and ContactCollector already
handles channel-to-canonical identity resolution.

Phase A scope — strip everything that competes with existing infra:
  - Drop bitrix_mcp_user_mapping table + PG/SQLite migrations
  - Drop store.BitrixMappingStore interface + impls
  - Drop ensureMCPCredentials + 4-arg FactoryWithPortalStore signature
  - Drop mcp/mapping store fields on Channel

Keep two pieces that stand on their own:
  - isGroupMessageType() fixes MESSAGE_TYPE parsing (Bitrix webhooks use
    short codes P/C/O, original code only matched legacy "chat")
  - mcp_client.go (HTTP auto-onboard client) — unwired for now, reused
    in Phase C as the bitrix24-local lazy provisioner once Phase B
    wires the channel into GrantChecker + MCPServerStore properly

See plans/bitrix24-mcp-refactor/README.md for full phased plan.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ector

Call ContactCollector.EnsureContact inside handleMessage after policy
gating, matching the Telegram/Zalo pattern. This plugs Bitrix24 into
the partner's existing contact table so downstream layers (agent loop,
MCPUserCredentials lookup, admin UI) resolve senderID → userID through
the same path every other channel uses. No parallel mapping layer.

- DM: 1 upsert (sender, peer=direct, ctype=user)
- Group: 2 upserts (sender as peer=group/ctype=user, chat as
  peer=group/ctype=group)
- System messages & empty content are filtered BEFORE the call, so
  blocked senders are never recorded.

Tests in handle_test.go use a fakeContactStore mock to assert call
count and argument shape for DM / group / blocked paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Before this change imbot.register was called with TYPE="B" hardcoded,
so goclaw could only register standard internal chatbots. Bitrix24
supports a second type "O" (Open Channel bot) for customer-facing
widgets, which we need to support for external-customer deployments.

Changes:
- bitrixInstanceConfig: new BotType field with full doc block covering
  the operational differences (audience, policy recommendations, MCP
  implications).
- applyConfigDefaults: empty BotType → "B" to preserve existing
  deployments' behavior (matches Bitrix24's imbot.register default).
- Factory: validate BotType ∈ {"B", "O"} AFTER defaults. Unknown values
  (including lowercase / whitespace-only) are rejected at load, not at
  runtime — prevents silent no-op bots when admin typos the config.
- registerParams: forward cfg.BotType to TYPE verbatim.
- Channel.IsOpenChannelBot(): helper for downstream code (Phase C
  provisioner will use this to skip per-user MCP credential minting
  for transient Open Channel customers).

No policy auto-switching based on BotType — admin is responsible for
setting dm_policy=open when deploying an Open Channel bot, matching
the "1 channel = 1 fixed default" convention other channels follow
(Bitrix24 = pairing, Zalo personal = allowlist).

Tests cover: default = "B", explicit "B" / "O", rejection of unknown /
lowercase / whitespace-only values, IsOpenChannelBot() matching the
type, and registerParams forwarding the type to the API call body.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds per-user MCP credential minting on first message. Opt-in via instance
config (mcp_server_name + mcp_base_url) AND env GOCLAW_BITRIX_MCP_ADMIN_TOKEN.
Off by default — channels without the wiring operate identically to before.

Behavior:
- On first message from user U, channel checks mcp_user_credentials(server,U).
  Present → skip. Absent → POST {base_url}/api/auto-onboard with admin token
  + U's OAuth tokens, store returned api_key plus OAuth bundle in the Env
  map so MCP server can re-authenticate on subsequent tool calls.
- Open Channel bots (bot_type=O) are skipped entirely: transient customers
  don't map to tenant_users and shared-credential support lands in Phase E.
- Every failure mode is best-effort: logged via slog.Warn/Debug and swallowed.
  Agent loop downstream sees no creds → skips that MCP server's tools, which
  is strictly better UX than denying the user's message. Typed sentinel
  errors (ErrProvisionSkippedOpenChannel / ErrProvisionDisabled /
  ErrProvisionDebounced) let tests assert behavior without string matching.
- 60s per-(serverID,userID) debounce absorbs Bitrix24 webhook retry bursts
  (3–5 events/sec on transient 5xx) without hammering the MCP server.

Wiring:
- New FactoryWithPortalStoreAndMCP factory variant. Old FactoryWithPortalStore
  becomes a thin shim (nil mcpStore) so other call sites stay on the simpler
  2-arg signature. Gateway switches to the MCP-aware variant.
- Half-config (only one of mcp_server_name / mcp_base_url set) fails fast
  at factory load, preventing a silent "provisioning disabled but I meant
  to enable it" misconfiguration.
- Admin token stays in env instead of config — channel_instances.config is
  plaintext JSONB, and Phase D RFC will propose a per-tenant secret store.

Tests: 9 new tests in provisioner_test.go covering Open Channel skip,
disabled modes, existing-creds fast path, mint + persist happy path (including
EXPIRES_AT format), debounce, HTTP failure surfacing, missing-auth validation,
init disabled modes, and half-config factory rejection.

Build/vet/test all green on PG and SQLite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase C follow-up addressing two pieces of technical debt flagged during
the multi-tenant review:

1. Admin token resolution:
   - Primary source: mcp_servers.env["BITRIX_MCP_ADMIN_TOKEN"] (per-tenant,
     encrypted at rest via MCPServerStore). Same key the MCP server process
     receives at spawn, so admin sets the secret in one place.
   - Fallback: GOCLAW_BITRIX_MCP_ADMIN_TOKEN env var (legacy single-tenant
     deployments). Kept to avoid forcing a DB migration on upgrade.
   - Logged token_source on startup so operators can see when a deployment
     is still riding the legacy global env var (multi-tenant unsafe).
   - resolveMCPAdminToken extracted for unit testing.

2. User degradation notice:
   - When provisionIfMissing hits an unexpected failure, notifyUserOfMCPIssueOnce
     sends a one-shot Vietnamese DM via sendChunk so the user knows to
     contact admin instead of silently getting tool-less replies.
   - Per-user 5min debounce (separate map / mutex from the provisioning
     debounce) absorbs webhook retry storms and sustained MCP outages
     without duplicating the notice.
   - Channel health stays Green — degradation is surfaced to the user, not
     the health endpoint, matching operator preference for silent fallback.

Test coverage:
- resolveMCPAdminToken: 6 subcases covering both-set, per-server only,
  env only, both empty, whitespace guards, malformed JSON fallback.
- initMCPProvisioner: end-to-end per-server token flows through to the
  Authorization: Bearer header on auto-onboard.
- notifyUserOfMCPIssueOnce: first-call marks debounce, second call within
  TTL keeps original stamp (no refresh), expired TTL refreshes, empty/
  whitespace inputs are no-ops, multi-user isolation holds.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bitrix24 webhooks don't carry display_name / username, only a FromUserID
string, so the Contacts page rendered bitrix24 rows with empty Name and
Username columns. Add a lazy per-channel enrichment path: on first-sight
of each sender, call user.get once and cache the (name, username) pair
for 1h (5min for failures / unresolvable IDs so scope fixes recover
without a channel reload). Any RPC failure is logged at Debug and the
contact is still created with empty fields — the pre-enrichment path.

UI: add "bitrix24" to PERM_CHANNELS so the "Channel permissions for
contact names" panel documents the `user` scope requirement, and to
CHANNEL_TYPES so the filter dropdown surfaces the channel. Ship i18n
copy for en / vi / zh.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Backend has supported `bot_type` (B / O), `mcp_server_name`, and
`mcp_base_url` since the Phase C work landed (commits 655d24b6,
90e225b0), but the Create/Edit Channel Instance form didn't render
inputs for any of them — operators had to hand-edit the raw config
JSON. Add three schema entries:

- bot_type: select B (standard, default) / O (Open Channel). Backend
  validates the values at factory load; select keeps the options
  self-documenting.
- mcp_server_name + mcp_base_url: both marked `advanced` so they
  tuck under the Advanced panel. Help text calls out the "both or
  neither" coupling so users don't half-configure and hit the
  factory-level validation error.

Also trim `.railwayignore` to skip docs / changelog / readme assets
(_statics, _readmes, examples, CHANGELOG, CONTRIBUTING,
api-reference, websocket-protocol) — upload payload drops from
9.4 MB to 4.25 MB, which matters on slow uplinks where the full
tarball was timing out before reaching the Railway backboard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…auth (Path B)

The MCP server now authenticates /api/auto-onboard by verifying the
caller-supplied Bitrix access_token against `profile` — no shared admin
secret required. Removes:
- BITRIX_MCP_ADMIN_TOKEN env key lookup in mcp_servers.env
- GOCLAW_BITRIX_MCP_ADMIN_TOKEN process env var
- resolveMCPAdminToken two-source fallback
- Authorization: Bearer header on auto-onboard POST

Also removes dependent test cases:
- TestResolveMCPAdminToken (6 subtests)
- TestInitMCPProvisioner_UsesPerServerToken
- TestInitMCPProvisioner_DisabledModes/missing_admin_token subtest

This unblocks marketplace distribution where each portal installs the
app and runs its own GoClaw — no per-portal shared secret provisioning.
Multi-tenant isolation holds naturally because each user authenticates
with their own per-portal OAuth tokens.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the finalized integration between goclaw's custom Bitrix24 channel
and mcp-bx-syn, reflecting the shipped Path B design (access_token as auth
anchor, no shared ADMIN_TOKEN). Supersedes the earlier Rev4 draft: drops the
mapping-table proposal in favor of reusing the partner MCPServerStore, folds
in rate limiting + audit log + hourly re-verify, and aligns the contract
shape with the commits `ea09c1ba` (phase C provisioner) and `07b48ef0`
(admin-token removal).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Lets admins attach server-specific quirks (e.g. "no trailing semicolons
in code args") to MCP tool descriptions without modifying the upstream
MCP server or redeploying goclaw. Hints are stored in the existing
mcp_servers.settings JSONB (no migration) following the same pattern as
require_user_credentials.

Backend:
- Add mcp.ToolHints type + ParseToolHints() helper alongside requireUserCreds
- Add BridgeTool.WithHints(global, toolHint) setter that appends a suffix
  to Description() so the LLM sees hints as part of the tool schema
- Thread hints through connectServer / connectViaPool / loop_mcp_user
- 12 new unit tests for ParseToolHints + WithHints (+ edge cases)

Frontend:
- New "AI Agent Hints (Optional)" section in MCP form dialog: a global
  textarea + a key-value list for per-tool hints
- Extend KeyValueEditor with valueAs="textarea" prop (reusable)
- Zod schema + TS types + i18n (en/vi/zh) updated

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…GINAL

Group ONIMBOTMESSAGEADD events strip the @mention from MESSAGE before
sending the webhook, so isMentioned(MESSAGE) never matched and group
messages with require_mention=true silently dropped — typing indicator
fired (Bitrix sees the bot's IM channel is up) but no agent turn ran.

Bitrix24 ships two higher-authority mention sources on group payloads:

  - data[PARAMS][MENTIONED_LIST][<bot_id>]=<bot_id> — structured map,
    no regex / Unicode edge cases.
  - data[PARAMS][MESSAGE_ORIGINAL] — raw BBCode with the bot tag intact
    ([USER=<bot_id>]Name[/USER]).

Both absent on DMs, where MESSAGE retains the @mention naturally.

EventParams now carries both fields. parseFormEvent + parseJSONEvent
populate them. New isMentionedParams() walks the three sources in
authority order (list → original → stripped) and replaces the
single-source check at handleMessage:111. Existing isMentioned() and
stripMention() are unchanged so DMs and the BBCode strip step keep
behaving exactly as before.

Verified end-to-end against tamgiac.bitrix24.com:
  inbound (raw dump shows MENTIONED_LIST[940]=940)
    → scheduling message (peer_kind=group chat_id=chat4838)
    → agent turn → outbound published → bot reply visible in Bitrix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ONIMBOTMESSAGEADD events from Bitrix24 task chats arrive with
MESSAGE_TYPE=X (CHAT_ENTITY_TYPE=TASKS_TASK), but isGroupMessageType
only matched C/CHAT/O/OPEN. X fell through to direct, with two
consequences:

  1. The require-mention gate was bypassed (DM path doesn't enforce
     it), so the bot replied to every message in the chat instead of
     only when @mentioned.
  2. Routing built a session key of `direct:chatNN` and a user_id of
     the raw sender, mixing per-task conversation history into the
     user's DM context. peer_kind=group + user_id=group:<channel>:<chatID>
     is what isolates entity-bound chat memory.

Add "X" to the group switch (uppercase-normalised, like the other
codes) and extend TestIsGroupMessageType with `X`/`x`/` X ` cases.
Existing C/CHAT/O/OPEN/P/private behaviour unchanged.

Verified live against tamgiac.bitrix24.com task chat:
  raw dump → MESSAGE_TYPE=X CHAT_ENTITY_TYPE=TASKS_TASK
    → scheduling: peer_kind=group session=group:chat4644 ✓

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…data

Bitrix24 webhooks for CRM-bound, Tasks-bound, Workgroup, and Open
Channel chats ship two fields that uniquely identify the entity the
chat belongs to:

  data[PARAMS][CHAT_ENTITY_TYPE]  e.g. CRM, TASKS_TASK, SONET_GROUP, LINES
  data[PARAMS][CHAT_ENTITY_ID]    e.g. DEAL|2064, LEAD|7, 2704

The parser silently dropped both. As a result, when a user chatting
inside a deal-bound chat asked "get me the customer for this deal",
the bot had no way to deterministically resolve the bound deal — it
either had to guess from CHAT_TITLE strings or ask the user for the id.
For MCP-equipped agents this defeats the value of binding the bot to
the entity in the first place.

Add ChatEntityType + ChatEntityID to EventParams. Populate from both
parseFormEvent (form-urlencoded) and parseJSONEvent (JSON variant).
Surface them on bus.InboundMessage.Metadata as
`bitrix_chat_entity_type` and `bitrix_chat_entity_id` so agent
pipelines can:

  - inject "you are chatting about <entity>" into the system prompt, or
  - pre-fetch `crm.deal.get(2064)` and put the result in context.

Plain user-created chats omit both metadata keys — downstream readers
that do `_, ok := metadata[...]` see no key, not an empty string.

Tests cover:

  - parser populates fields from CRM Deal, CRM Lead, Tasks chats and
    leaves them empty for plain groups (form + JSON paths)
  - handler propagates them to bus.InboundMessage.Metadata with the
    expected `bitrix_*` key prefix and absent-key semantics for plain
    chats

Verified live: gateway accepts the new payload shape and continues
scheduling messages cleanly with peer_kind=group for the existing
chat 4644 (Tasks/X) test case from commit 26d7d76.

NOTE: this is the channel-side half of "bot understands the entity it's
chatting in". The pipeline still needs to consume these metadata keys
to actually inject context — separate change in internal/pipeline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…prompt

Closes the loop on the channel-side metadata forwarding from f8e000f.
The bus.InboundMessage already carries `bitrix_chat_entity_type` +
`bitrix_chat_entity_id` whenever the inbound message comes from a chat
bound to a Bitrix24 module entity (CRM Deal/Lead/Contact, Tasks task,
Workgroup, …). Until now the consumer dropped the metadata on the floor
because the agent loop has no Bitrix-aware code path.

Add a fourth `extraPrompt` block in processNormalMessage, right after
the existing channel-self-identity hint. When both metadata keys are
present it appends a `<extra_context>`-bound hint telling the LLM:

  - which entity type/id this chat is scoped to,
  - to treat deictic phrases ("this deal", "this task") as referring
    to that id rather than asking the user,
  - how to map CRM pipe-format ids (e.g. `DEAL|2064`) to MCP tools
    (crm.deal.get / crm.lead.get / crm.contact.get / crm.company.get),
  - that Tasks ids pass directly to tasks.task.get.

The block follows the exact pattern of the three sibling blocks:
group-chat tone, MetaTopicSystemPrompt, MetaChannelSelfIdentity. Single
inline metadata-key check, fmt.Sprintf into extraPrompt, then read by
agent.RunRequest.ExtraSystemPrompt -> BuildSystemPrompt -> rendered into
the standard <extra_context> system-prompt section. No new pipeline
stage, no new agent fields, no new RPC.

Plain user-created chats omit both metadata keys -> no hint added -> no
behaviour change for non-Bitrix or non-entity-bound chats.

Verified live against tamgiac.bitrix24.com Task chat 4644 (TASKS_TASK
id=2704) using gemini-native:

  user:  "@bot Synity task này có gì"
  bot:   "Chúng ta đang trò chuyện trong một nhóm chat Bitrix24
          được liên kết với một nhiệm vụ (TASKS_TASK) có ID là 2704"

The LLM picked up the hint and articulated the binding without being
asked. (The bot doesn't yet call tasks.task.get to fetch task fields —
that's an MCP-wiring concern for a separate phase, not a hint problem.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
In group chats with N users asking the bot in parallel, replies without an
explicit @mention force users to guess "is this answer for me?". Adding the
sender's @mention as a prefix ([USER=<id>][/USER]) addresses two UX issues:

  1. Multi-user concurrent Q&A — Bitrix renders the BBCode as a clickable
     "@Đặng Văn Tình" prefix on the reply, so the asker (and bystanders)
     immediately see who the answer belongs to. Without this prefix, when
     two users ask within a few seconds of each other, the bot's two
     replies are visually indistinguishable.
  2. Notification routing — Bitrix sends a push to the @mentioned user, so
     mobile users get notified about their own answer even when they
     stepped away from the active dialog. reply_to_message_id alone (the
     existing thread linkage) does not trigger push.

Implementation flows the addressee through the existing metadata channel
that already carries reply_to_message_id, so no signature changes:

  - cmd/gateway_consumer_normal.go: when isGroup, set
    `outMeta["bitrix_address_user_id"] = msg.SenderID`. Skip synthetic
    senders (ticker, notification, system) via bus.IsInternalSender so
    those flows do not emit a stray @mention to a non-real user.
  - internal/channels/bitrix24/send.go: read the metadata key after the
    Markdown→BBCode conversion and prepend `[USER=<id>][/USER] ` to the
    text BEFORE chunkText runs. Prepending pre-chunk guarantees the
    mention only appears on the first chunk, not every chunk of a long
    reply. Empty inner content sidesteps escaping for names with BBCode
    metacharacters and reflects renames between turns — Bitrix renders
    the user's current display name from id at delivery time.
  - Self-mention guard: if the addressee id equals the bot's own id (a
    synthetic relay loop or a future code path injecting the bot id by
    mistake), suppress the mention to avoid the bot @-mentioning itself.

DM behaviour unchanged — gateway_consumer only sets the key when
isGroup=true. Other channels (Telegram, Slack, Discord, …) ignore the
`bitrix_*` prefix as expected.

Tests:
  - TestBuildAddressMention: 7 cases covering empty/whitespace metadata,
    real ids, whitespace trim, self-mention guard, and unknown-bot-id
    fallback (trust consumer gating when bot is not yet started).
  - Send() integration coverage left to existing httptest tests; the
    helper is the entire logic.

Verified live against tamgiac.bitrix24.com group chat 4686:
  user @bot Synity test mention prepend
  bot:  @Đặng Văn Tình …answer body…
  (Bitrix renders the BBCode as a clickable @-mention; sender receives
   a push notification.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
tech-synity and others added 3 commits April 28, 2026 16:32
…chats

In group chats the gateway consumer rewrites RunInput.UserID to a group-
scope composite key (e.g. "group:bitrix-synity:chat4838" for Bitrix24, or
"guild:<id>:user:<sender>" for Discord) so multiple users in the same
chat share conversation memory and session state. That composite is
correct for memory but wrong for resources scoped per-actor:

  - MCPUserCredentials rows are minted per-user by the lazy provisioner
    (e.g. Phase C in internal/channels/bitrix24/provisioner.go) and keyed
    by the real external user id (= SenderID). Looking them up with the
    group-composite key always missed the row, so MCP tools silently
    disappeared in group chats — visible in production as the bot
    knowing the deal/task it's bound to (via the entity-binding hint in
    cmd/gateway_consumer_normal.go) but unable to actually call
    crm.deal.get / tasks.task.get because no credentials loaded.
  - RBAC grants and audit attribution must reflect the real actor, not
    the shared group container.

Add resolveActorUserID(userID, senderID, peerKind) to the agent package.
For peerKind=="group" with a non-empty SenderID it returns SenderID;
otherwise it returns UserID unchanged (DM behaviour preserved, synthetic
ticker/notification senders fall back safely). Update the single call
site in loop_pipeline_callbacks.go (makeBuildFilteredTools) so per-user
MCP tool loading queries with the actor key.

This is a generic agent-layer fix — not Bitrix24-specific. Bitrix24 is
the only channel today that ships per-user MCP credentials, so it was
the only channel exhibiting the bug, but the same routing already flows
for any future channel adding per-user OAuth-backed integrations
(Telegram + Salesforce, Slack workspace + Odoo, internal SSO + ERP, …).

Tests: TestResolveActorUserID covers DM, Bitrix-style group composite,
Discord guild composite, empty SenderID fallback, empty PeerKind, and
unknown PeerKind (must not auto-treat as group). Existing agent tests
still pass; the three pre-existing TestExpandWorkspace failures on
Windows are unrelated (path separator issue, present on main).

Discovery context + alternative fix designs are documented in
plans/bitrix24-mcp-refactor/reports/retrospective.md (will be updated
in a follow-up commit on the bitrix24 branch).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PGMCPServerStore.GetServerByName scopes lookups by tenant_id from context.
Channel.Start receives ctx from instance loader without that scope set —
wrap explicitly with the channel's own tenant id so the lookup matches
the row a tenant admin created via 'bitrix-portal create'.
When public_url changes (tunnel rotated, new host), persisted bot cache
short-circuits imbot.register and Bitrix-side event handler URLs go stale.
Setting BITRIX24_FORCE_REREGISTER=1 bypasses the cache so the next Start()
pushes current public_url + bot config back through imbot.register without
recreating the bot row.

Also expose BITRIX24_LOG_RAW_EVENT for raw-event debugging.
@clark-cant
Copy link
Copy Markdown

🔍 Code Review — feat(channels): add Bitrix24 messenger channel integration

🎯 Tổng quan

PR tích hợp Bitrix24 làm channel mới — OAuth portal lifecycle, webhook router, imbot, per-user MCP credentials, Markdown→BBCode renderer.

Scope: 81 files, +13.6k LOC


🤖 CI Status + Merge Conflicts

CI: ⚠️ No checks reported on branch — CI chưa chạy hoặc chưa config cho branch này
Merge conflicts: ✅ No conflicts (mergeable: MERGEABLE) nhưng mergeStateStatus: UNSTABLE


🔴 CRITICAL — Mega-PR

81 files, +13.6k LOC là quá lớn để review hiệu quả. PR này gộp ít nhất 4 independent features:

# Feature Scope
1 Bitrix24 channel core webhook, router, send, register, portal
2 MCP bridge refactor bridge_tool.go, manager.go — thay đổi shared code
3 DB migration + store bitrix_portals table, PG + SQLite impls
4 CLI + UI bitrix-portal CLI, channel picker, MCP settings

Rủi ro: Nếu có bug ở MCP bridge refactor, revert sẽ kéo theo toàn bộ Bitrix24 channel — và ngược lại.

Recommendation: Nên split thành 2-3 PRs nhỏ hơn để dễ review và revert độc lập.


🟡 HIGH — Security follow-ups nên làm trong PR này

PR self-review đã flag 3 follow-ups, nhưng 2 cái đầu là security hardening:

  1. Prompt-injection hardening trên bitrix_chat_entity_type/id trước khi interpolate vào system prompt → nên cap length + reject newlines NGAY
  2. Numeric-ID validation trong buildAddressMentionstrconv.Atoi để guard BBCode injection
  3. Comment trên BITRIX24_FORCE_REREGISTER → LOW, có thể để follow-up

Vì Bitrix24 nhận webhook từ external portal (dù có signature verification), prompt injection defense-in-depth là cheap insurance. Nên làm luôn.


🟡 MEDIUM

  1. MCP bridge refactor trong PR này. bridge_tool.go (+46/-13), manager.go (+45/-3), manager_connect.go (+16/-9) — thay đổi shared MCP code. Nên là PR riêng để test độc lập với Bitrix24.

  2. Channel whitelist drift. PR đã fix (thêm bitrix24 + facebook + pancake vào WS whitelist) — nhưng pattern này vẫn là manual string list across 3 files. Nên có test guard hoặc generated constants để tránh drift tương lai.

  3. Schema strict mode fix (schema_strict.go +14/-1) — bugfix cho OpenAI strict mode, không liên quan Bitrix24. Nên tách PR riêng.


✅ Điểm Tốt

  1. Kiến trúc sạch. Portal vs Channel separation đúng — bitrix_portals row referenced bởi N channel_instances. Bootstrapped lazily, fail-soft.

  2. Tenant isolation đúng pattern. store.WithTenantID(ctx, c.TenantID()) — matches Discord/Zalo/Feishu.

  3. Per-user MCP credentials. SenderID (FROM_USER_ID) drives credential routing trong group chat — không cross-user leakage.

  4. Security basics đúng. Webhook signature verified, AES-256-GCM encryption, no PII in logs, SSRF defense trên avatar fetch.

  5. Self-review + red-team. Author đã tự review và flag known issues — rare và rất helpful.

  6. Test coverage tốt. 6 unit tests + live verification trên tamgiac.bitrix24.com với DM, group, CRM Deal, Tasks chat.

  7. Documentation đầy đủ. Setup runbook, retrospective, MCP integration plan.


📊 Summary

Severity Count Issues
🔴 CRITICAL 1 Mega-PR (81 files, 13.6k LOC) — nên split
🟡 HIGH 2 Prompt-injection hardening + numeric-ID validation nên làm ngay
🟡 MEDIUM 3 MCP bridge refactor, whitelist drift pattern, schema strict fix
🟢 LOW 1 Comment on FORCE_REREGISTER

💡 Recommendation

🟡 APPROVE WITH CHANGES

Architecture và implementation chất lượng cao, security basics đúng. Nhưng:

  1. Nên split — ít nhất tách MCP bridge refactor + schema strict fix ra PR riêng
  2. Nên làm security hardening (prompt injection + ID validation) trong PR này vì cheap và quan trọng
  3. CI chưa chạy — cần verify tests pass trên CI trước khi merge

Great work @tech-synity! Đây là channel integration bài bản nhất em từng thấy trong repo. 🚀

@clark-cant
Copy link
Copy Markdown

🔍 Code Review — feat(channels): add Bitrix24 messenger channel integration

🎯 Tổng quan

Integrates Bitrix24 as a native messenger channel with OAuth portal lifecycle, webhook routing, per-user MCP credential isolation, BBCode mention handling, and CLI tooling.

Scope: 81 files, +13.6k LOC — MEGA-PR


🤖 CI Status + Merge Conflicts

CI: ⚠️ No checks reported on branch — CI not configured for this PR
Merge conflicts: ⚠️ mergeStateStatus: UNSTABLE — needs rebase before merge


✅ Điểm Tốt

  1. Tenant isolation đúng pattern. store.WithTenantID(ctx, c.TenantID()) cho MCP lookups, tenant_id FK với UNIQUE constraint — chuẩn theo Discord/Zalo/Feishu pattern.

  2. Per-user MCP credential isolation. resolveActorUserID dùng SenderID (FROM_USER_ID) cho group chats, fallback UserID cho DM — không cross-user leakage.

  3. AES-256-GCM encryption at rest. Portal credentials encrypted, empty key fallback có warn-log — transparent security.

  4. Webhook signature verification. application_token verified constant-time trước khi xử lý event — đúng security baseline.

  5. Idempotent register với BITRIX24_FORCE_REREGISTER. Operator xoay public_url không cần restart process — good UX.

  6. Self red-team review trong PR body. Acknowledged prompt-injection hardening, numeric-ID validation, và comment gaps — shows thorough thinking.

  7. BX24.installFinish() fix. Debug root cause tốt — Bitrix silent suppression khi INSTALLED=false là tricky bug.


🔴 HIGH — Must Address

1. MEGA-PR — Should Be Split

81 files, +13.6k LOC, 33 commits gộp quá nhiều independent features:

# Feature Files Description
1 Bitrix24 channel core ~15 files webhook, send, register, router, portal
2 Store + migration ~8 files bitrix_portals table, PG/SQLite impls, migration 000058
3 CLI tooling ~2 files bitrix-portal create/list
4 Agent layer changes ~2 files resolveActorUserID, MCP bridge
5 Gateway + HTTP whitelist fixes ~3 files channel_type whitelist, JSON index handler
6 UI changes ~8 files channel picker, schemas, i18n, MCP fields
7 Unrelated fixes ~5 files gitignore, wails dep, OpenAI strict mode, BX24.installFinish
8 Documentation ~4 files runbook, retro, MCP plan, PR review

Rủi ro: Nếu có bug ở Bitrix24 channel, revert sẽ kéo theo OpenAI strict mode fix, wails dep bump, gitignore fix — những cái này đáng ra là PR riêng.

Fix đề xuất: Tách thành ít nhất 3 PRs:

  • PR A: Infrastructure fixes (gitignore, wails dep, OpenAI strict mode, JSON index, channel whitelist) — merge trước
  • PR B: Bitrix24 store + migration + CLI — merge trước
  • PR C: Bitrix24 channel core + webhook + UI — merge sau cùng

2. No CI Checks

Branch không có CI checks nào được report. Với 81 files thay đổi, cần ít nhất go build ./... + go test ./... + go vet ./... pass trên CI trước khi merge.


🟡 MEDIUM

  1. Prompt-injection on bitrix_chat_entity_type/id. PR body đã acknowledge — nên fix trong PR này vì cheap defense-in-depth: cap length + reject newlines trước khi interpolate vào system prompt.

  2. resolveActorUserID là channel-agnostic helper nhưng nằm trong bitrix24/ package. Nếu Telegram/Slack per-user OAuth cũng cần helper này, nó sẽ bị duplicate. Nên move lên internal/channels/ hoặc internal/agent/.

  3. Migration boot-strap swallow "relation does not exist" với warn-log. Acceptable cho fail-soft, nhưng nếu operator không đọc log, Bitrix channels sẽ silent-fail cho đến khi migrate up chạy. Nên add health check endpoint hoặc startup warning rõ hơn.


🟢 LOW

  1. Co-Authored by Claude Opus 4.7. PR này do AI generate phần lớn — không phải vấn đề nếu code tốt, nhưng reviewer cần đọc kỹ hơn vì AI có thể miss edge cases trong security-sensitive paths (webhook, auth, tenant isolation).

  2. i18n fallback-only. Channel fields dùng t(..., { defaultValue }) không có catalog entries — English works nhưng vi/zh sẽ fallback. Not blocking, should follow up.

  3. 33 commits với nhiều fixup commits. History có thể squash trước merge để cleaner git log.


📊 Summary

Severity Count Issues
🔴 HIGH 2 Mega-PR cần split, No CI checks
🟡 MEDIUM 3 Prompt-injection, helper placement, silent-fail migration
🟢 LOW 3 AI-generated PR, i18n incomplete, commit history cleanup

💡 Recommendation

🟡 APPROVE WITH CHANGES

Bitrix24 integration design tốt, security đúng pattern, self red-team reviewed. Nhưng:

  1. BẮT BUỘC: Split thành 2-3 PRs như đề xuất ở trên
  2. BẮT BUỘC: Setup CI checks cho branch này
  3. NÊN: Fix prompt-injection + move resolveActorUserID lên shared package trong PR này

Great work @tech-synity! Bitrix24 là channel quan trọng cho SYNITY partner network. Chỉ cần tổ chức lại PR structure cho dễ review/revert. 🚀

…interpolation

bitrix_chat_entity_type / bitrix_chat_entity_id flow from Bitrix webhook
into the agent system prompt unsanitized. A malicious or compromised
portal could craft CHAT_ENTITY_ID with embedded newlines + injected
SYSTEM directives to steer the agent.

Add isSafeBitrixEntityToken — caps length (et: 64, eid: 128) and rejects
control characters (< 0x20, 0x7f). Permissive on the printable range to
preserve legitimate ids (DEAL|2064, TASKS_X, ĐƠN_HÀNG). Rejected
payloads emit security.bitrix24.entity_metadata_rejected with lengths,
no payload contents.

Addresses MEDIUM finding from PR nextlevelbuilder#1057 review.
When goclaw is upgraded but `migrate up` hasn't run, BootstrapPortals
fails with "relation bitrix_portals does not exist" buried in a generic
"bootstrap failed" log. Operators miss it and Bitrix24 channels
silently no-op.

Detect the missing-table case (PG + SQLite text variants) and emit a
distinct warning naming migration 000058 + the remediation command, so
the next `grep migrate` from a confused operator finds it immediately.

Addresses MEDIUM finding from PR nextlevelbuilder#1057 review.
@tech-synity
Copy link
Copy Markdown
Author

Cảm ơn review chi tiết! Đã fix MEDIUM trong 2 commits mới push:

1936ca9fix(bitrix24): sanitize webhook entity metadata before system-prompt interpolation

  • Thêm isSafeBitrixEntityToken (cap length 64/128 + reject control chars < 0x20, 0x7f)
  • Reject case emit security.bitrix24.entity_metadata_rejected (chỉ log lengths, không log payload)
  • 14 test cases pass (kể cả prompt-injection attempt thực tế: "2064\n\n## SYSTEM: ignore prior")

8c29149fix(bitrix24): explicit migration-missing warning in BootstrapPortals

  • Detect riêng case "relation bitrix_portals does not exist" (PG + SQLite variant)
  • Log warning rõ tên migration 000058 + lệnh goclaw migrate up để operator không phải grep log

Về resolveActorUserID: helper này đã ở internal/agent/loop_mcp_user.go (channel-agnostic) ngay từ commit 2597badc — không nằm trong bitrix24/ package. Mình hiểu là review point dựa trên PR description nên ghi nhầm. Nếu sau này Telegram/Slack thêm per-user OAuth, helper sẵn sàng dùng chung.

Về split PR + CI: nhận, đang prep.

  • Split: sẽ tách thành 3 PRs (Infra → Store+Migration+CLI → Channel core+UI) trong vài giờ tới. Convert PR này thành Draft trong lúc thao tác.
  • CI: branch tech-synity/feat/bitrix24-channel chưa trigger workflows do fork chưa enable Actions — sẽ enable trong fork settings, hoặc nếu maintainer prefer thì mở PRs mới sẽ tự chạy upstream ci.yaml.

Sẽ ping lại khi 3 PRs mới ready. Cảm ơn anh.

@tech-synity
Copy link
Copy Markdown
Author

Closing in favor of split per maintainer review. 3 stacked PRs (merge in order):

MEDIUM items from this PR's review addressed:

  • ✅ Prompt-injection cap on entity metadata — committed in PR 3 with 14 unit tests
  • ✅ Migration-missing explicit warning — committed in PR 3
  • ℹ️ resolveActorUserID move — already lives in internal/agent/loop_mcp_user.go (channel-agnostic from commit 2597bad); not in bitrix24/ package

Cảm ơn anh review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants