fix: sticky session binding breaks on streaming path + multi-user isolation with body.user #188
Open
The-five-stooges wants to merge 14 commits into
…lusively when present
extractBodyCallerSubKey concatenates multiple body fields (user, metadata.session_id, conversation_id, etc.) into a single hash via candidates.join('|'). When a proxy script injects body.user='user_a' but Claude Code carries a varying metadata.session_id across turns, the concatenated hash changes, producing a different callerKey on every request. The sticky session manager then sees a 'new' caller each time and re-rolls the account binding, defeating the purpose of STICKY_SESSION_ENABLED and CASCADE_REUSE_BY_CALLER. This fix short-circuits: when body.user is present, ONLY its sha256 hash is used as the caller subkey, and all other metadata fields are discarded. The multi-field fallback still applies when body.user is absent (native Claude Code / no proxy), preserving backward compatibility.
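The short-circuit described above could look roughly like the sketch below. It is not the code from src/caller-key.js: the sha256 helper and the exact fallback field list are assumptions made for illustration, and only body.user, metadata.session_id, conversation_id and the '|' join come from the commit message.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: hex SHA-256 of a string.
function sha256(text) {
  return createHash('sha256').update(text).digest('hex');
}

// Sketch: body.user is used exclusively when present.
function extractBodyCallerSubKey(body) {
  if (typeof body?.user === 'string' && body.user.length > 0) {
    return sha256(body.user);
  }
  // Fallback when body.user is absent. The field list here is an assumption;
  // the real function may read more fields.
  const candidates = [
    body?.metadata?.session_id,
    body?.conversation_id,
  ].filter((value) => typeof value === 'string' && value.length > 0);
  return candidates.length > 0 ? sha256(candidates.join('|')) : null;
}
```

With this shape, two requests carrying the same body.user but different metadata.session_id values map to the same subkey, which is what keeps the sticky binding stable.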
…orage
The exportLogs() function reads the dashboard password from sessionStorage.getItem('dashboard_password'), but the login flow writes it to this.password (backed by localStorage.getItem('dp')). sessionStorage.dashboard_password is never set anywhere in the codebase, so it is always empty, causing every log download request to fail with HTTP 401. This patch changes the source to this.password, which is reliably populated after a successful login and survives page reloads via localStorage.
…er injection
Add a fully documented reference proxy script that injects a per-user 'user' field into chat completion requests before forwarding to WindsurfAPI. This enables two or more developers sharing a single WindsurfAPI instance to maintain isolated sticky sessions and independent upstream account bindings. The script includes: problem statement, prerequisites (STICKY_SESSION_ENABLED + CASCADE_REUSE_BY_CALLER + independent LS instances via tinyproxy), step-by-step usage guide, systemd service example, verification steps, and security notes. All examples use placeholder values; no real credentials, IPs, or account emails are included.
…nding
When STICKY_SESSION_ENABLED=1, the sticky binding key is callerKey + modelKey. This means a single user requesting different models (e.g. opus vs haiku) may get bound to different upstream accounts because modelKey changes. Add a dashboard experimental toggle 'stickyBindByUserOnly' that, when enabled, forces the binding key to callerKey + '*' regardless of model, so all model requests from the same user share one upstream account. Default OFF (preserves per-model isolation). Debug log in caller-key.js reports body.user/subKey for troubleshooting.
Add [sticky] HIT/MISS/SET/CLEAR log entries (filtered to callerKeys containing ':user:') so the Dashboard log panel can show the full lifecycle of user-scoped bindings.
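As a rough illustration of the stickyBindByUserOnly behavior, the binding key could be derived as in the sketch below. The function name and the '|' separator are hypothetical; the commit message only states that the key is callerKey + modelKey by default and callerKey + '*' when the toggle is on.

```javascript
// Hypothetical helper; the real key layout in sticky-session.js is not shown here.
function stickyBindingKey(callerKey, modelKey, { stickyBindByUserOnly = false } = {}) {
  // With the toggle on, the model dimension collapses to a wildcard.
  const modelPart = stickyBindByUserOnly ? '*' : modelKey;
  return `${callerKey}|${modelPart}`;
}
```

With the toggle off, opus and haiku requests from one caller produce two keys; with it on, both produce the same key and therefore resolve to the same binding.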
… already injected body.user
messages.js extractCallerSubKey() reads metadata.user_id (Claude Code internal device/session id) and appends it as a second ':user:xxxx' segment to the callerKey. When a proxy script has already injected body.user (resulting in a callerKey ending with ':user:<hash>'), this double-stamp changes the callerKey on sub-agent calls where metadata.user_id differs, causing a sticky session MISS and cross-account binding. Fix: skip the append when callerKey already contains ':user:'.
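A minimal sketch of the guard, assuming a helper that receives the current callerKey and the hash derived from metadata.user_id. The helper name and signature are hypothetical; only the ':user:' segment format and the skip condition come from the commit message.

```javascript
// Hypothetical helper illustrating the double-stamp guard.
function appendUserSegment(callerKey, userHash) {
  // Nothing to append, or a proxy already stamped a ':user:' segment.
  if (!userHash || callerKey.includes(':user:')) {
    return callerKey;
  }
  return `${callerKey}:user:${userHash}`;
}
```

Because the key is returned untouched once it carries a ':user:' segment, a sub-agent call with a different metadata.user_id no longer changes the callerKey.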
…erations
Remove the callerKey.includes(':user:') filter from all [sticky] log entries so every sticky operation is visible in dashboard logs. Also add a SKIP log when callerKey is empty/null to detect code paths that bypass sticky entirely.
…yBinding
Even with the ENABLED/callerKey guards removed from filter conditions, the haiku sub-agent request hitting 838591845@qq.com shows NO [sticky] log at all (not even MISS). This unconditional ENTER log at the very top of getStickyBinding will confirm whether the function is actually being invoked for every request.
…ranch
Add logging at the getApiKey entry point to definitively determine whether sticky session is being skipped because callerKey is falsy or isStickyEnabled() returns false. Also add SCHECK log in the else branch for clarity.
The streaming retry loop at _handleChatCompletionsInner (~L2926) calls waitForAccountFn with only 4 arguments, omitting the 5th callerKey parameter. This causes waitForAccount to default callerKey=null, which makes getApiKey skip the sticky session check (callerKey && isStickyEnabled() → null && true → falsy). As a result, sticky session bindings are never looked up for new conversations on the streaming path, so every first request re-rolls a random account. The non-streaming path at L1868 correctly passes callerKey. This one-line fix restores parity.
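The failure mode can be reproduced in miniature. In the sketch below, everything except the names waitForAccount, getApiKey, isStickyEnabled and callerKey is a stand-in: the first four parameter names and the string return values are assumptions, chosen only to show how a defaulted fifth parameter silently bypasses the sticky lookup.

```javascript
// Stand-in for the real feature flag check.
const isStickyEnabled = () => true;

// Stand-in for auth.js getApiKey: reports which branch was taken.
function getApiKey(callerKey = null) {
  if (callerKey && isStickyEnabled()) {
    return 'sticky-lookup';
  }
  return 'normal-selection';
}

// The first four parameter names are hypothetical placeholders.
function waitForAccount(tried, signal, modelKey, timeoutMs, callerKey = null) {
  return getApiKey(callerKey);
}

// Streaming path before the fix: 4 arguments, callerKey defaults to null.
const before = waitForAccount([], null, 'haiku', 30000);
// After the fix: callerKey passed as the 5th argument.
const after = waitForAccount([], null, 'haiku', 30000, 'api:abc:user:hash');
```

Here before takes the normal selection branch while after reaches the sticky lookup, mirroring the difference between the streaming and non-streaming call sites.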
…rotation on bound account failure
When stickyNoFallback is enabled (Dashboard toggle):
- auth.js getApiKey(): bound account unavailable → return null immediately, don't clear binding, don't fall through to normal selection
- chat.js waitForAccount(): skip 30s wait loop, fail fast
- chat.js non-stream retry: rate_limit/transient/model_not_available from sticky account → break instead of continue
- chat.js stream retry: same pattern, break on any model error
This ensures each user strictly consumes only their bound account's quota instead of burning through other accounts in the pool.
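The selection rule for the bound account can be sketched as follows. This is a simplified model, not the auth.js implementation: the function name, its options object and the pickFromPool callback are all hypothetical.

```javascript
// Hypothetical model of account selection with the stickyNoFallback toggle.
function selectAccount({ bound, boundAvailable, stickyNoFallback, pickFromPool }) {
  if (bound) {
    if (boundAvailable) {
      return bound;
    }
    if (stickyNoFallback) {
      // Strict mode: keep the binding and report "no account" to the caller.
      return null;
    }
  }
  // No binding, or fallback allowed: use normal pool selection.
  return pickFromPool();
}
```

A null result in strict mode is what lets the caller fail fast instead of waiting or retrying against other accounts in the pool.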
…ody.user for POST/PUT/PATCH
Previously the proxy scripts hardcoded method:'POST' for all upstream requests, which broke GET /v1/models responses with a 404 on proxied ports. GET/HEAD/DELETE requests are now forwarded as-is (no body injection), while POST/PUT/PATCH still inject body.user for multi-user isolation.
- scripts/proxy-user-mantou.js on :3005, injects user='mantou'
- scripts/proxy-user-maskja.js on :3004, injects user='maskja'
- examples/proxy-user-inject.js: reference example updated to match
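The method handling described above can be isolated into a small pure function, sketched below. The function name and return shape are hypothetical, and overwriting an existing body.user as well as passing non-JSON bodies through unchanged are assumptions about how the real proxy scripts behave.

```javascript
// Methods whose JSON body gets the injected user field.
const BODY_METHODS = new Set(['POST', 'PUT', 'PATCH']);

// Hypothetical helper: decide what to send upstream for one request.
function prepareUpstream(method, rawBody, user) {
  const upper = method.toUpperCase();
  if (!BODY_METHODS.has(upper)) {
    // GET/HEAD/DELETE and others: keep the original method, no injection.
    return { method: upper, body: rawBody };
  }
  let parsed;
  try {
    parsed = rawBody ? JSON.parse(rawBody) : {};
  } catch {
    // Assumption: bodies that are not valid JSON are forwarded untouched.
    return { method: upper, body: rawBody };
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return { method: upper, body: rawBody };
  }
  // Assumption: the injected value replaces any client-supplied user field.
  return { method: upper, body: JSON.stringify({ ...parsed, user }) };
}
```

The important property is that the upstream method always equals the incoming one, so GET /v1/models is no longer rewritten into a POST.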
- Level 1 (stickyBindByUserOnly + stickyNoFallback): deterministic SHA256(callerKey) sharding, ignoring all health metrics
- Level 2 (default): soft sharding only when top two candidates are perfectly tied
Fixes cross-user account contamination when multiple users share the same pool.
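Level 1 sharding can be pictured as hashing the callerKey and reducing the digest to an index into the account pool. The sketch below is one plausible way to do that, not the PR's implementation: the function name, the choice of the first four digest bytes, and the modulo reduction are assumptions.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: map a callerKey to a stable index in [0, poolSize).
function shardIndex(callerKey, poolSize) {
  if (!Number.isInteger(poolSize) || poolSize <= 0) {
    throw new RangeError('poolSize must be a positive integer');
  }
  const digest = createHash('sha256').update(callerKey).digest();
  // Read the first 4 bytes as an unsigned 32-bit integer.
  return digest.readUInt32BE(0) % poolSize;
}
```

The mapping is stable for a given callerKey and pool size, but hashing alone does not guarantee that two callers land on different accounts, and the index shifts if the pool size changes.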
Author
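New commit
Problem: when two users (maskja on port 9090, antimantou on port 8080) share the same account pool, …
Solution: in …
Effect: with strict locking enabled, maskja and antimantou are each pinned to a different upstream account and no longer cross over into each other's accounts.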
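Summary
Fixes two key problems with sticky sessions in the multi-user proxy scenario, and adds Dashboard-configurable options and an example script.

Changes

🐛 Bug fixes
- src/handlers/chat.js: core fix. The waitForAccountFn call on the streaming retry path was missing the 5th argument, callerKey, so auth.js saw callerKey=null and sticky session binding had no effect at all. With the argument supplied, the chain works as intended.
- src/handlers/messages.js: fixes the :user: segment being appended to callerKey twice (when a proxy script has already injected body.user, messages.js appended it a second time). Adds a check that skips the append when callerKey already contains :user:.
- src/dashboard/index.html: fixes the 401 error on Dashboard log download by reading the password from the component's own this.password instead of the empty sessionStorage.
- examples/proxy-user-inject.js: fixes the example proxy script hardcoding method: 'POST', which made GET /v1/models return 404. The original HTTP method is now preserved, and body.user is injected only for POST/PUT/PATCH.

✨ New features
- src/account/sticky-session.js: adds support for the stickyBindByUserOnly toggle. When enabled, the model part of the binding key is '*', so all models for the same user share one upstream account. When disabled, binding uses the callerKey + modelKey pair (original behavior).
- src/account/sticky-session.js + src/auth.js + src/handlers/chat.js: adds the stickyNoFallback toggle. auth.js getApiKey returns null directly, chat.js waitForAccount fails fast, and the chat.js retry loops break instead of continue.
- src/runtime-config.js: registers the stickyBindByUserOnly and stickyNoFallback experimental toggles, both defaulting to false.
- src/caller-key.js: when body.user is present, extractBodyCallerSubKey uses only the body.user hash and no longer concatenates other metadata fields, which keeps callerKey stable.
- src/dashboard/index.html: adds two toggles to the Dashboard panel, "Sticky session: bind by user" and "Sticky session: no fallback".

📋 Debug logging
- src/auth.js, sticky-session.js, caller-key.js: adds log.info()-level [sticky] and [caller-key] diagnostic logs, viewable in real time in the panel.

📄 Documentation
- examples/proxy-user-inject.js: a complete example proxy script for multi-user isolation, including the problem statement, usage guide, systemd example, and security notes.
- .gitignore: adds PR-FLOW.md.

Testing and verification
- [sticky] CHECK no longer shows callerKey=null.
- With stickyBindByUserOnly=true, the last segment of the binding key is * (the model dimension is ignored).
- With stickyNoFallback=true, a failure on the bound account returns an error directly instead of switching accounts and consuming their quota.
- GET /v1/models returns 200 through ports 3004/3005.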