fix: make concurrency key per-task to prevent stream collision between same-type background agents #3099
Conversation
All contributors have signed the CLA. Thank you! ✅
I have read the CLA Document and I hereby sign the CLA |
1 issue found across 1 file
Confidence score: 3/5
- There is a concrete regression risk in `src/features/background-agent/manager.ts`: appending `taskId` to the concurrency key prevents `ConcurrencyManager` from matching `modelConcurrency` entries, which can effectively bypass intended per-model throttling.
- Given the medium severity (6/10) and solid confidence (7/10), this is more than a minor cleanup issue and could impact runtime behavior under load, so merge risk is moderate until corrected.
- Pay close attention to `src/features/background-agent/manager.ts`: concurrency key construction appears to break per-model limit lookups and task counting.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/features/background-agent/manager.ts">
<violation number="1" location="src/features/background-agent/manager.ts:631">
P2: Appending taskId to the concurrency key breaks per-model concurrency limits because ConcurrencyManager uses the full key string for limit lookups and counts, so modelConcurrency entries no longer match and each task gets its own slot.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
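To make the reviewer's concern concrete, here is a minimal sketch (hypothetical names and map contents; the real `ConcurrencyManager` internals may differ) of how a full-string limit lookup misses per-model entries once a task suffix is appended:

```typescript
// Hypothetical sketch of the failure mode the reviewer describes:
// if ConcurrencyManager resolves limits by the full key string, a
// per-task suffix means modelConcurrency entries never match and
// every task falls back to an unlimited slot.
const modelConcurrency: Record<string, number> = {
  "provider/model-X": 2, // intended per-model cap (illustrative entry)
};

function limitFor(key: string): number {
  // Full-string lookup; unmatched keys fall back to "no limit".
  return modelConcurrency[key] ?? Infinity;
}

console.log(limitFor("provider/model-X"));        // 2: limit applies
console.log(limitFor("provider/model-X/task-1")); // Infinity: limit bypassed
```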
```diff
 private getConcurrencyKeyFromInput(input: LaunchInput): string {
   if (input.model) {
-    return `${input.model.providerID}/${input.model.modelID}`
+    return `${input.model.providerID}/${input.model.modelID}/${input.taskId}`
```
P2: Appending taskId to the concurrency key breaks per-model concurrency limits because ConcurrencyManager uses the full key string for limit lookups and counts, so modelConcurrency entries no longer match and each task gets its own slot.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 631:
<comment>Appending taskId to the concurrency key breaks per-model concurrency limits because ConcurrencyManager uses the full key string for limit lookups and counts, so modelConcurrency entries no longer match and each task gets its own slot.</comment>
<file context>
@@ -628,9 +628,9 @@ export class BackgroundManager {
private getConcurrencyKeyFromInput(input: LaunchInput): string {
if (input.model) {
- return `${input.model.providerID}/${input.model.modelID}`
+ return `${input.model.providerID}/${input.model.modelID}/${input.taskId}`
}
- return input.agent
</file context>
Good catch — you're right that appending taskId to the concurrency key effectively bypasses per-model concurrency limits set via modelConcurrency, since each task now gets its own unique key and therefore its own slot.
However, the current behavior (using only providerID/modelID as the key) causes a critical bug: when two or more background agents share the same model, they end up sharing the same concurrency lane. This leads to provider-level stream collisions where the first agent's inference stream is silently killed mid-response when the second agent starts streaming. The result is truncated/corrupted output with no error — the agent just stops producing tokens.
This fix prioritizes correctness (each agent gets an independent stream) over concurrency limiting. I'm aware this is a tradeoff and I don't currently have a solution that preserves both behaviors — independent streams per task AND per-model concurrency limits.
A proper fix would likely require separating the two concerns: one key for stream isolation (per-task) and a separate mechanism for concurrency counting (per-model). I'd welcome suggestions from the maintainers on how to best approach this.
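As a sketch of that separation (illustrative only; `countingKey` and `streamKey` are hypothetical names, not the project's API):

```typescript
// Sketch: decouple stream isolation from concurrency accounting.
// countingKey feeds the per-model semaphore (so modelConcurrency
// entries still match); streamKey keeps each task on its own
// transport lane. Both names are hypothetical, not the project's API.
interface LaunchInput {
  taskId: string;
  agent: string;
  model?: { providerID: string; modelID: string };
}

function countingKey(input: LaunchInput): string {
  // Per-model key used only for limit lookups and counting.
  return input.model
    ? `${input.model.providerID}/${input.model.modelID}`
    : input.agent;
}

function streamKey(input: LaunchInput): string {
  // Per-task key used only for stream/lane isolation.
  return `${countingKey(input)}/${input.taskId}`;
}
```

Under this split, two Oracle tasks would still share a `countingKey` (and thus a concurrency budget) while getting distinct `streamKey`s.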
Summary
When launching 2+ background agents of the same type simultaneously (e.g., two Oracle agents), the first agent's stream is always silently killed mid-word during the thinking/reasoning process. The second agent completes normally. This only happens with same-type agents — launching agents of different types (e.g., Oracle + Explore) works perfectly.
Tested against version v3.14.0 (compiled `dist/index.js`).

Root Cause

`getConcurrencyKeyFromInput()` in `src/features/background-agent/manager.ts` (line 629) generates the concurrency key based on model identity (`providerID/modelID`): two agents of the same type resolve to the same model, producing identical concurrency keys. This causes them to:
- share the same entry in `queuesByKey`
- contend for the same `ConcurrencyManager` semaphore

The shared concurrency key creates a collision at the provider/transport level: when the second agent's streaming request arrives at the same endpoint, the first agent's active stream is silently terminated (no error, no abort signal; the stream just stops mid-token).
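A small sketch of the shared-lane effect (illustrative; `queuesByKey` here is a stand-in for the manager's internal map, not the real structure):

```typescript
// Illustrative sketch: with model-only keys, two same-type agents
// index the same queue entry and therefore share one lane.
const queuesByKey = new Map<string, string[]>();

function enqueue(key: string, taskId: string): void {
  const queue = queuesByKey.get(key) ?? [];
  queue.push(taskId);
  queuesByKey.set(key, queue);
}

// Two Oracle agents resolving to the same model share one lane:
enqueue("provider/model-X", "task-1");
enqueue("provider/model-X", "task-2");
console.log(queuesByKey.get("provider/model-X")); // [ 'task-1', 'task-2' ]
```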
Why it only affects same-type agents
| Scenario | Agent 1 key | Agent 2 key |
| --- | --- | --- |
| Same type (e.g., Oracle + Oracle) | `provider/model-X` | `provider/model-X` |
| Different types (e.g., Oracle + Explore) | `provider/model-X` | `provider/model-Y` |

Different agent types resolve to different models, producing independent concurrency keys. Same-type agents resolve to the same model, creating the collision.
Fix
Append the unique `taskId` to the concurrency key, ensuring every task gets its own independent concurrency lane. This eliminates the shared-key collision while preserving the `ConcurrencyManager` semaphore architecture (which still enforces per-model limits via the `providerID/modelID` prefix).

Testing
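A minimal regression check of the new key behavior might look like this (a hypothetical harness mirroring the shapes in the diff, not the project's actual test suite):

```typescript
// Hypothetical regression check: with the patched key construction,
// two tasks on the same model must receive distinct concurrency keys.
interface LaunchInput {
  taskId: string;
  agent: string;
  model?: { providerID: string; modelID: string };
}

function getConcurrencyKeyFromInput(input: LaunchInput): string {
  if (input.model) {
    return `${input.model.providerID}/${input.model.modelID}/${input.taskId}`;
  }
  return `${input.agent}/${input.taskId}`;
}

const model = { providerID: "provider", modelID: "model-X" };
const a = getConcurrencyKeyFromInput({ taskId: "task-1", agent: "oracle", model });
const b = getConcurrencyKeyFromInput({ taskId: "task-2", agent: "oracle", model });
console.log(a);       // provider/model-X/task-1
console.log(a !== b); // true: each task gets its own lane
```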
Additional Findings (not included in this PR)
During the investigation of this bug, several other timeout/kill mechanisms were identified in `dist/index.js` (v3.14.0) that can prematurely terminate background agents. These are documented here for awareness.

Timeout constants that are too aggressive for long-running agents
| Constant | Defined in |
| --- | --- |
| `TASK_TTL_MS` | `constants.ts` |
| `TERMINAL_TASK_TTL_MS` | `constants.ts` |
| `SESSION_TTL_MS` | `auto-retry.ts` |
| `DEFAULT_POLL_TIMEOUT_MS` | `timing.ts` |
| `DEFERRED_SESSION_TTL_MS` | `tmux-subagent/manager.ts` |
| `DEFAULT_STALE_TIMEOUT_MS` | `constants.ts` |
| `DEFAULT_MESSAGE_STALENESS_TIMEOUT_MS` | `constants.ts` |

These timeouts are reasonable for typical usage but can cause issues with agents that perform deep analysis (Oracle, Metis) or complex multi-step implementations (Hephaestus) that may run for extended periods. Users can override most of these via `oh-my-opencode.json` config under `background_task`.

Circuit breaker
`maxToolCalls` default: the `DEFAULT_MAX_TOOL_CALLS` constant (4000) is the upper bound, but users may inadvertently configure a much lower value (e.g., 200) in their `oh-my-opencode.json` under `background_task.maxToolCalls`. At ~2.8 seconds per tool call, a value of 200 kills agents after approximately 9 minutes. The default of 4000 is appropriate; the issue is that there's no validation or warning when users set this too low.

Applies to: v3.14.0 (`dist/index.js` line 629 in compiled output, `src/features/background-agent/manager.ts` line 629 in source)

Summary by cubic
Make concurrency keys unique per task to prevent stream collisions when running multiple same-type background agents. This stops the first agent’s stream from being cut mid-output and lets concurrent runs finish.
- Updated `getConcurrencyKeyFromInput` to append `taskId` for both model and agent keys (`providerID/modelID/taskId` and `agent/taskId`).

Written for commit a4c598b. Summary will update on new commits.