Commit 4daf7f9

feat(core): add microcompaction for idle context cleanup (#3006)

* feat(core): add microcompaction for idle context cleanup

  Clear old tool result content from chat history when the user returns after an idle period (default 60 min). Replaces functionResponse output with a sentinel string for compactable tools (read_file, shell, grep, glob, web_fetch, web_search, edit, write_file), keeping the N most recent results intact (default 5). Runs before full compression so it can shed tokens cheaply without an API call.

  - Time-based trigger reuses lastApiCompletionTimestamp from thinking cleanup
  - Per-part counting so keepRecent applies to individual tool results even when batched in parallel
  - Preserves tool error responses (only clears successful outputs)
  - Configurable via settings.json (context.microcompaction) with env var overrides for E2E testing
  - Enabled by default

* refactor(config): unify idle cleanup settings under clearContextOnIdle

  Consolidate thinking block cleanup and tool results microcompaction config into a single `context.clearContextOnIdle` settings group:

      {
        "context": {
          "clearContextOnIdle": {
            "thinkingThresholdMinutes": 5,
            "toolResultsThresholdMinutes": 60,
            "toolResultsNumToKeep": 5
          }
        }
      }

  - Use -1 on either threshold to disable that cleanup (no enabled bool)
  - Remove separate `microcompaction` and `gapThresholdMinutes` settings
  - Thinking cleanup: 5 min default (unchanged)
  - Tool results cleanup: 60 min default
  - Preserve tool error responses (only clear successful outputs)

* feat(vscode-ide-companion): add clearContextOnIdle settings configuration

  - Add gapThresholdMinutes settings for thinking blocks, tool results, and retention count
  - Remove deprecated gapThresholdMinutes from root settings level

  This reorganizes the context clearing settings into a dedicated clearContextOnIdle object with configurable thresholds for thinking blocks and tool results.

  Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): restrict microcompaction to user-initiated messages only

  Move microcompactHistory() inside the UserQuery/Cron guard so model latency during tool-call loops doesn't count as user idle time.

* docs: update settings docs for clearContextOnIdle config rename

  Replace stale `context.gapThresholdMinutes` entry with the new `context.clearContextOnIdle.*` settings group introduced in the microcompaction feature.

* fix(core): address review comments on microcompaction PR

  - Guard against NaN in toolResultsNumToKeep with Number.isFinite()
  - Report effective keepRecent (after Math.max) in meta, not raw config
  - Fix comment to mention cron messages alongside user messages

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

1 parent 34d560a commit 4daf7f9
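
The clearing pass described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `microcompact.ts` (which is not shown in this part of the diff): the `Message`/`Part` shapes, the `SENTINEL` text, the `clearOldToolResults` name, and the fallback to 5 when the keep count is not finite are all assumptions made for the example. Only the tool list, the per-part counting, the error-preservation rule, and the floor of 1 come from the commit message and settings docs.

```typescript
// Simplified, hypothetical shapes. The real history uses the SDK's content types.
interface ToolResultPart {
  functionResponse: {
    name: string;
    response: { output?: string; error?: string };
  };
}
interface TextPart {
  text: string;
}
type Part = ToolResultPart | TextPart;
interface Message {
  role: 'user' | 'model';
  parts: Part[];
}

// Tool names as listed in the commit message.
const COMPACTABLE_TOOLS = new Set<string>([
  'read_file', 'shell', 'grep', 'glob',
  'web_fetch', 'web_search', 'edit', 'write_file',
]);

// Placeholder text: the actual sentinel string is not shown in this diff.
const SENTINEL = '[old tool result content cleared]';

// Only successful outputs of compactable tools are candidates for clearing.
function isCompactable(part: Part): part is ToolResultPart {
  return (
    'functionResponse' in part &&
    COMPACTABLE_TOOLS.has(part.functionResponse.name) &&
    part.functionResponse.response.error === undefined &&
    part.functionResponse.response.output !== undefined
  );
}

function clearOldToolResults(
  history: Message[],
  keepRecent: number,
): { history: Message[]; cleared: number } {
  // NaN guard plus floor of 1; falling back to 5 is an assumption.
  const keep = Number.isFinite(keepRecent)
    ? Math.max(1, Math.floor(keepRecent))
    : 5;
  let seen = 0;
  let cleared = 0;
  // Walk newest to oldest, counting individual parts rather than messages,
  // so parallel tool results batched into one message are counted separately.
  const reversed = [...history].reverse().map((msg) => {
    const parts = [...msg.parts].reverse().map((part): Part => {
      if (!isCompactable(part)) return part;
      seen++;
      const output = part.functionResponse.response.output;
      if (seen <= keep || output === SENTINEL) return part;
      cleared++;
      return {
        functionResponse: {
          name: part.functionResponse.name,
          response: { output: SENTINEL },
        },
      };
    });
    return { ...msg, parts: parts.reverse() };
  });
  return { history: reversed.reverse(), cleared };
}
```

The function returns a new history array instead of mutating its input, which mirrors how the caller in `client.ts` only calls `setHistory` when something was actually cleared.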

9 files changed: +743 -40 lines changed


docs/users/configuration/settings.md

Lines changed: 13 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -210,17 +210,19 @@ The `extra_body` field allows you to add custom parameters to the request body s
210210

211211
#### context
212212

213-
| Setting | Type | Description | Default |
214-
| ------------------------------------------------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
215-
| `context.fileName` | string or array of strings | The name of the context file(s). | `undefined` |
216-
| `context.importFormat` | string | The format to use when importing memory. | `undefined` |
217-
| `context.includeDirectories` | array | Additional directories to include in the workspace context. Specifies an array of additional absolute or relative paths to include in the workspace context. Missing directories will be skipped with a warning by default. Paths can use `~` to refer to the user's home directory. This setting can be combined with the `--include-directories` command-line flag. | `[]` |
218-
| `context.loadFromIncludeDirectories` | boolean | Controls the behavior of the `/memory refresh` command. If set to `true`, `QWEN.md` files should be loaded from all directories that are added. If set to `false`, `QWEN.md` should only be loaded from the current directory. | `false` |
219-
| `context.fileFiltering.respectGitIgnore` | boolean | Respect .gitignore files when searching. | `true` |
220-
| `context.fileFiltering.respectQwenIgnore` | boolean | Respect .qwenignore files when searching. | `true` |
221-
| `context.fileFiltering.enableRecursiveFileSearch` | boolean | Whether to enable searching recursively for filenames under the current tree when completing `@` prefixes in the prompt. | `true` |
222-
| `context.fileFiltering.enableFuzzySearch` | boolean | When `true`, enables fuzzy search capabilities when searching for files. Set to `false` to improve performance on projects with a large number of files. | `true` |
223-
| `context.gapThresholdMinutes` | number | Minutes of inactivity after which retained thinking blocks are cleared to free context tokens. Aligns with typical provider prompt-cache TTL. Set higher if your provider has a longer cache TTL. | `5` |
213+
| Setting | Type | Description | Default |
214+
| -------------------------------------------------------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
215+
| `context.fileName` | string or array of strings | The name of the context file(s). | `undefined` |
216+
| `context.importFormat` | string | The format to use when importing memory. | `undefined` |
217+
| `context.includeDirectories` | array | Additional directories to include in the workspace context. Specifies an array of additional absolute or relative paths to include in the workspace context. Missing directories will be skipped with a warning by default. Paths can use `~` to refer to the user's home directory. This setting can be combined with the `--include-directories` command-line flag. | `[]` |
218+
| `context.loadFromIncludeDirectories` | boolean | Controls the behavior of the `/memory refresh` command. If set to `true`, `QWEN.md` files should be loaded from all directories that are added. If set to `false`, `QWEN.md` should only be loaded from the current directory. | `false` |
219+
| `context.fileFiltering.respectGitIgnore` | boolean | Respect .gitignore files when searching. | `true` |
220+
| `context.fileFiltering.respectQwenIgnore` | boolean | Respect .qwenignore files when searching. | `true` |
221+
| `context.fileFiltering.enableRecursiveFileSearch` | boolean | Whether to enable searching recursively for filenames under the current tree when completing `@` prefixes in the prompt. | `true` |
222+
| `context.fileFiltering.enableFuzzySearch` | boolean | When `true`, enables fuzzy search capabilities when searching for files. Set to `false` to improve performance on projects with a large number of files. | `true` |
223+
| `context.clearContextOnIdle.thinkingThresholdMinutes` | number | Minutes of inactivity before clearing old thinking blocks to free context tokens. Aligns with typical provider prompt-cache TTL. Use `-1` to disable. | `5` |
224+
| `context.clearContextOnIdle.toolResultsThresholdMinutes` | number | Minutes of inactivity before clearing old tool result content. Use `-1` to disable. | `60` |
225+
| `context.clearContextOnIdle.toolResultsNumToKeep` | number | Number of most-recent compactable tool results to preserve when clearing. Floor at 1. | `5` |
224226

225227
#### Troubleshooting File Search Performance
226228

packages/cli/src/config/config.ts

Lines changed: 1 addition & 1 deletion
@@ -1068,8 +1068,8 @@ export async function loadCliConfig(
     },
     telemetry: telemetrySettings,
     usageStatisticsEnabled: settings.privacy?.usageStatisticsEnabled ?? true,
+    clearContextOnIdle: settings.context?.clearContextOnIdle,
     fileFiltering: settings.context?.fileFiltering,
-    thinkingIdleThresholdMinutes: settings.context?.gapThresholdMinutes,
     checkpointing:
       argv.checkpointing || settings.general?.checkpointing?.enabled,
     proxy:

packages/cli/src/config/settingsSchema.ts

Lines changed: 42 additions & 10 deletions
@@ -886,6 +886,48 @@ const SETTINGS_SCHEMA = {
         description: 'Whether to load memory files from include directories.',
         showInDialog: false,
       },
+      clearContextOnIdle: {
+        type: 'object',
+        label: 'Clear Context On Idle',
+        category: 'Context',
+        requiresRestart: false,
+        default: {},
+        description:
+          'Settings for clearing stale context after idle periods. Use -1 to disable a threshold.',
+        showInDialog: false,
+        properties: {
+          thinkingThresholdMinutes: {
+            type: 'number',
+            label: 'Thinking Idle Threshold (minutes)',
+            category: 'Context',
+            requiresRestart: false,
+            default: 5 as number,
+            description:
+              'Minutes of inactivity before clearing old thinking blocks. Use -1 to disable.',
+            showInDialog: false,
+          },
+          toolResultsThresholdMinutes: {
+            type: 'number',
+            label: 'Tool Results Idle Threshold (minutes)',
+            category: 'Context',
+            requiresRestart: false,
+            default: 60 as number,
+            description:
+              'Minutes of inactivity before clearing old tool result content. Use -1 to disable.',
+            showInDialog: false,
+          },
+          toolResultsNumToKeep: {
+            type: 'number',
+            label: 'Tool Results Number To Keep',
+            category: 'Context',
+            requiresRestart: false,
+            default: 5 as number,
+            description:
+              'Number of most-recent compactable tool results to preserve when clearing. Floor at 1.',
+            showInDialog: false,
+          },
+        },
+      },
       fileFiltering: {
         type: 'object',
         label: 'File Filtering',
@@ -933,16 +975,6 @@ const SETTINGS_SCHEMA = {
           },
         },
       },
-      gapThresholdMinutes: {
-        type: 'number',
-        label: 'Thinking Block Idle Threshold (minutes)',
-        category: 'Context',
-        requiresRestart: false,
-        default: 5,
-        description:
-          'Minutes of inactivity after which retained thinking blocks are cleared to free context tokens. Aligns with provider prompt-cache TTL.',
-        showInDialog: false,
-      },
     },
   },

packages/core/src/config/config.ts

Lines changed: 25 additions & 7 deletions
@@ -209,6 +209,19 @@ export interface ChatCompressionSettings {
   contextPercentageThreshold?: number;
 }

+/**
+ * Settings for clearing stale context after idle periods.
+ * Threshold values of -1 mean "never clear" (disabled).
+ */
+export interface ClearContextOnIdleSettings {
+  /** Minutes idle before clearing old thinking blocks. Default 5. Use -1 to disable. */
+  thinkingThresholdMinutes?: number;
+  /** Minutes idle before clearing old tool results. Default 60. Use -1 to disable. */
+  toolResultsThresholdMinutes?: number;
+  /** Number of most-recent tool results to preserve. Default 5. */
+  toolResultsNumToKeep?: number;
+}
+
 export interface TelemetrySettings {
   enabled?: boolean;
   target?: TelemetryTarget;
@@ -371,8 +384,7 @@ export interface ConfigParameters {
   model?: string;
   outputLanguageFilePath?: string;
   maxSessionTurns?: number;
-  /** Minutes of inactivity before clearing retained thinking blocks. */
-  thinkingIdleThresholdMinutes?: number;
+  clearContextOnIdle?: ClearContextOnIdleSettings;
   sessionTokenLimit?: number;
   experimentalZedIntegration?: boolean;
   cronEnabled?: boolean;
@@ -561,7 +573,7 @@ export class Config {
   private ideMode: boolean;

   private readonly maxSessionTurns: number;
-  private readonly thinkingIdleThresholdMs: number;
+  private readonly clearContextOnIdle: ClearContextOnIdleSettings;
   private readonly sessionTokenLimit: number;
   private readonly listExtensions: boolean;
   private readonly overrideExtensions?: string[];
@@ -688,8 +700,14 @@ export class Config {
     this.fileDiscoveryService = params.fileDiscoveryService ?? null;
     this.bugCommand = params.bugCommand;
     this.maxSessionTurns = params.maxSessionTurns ?? -1;
-    this.thinkingIdleThresholdMs =
-      (params.thinkingIdleThresholdMinutes ?? 5) * 60 * 1000;
+    this.clearContextOnIdle = {
+      thinkingThresholdMinutes:
+        params.clearContextOnIdle?.thinkingThresholdMinutes ?? 5,
+      toolResultsThresholdMinutes:
+        params.clearContextOnIdle?.toolResultsThresholdMinutes ?? 60,
+      toolResultsNumToKeep:
+        params.clearContextOnIdle?.toolResultsNumToKeep ?? 5,
+    };
     this.sessionTokenLimit = params.sessionTokenLimit ?? -1;
     this.experimentalZedIntegration =
       params.experimentalZedIntegration ?? false;
@@ -1336,8 +1354,8 @@ export class Config {
     return this.maxSessionTurns;
   }

-  getThinkingIdleThresholdMs(): number {
-    return this.thinkingIdleThresholdMs;
+  getClearContextOnIdle(): ClearContextOnIdleSettings {
+    return this.clearContextOnIdle;
   }

   getSessionTokenLimit(): number {
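
To make the default handling above concrete, here is a small standalone sketch of how the three values resolve and how a threshold of `-1` reads as "disabled". The interface is copied from the diff; `resolveIdleSettings` and `thresholdToMs` are hypothetical helper names introduced only for this example, and returning `null` for a disabled threshold is an illustrative choice rather than something the commit does.

```typescript
// Copied from the diff above.
interface ClearContextOnIdleSettings {
  thinkingThresholdMinutes?: number;
  toolResultsThresholdMinutes?: number;
  toolResultsNumToKeep?: number;
}

// Hypothetical helper mirroring the constructor logic: `??` only replaces
// undefined/null, so explicit values such as -1 or 0 are passed through.
function resolveIdleSettings(
  user?: ClearContextOnIdleSettings,
): Required<ClearContextOnIdleSettings> {
  return {
    thinkingThresholdMinutes: user?.thinkingThresholdMinutes ?? 5,
    toolResultsThresholdMinutes: user?.toolResultsThresholdMinutes ?? 60,
    toolResultsNumToKeep: user?.toolResultsNumToKeep ?? 5,
  };
}

// Hypothetical helper: convert minutes to milliseconds, or null when the
// threshold is negative (the "-1 disables" convention from the settings docs).
function thresholdToMs(minutes: number): number | null {
  return minutes >= 0 ? minutes * 60 * 1000 : null;
}

// Example: disable tool result cleanup but keep the thinking default.
const resolved = resolveIdleSettings({ toolResultsThresholdMinutes: -1 });
console.log(resolved, thresholdToMs(resolved.thinkingThresholdMinutes));
```

Because the getter now returns minutes rather than milliseconds, each caller does its own unit conversion, as the `client.ts` change does with `thinkingThresholdMin * 60 * 1000`.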

packages/core/src/core/client.test.ts

Lines changed: 5 additions & 1 deletion
@@ -323,7 +323,11 @@ describe('Gemini Client (client.ts)', () => {
       getWorkingDir: vi.fn().mockReturnValue('/test/dir'),
       getFileService: vi.fn().mockReturnValue(fileService),
       getMaxSessionTurns: vi.fn().mockReturnValue(0),
-      getThinkingIdleThresholdMs: vi.fn().mockReturnValue(5 * 60 * 1000),
+      getClearContextOnIdle: vi.fn().mockReturnValue({
+        thinkingThresholdMinutes: 5,
+        toolResultsThresholdMinutes: 60,
+        toolResultsNumToKeep: 5,
+      }),
       getSessionTokenLimit: vi.fn().mockReturnValue(32000),
       getNoBrowser: vi.fn().mockReturnValue(false),
       getUsageStatisticsEnabled: vi.fn().mockReturnValue(true),

packages/core/src/core/client.ts

Lines changed: 26 additions & 5 deletions
@@ -16,6 +16,7 @@ import type {
 // Config
 import { ApprovalMode, type Config } from '../config/config.js';
 import { createDebugLogger } from '../utils/debugLogger.js';
+import { microcompactHistory } from '../services/microcompaction/microcompact.js';

 const debugLogger = createDebugLogger('CLIENT');

@@ -561,15 +562,16 @@ export class GeminiClient {
       // record user message for session management
       this.config.getChatRecordingService()?.recordUserMessage(request);

-      // Thinking block cross-turn retention with idle cleanup:
-      // - Active session (< threshold idle): keep thinking blocks for reasoning coherence
-      // - Idle > threshold: clear old thinking, keep only last 1 turn to free context
-      // - Latch: once triggered, never revert — prevents oscillation
+      // Idle cleanup: clear stale thinking blocks after idle period.
+      // Latch: once triggered, never revert — prevents oscillation.
+      const idleConfig = this.config.getClearContextOnIdle();
+      const thinkingThresholdMin = idleConfig.thinkingThresholdMinutes ?? 5;
       if (
+        thinkingThresholdMin >= 0 &&
         !this.thinkingClearLatched &&
         this.lastApiCompletionTimestamp !== null
       ) {
-        const thresholdMs = this.config.getThinkingIdleThresholdMs();
+        const thresholdMs = thinkingThresholdMin * 60 * 1000;
         const idleMs = Date.now() - this.lastApiCompletionTimestamp;
         if (idleMs > thresholdMs) {
           this.thinkingClearLatched = true;
@@ -582,6 +584,25 @@ export class GeminiClient {
         this.getChat().stripThoughtsFromHistoryKeepRecent(1);
         debugLogger.debug('Stripped old thinking blocks (keeping last 1 turn)');
       }
+
+      // Idle cleanup: clear old tool results when idle > threshold.
+      // Runs on user and cron messages (not tool result submissions or
+      // retries/hooks) so that model latency during a tool-call loop
+      // doesn't count as user idle time.
+      const mcResult = microcompactHistory(
+        this.getChat().getHistory(),
+        this.lastApiCompletionTimestamp,
+        this.config.getClearContextOnIdle(),
+      );
+      if (mcResult.meta) {
+        this.getChat().setHistory(mcResult.history);
+        const m = mcResult.meta;
+        debugLogger.debug(
+          `[TIME-BASED MC] gap ${m.gapMinutes}min > ${m.thresholdMinutes}min, ` +
+            `cleared ${m.toolsCleared} tool results (~${m.tokensSaved} tokens), ` +
+            `kept last ${m.toolsKept}`,
+        );
+      }
     }
     if (messageType !== SendMessageType.Retry) {
       this.sessionTurnCount++;
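
The thinking-block latch in the second hunk can be isolated into a small sketch. `IdleThinkingLatch` and `shouldStrip` are hypothetical names for this example; in the commit the state lives directly on `GeminiClient` as `thinkingClearLatched` and `lastApiCompletionTimestamp`. The lines between the two hunks are not shown on this page, so the assumption here is that stripping is applied whenever the latch is set.

```typescript
// Hypothetical standalone version of the latch logic shown in the diff.
class IdleThinkingLatch {
  private latched = false;
  private readonly thresholdMinutes: number;

  constructor(thresholdMinutes: number) {
    this.thresholdMinutes = thresholdMinutes;
  }

  // Returns true when old thinking blocks should be stripped on this turn.
  // lastCompletionMs is null until the first API call has completed.
  shouldStrip(lastCompletionMs: number | null, nowMs: number): boolean {
    if (
      this.thresholdMinutes >= 0 && // a negative threshold disables the cleanup
      !this.latched &&
      lastCompletionMs !== null
    ) {
      const thresholdMs = this.thresholdMinutes * 60 * 1000;
      const idleMs = nowMs - lastCompletionMs;
      if (idleMs > thresholdMs) {
        // Once set, the latch is never reset, which avoids oscillating
        // between keeping and stripping thinking blocks.
        this.latched = true;
      }
    }
    return this.latched;
  }
}

// Usage: a 5-minute threshold, checked with explicit timestamps.
const latch = new IdleThinkingLatch(5);
console.log(latch.shouldStrip(0, 4 * 60 * 1000)); // below threshold
console.log(latch.shouldStrip(0, 6 * 60 * 1000)); // above threshold, latch sets
```

Passing the current time in as a parameter, rather than calling `Date.now()` inside, is a choice made here to keep the sketch easy to test; the code in the diff reads `Date.now()` directly.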
