feat(core): add microcompaction for idle context cleanup (#3006)

tanzhenxin · qwencoder · web-flow · commit 4daf7f935372 · 2026-04-13T18:51:35.000+08:00
* feat(core): add microcompaction for idle context cleanup

Clear old tool result content from chat history when the user returns
after an idle period (default 60 min). Replaces functionResponse output
with a sentinel string for compactable tools (read_file, shell, grep,
glob, web_fetch, web_search, edit, write_file), keeping the N most
recent results intact (default 5). Runs before full compression so it
can shed tokens cheaply without an API call.

- Time-based trigger reuses lastApiCompletionTimestamp from thinking cleanup
- Per-part counting so keepRecent applies to individual tool results
  even when batched in parallel
- Preserves tool error responses (only clears successful outputs)
- Configurable via settings.json (context.microcompaction) with env var
  overrides for E2E testing
- Enabled by default
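
The behaviour described above can be sketched roughly as follows. This is a
minimal illustration, not the contents of microcompact.ts (which this view does
not show): the type shapes, the sentinel text, and the name `sketchMicrocompact`
are assumptions; only the tool names, the per-part counting, the error
preservation, and the floor of 1 come from this commit.

```typescript
// Hypothetical, simplified history shapes (the real types are not shown in this commit).
interface FunctionResponsePart {
  functionResponse: { name: string; response: { output?: string; error?: string } };
}
interface TextPart { text: string }
type Part = FunctionResponsePart | TextPart;
interface Message { role: 'user' | 'model'; parts: Part[] }

// Tool names as listed in the commit message.
const COMPACTABLE_TOOLS = new Set([
  'read_file', 'shell', 'grep', 'glob', 'web_fetch', 'web_search', 'edit', 'write_file',
]);
// Assumed sentinel text; the actual string is not given here.
const CLEARED_SENTINEL = '[Old tool result content cleared]';

// A part is compactable only if it is a successful result of a compactable tool.
function isCompactable(part: Part): part is FunctionResponsePart {
  return (
    'functionResponse' in part &&
    COMPACTABLE_TOOLS.has(part.functionResponse.name) &&
    part.functionResponse.response.error === undefined &&
    part.functionResponse.response.output !== undefined
  );
}

function sketchMicrocompact(
  history: Message[],
  lastApiCompletionTimestamp: number | null,
  thresholdMinutes: number,
  keepRecent: number,
  now: number = Date.now(),
): { history: Message[]; cleared: number } {
  // Disabled (negative threshold) or no prior API call: leave history untouched.
  if (thresholdMinutes < 0 || lastApiCompletionTimestamp === null) {
    return { history, cleared: 0 };
  }
  // Not idle for longer than the threshold: leave history untouched.
  if (now - lastApiCompletionTimestamp <= thresholdMinutes * 60_000) {
    return { history, cleared: 0 };
  }
  // Count per part, so results batched in one message are counted individually.
  const total = history.reduce((n, m) => n + m.parts.filter(isCompactable).length, 0);
  let toClear = Math.max(0, total - Math.max(1, keepRecent));
  const cleared = toClear;
  // History is ordered oldest first, so clearing the first `toClear`
  // compactable parts keeps the most recent ones intact.
  const next: Message[] = history.map((m) => ({
    ...m,
    parts: m.parts.map((p): Part => {
      if (toClear === 0 || !isCompactable(p)) return p;
      toClear--;
      return {
        functionResponse: {
          name: p.functionResponse.name,
          response: { output: CLEARED_SENTINEL },
        },
      };
    }),
  }));
  return { history: next, cleared };
}
```

No model call is involved: the pass only rewrites entries already in memory,
which is why it can run ahead of full compression.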

* refactor(config): unify idle cleanup settings under clearContextOnIdle

Consolidate thinking block cleanup and tool results microcompaction
config into a single `context.clearContextOnIdle` settings group:

  {
    "context": {
      "clearContextOnIdle": {
        "thinkingThresholdMinutes": 5,
        "toolResultsThresholdMinutes": 60,
        "toolResultsNumToKeep": 5
      }
    }
  }

- Use -1 on either threshold to disable that cleanup (there is no separate
  `enabled` boolean)
- Remove separate `microcompaction` and `gapThresholdMinutes` settings
- Thinking cleanup: 5 min default (unchanged)
- Tool results cleanup: 60 min default
- Preserve tool error responses (only clear successful outputs)
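
As a rough sketch of how the defaults and the -1 convention listed above might
be resolved: the interface mirrors `ClearContextOnIdleSettings` from this diff,
while `thresholdToMs` is a hypothetical helper that does not appear in the
commit. The `>= 0` check follows the thinking-cleanup guard in client.ts, so any
negative value is treated as disabled, not only -1.

```typescript
// Mirrors the interface added to packages/core/src/config/config.ts.
interface ClearContextOnIdleSettings {
  thinkingThresholdMinutes?: number;
  toolResultsThresholdMinutes?: number;
  toolResultsNumToKeep?: number;
}

// Hypothetical helper: returns the threshold in milliseconds,
// or null when that cleanup is disabled.
function thresholdToMs(
  minutes: number | undefined,
  defaultMinutes: number,
): number | null {
  const value = minutes ?? defaultMinutes;
  return value >= 0 ? value * 60 * 1000 : null;
}

// Example: thinking cleanup disabled, tool results cleanup left at its default.
const example: ClearContextOnIdleSettings = { thinkingThresholdMinutes: -1 };
const thinkingMs = thresholdToMs(example.thinkingThresholdMinutes, 5);
const toolResultsMs = thresholdToMs(example.toolResultsThresholdMinutes, 60);
console.log(thinkingMs, toolResultsMs);
```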

* feat(vscode-ide-companion): add clearContextOnIdle settings configuration

- Add clearContextOnIdle settings for the thinking blocks threshold, tool results threshold, and retention count
- Remove deprecated gapThresholdMinutes from root settings level

This reorganizes the context clearing settings into a dedicated clearContextOnIdle object with configurable thresholds for thinking blocks and tool results.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): restrict microcompaction to user-initiated messages only

Move microcompactHistory() inside the UserQuery/Cron guard so model
latency during tool-call loops doesn't count as user idle time.
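
The guard can be pictured with a small sketch. `MessageKind` and
`shouldRunIdleCleanup` are hypothetical names for illustration; the commit only
names the UserQuery, Cron, and Retry message types, and `ToolResult` is an
assumed label for tool result submissions.

```typescript
// Hypothetical message kinds; 'ToolResult' is an assumed name.
type MessageKind = 'UserQuery' | 'Cron' | 'ToolResult' | 'Retry';

// Idle cleanup is evaluated only for messages that start a new turn.
// Tool result submissions and retries happen mid-turn, so the time
// since the last API completion there reflects model or tool latency,
// not the user being away.
function shouldRunIdleCleanup(kind: MessageKind): boolean {
  return kind === 'UserQuery' || kind === 'Cron';
}

const kinds: MessageKind[] = ['UserQuery', 'ToolResult', 'Retry', 'Cron'];
const decisions = kinds.map((k) => `${k}=${shouldRunIdleCleanup(k)}`);
console.log(decisions.join(' '));
```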

* docs: update settings docs for clearContextOnIdle config rename

Replace stale `context.gapThresholdMinutes` entry with the new
`context.clearContextOnIdle.*` settings group introduced in the
microcompaction feature.

* fix(core): address review comments on microcompaction PR

- Guard against NaN in toolResultsNumToKeep with Number.isFinite()
- Report effective keepRecent (after Math.max) in meta, not raw config
- Fix comment to mention cron messages alongside user messages
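
A minimal sketch of the first two review fixes, under stated assumptions:
`effectiveKeepRecent` is a hypothetical name, falling back to the default of 5
on a non-finite value is assumed (the commit does not say what the fallback is),
and the `Math.floor` is an addition for the sketch only.

```typescript
const DEFAULT_KEEP_RECENT = 5;

// Hypothetical helper: derives the keepRecent value that is both
// applied and reported in the result metadata.
function effectiveKeepRecent(raw: unknown): number {
  // Number.isFinite rejects NaN and +/-Infinity without coercing its argument.
  const n =
    typeof raw === 'number' && Number.isFinite(raw) ? raw : DEFAULT_KEEP_RECENT;
  // Floor at 1, as documented for toolResultsNumToKeep.
  return Math.max(1, Math.floor(n));
}

console.log(effectiveKeepRecent(NaN), effectiveKeepRecent(0), effectiveKeepRecent(3));
```

Reporting the value returned here, not the raw configuration value, keeps the
debug log consistent with what was actually kept.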

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
diff --git a/docs/users/configuration/settings.md b/docs/users/configuration/settings.md
@@ -210,17 +210,19 @@ The `extra_body` field allows you to add custom parameters to the request body s
 
 #### context
 
-| Setting                                           | Type                       | Description                                                                                                                                                                                                                                                                                                                                                           | Default     |
-| ------------------------------------------------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
-| `context.fileName`                                | string or array of strings | The name of the context file(s).                                                                                                                                                                                                                                                                                                                                      | `undefined` |
-| `context.importFormat`                            | string                     | The format to use when importing memory.                                                                                                                                                                                                                                                                                                                              | `undefined` |
-| `context.includeDirectories`                      | array                      | Additional directories to include in the workspace context. Specifies an array of additional absolute or relative paths to include in the workspace context. Missing directories will be skipped with a warning by default. Paths can use `~` to refer to the user's home directory. This setting can be combined with the `--include-directories` command-line flag. | `[]`        |
-| `context.loadFromIncludeDirectories`              | boolean                    | Controls the behavior of the `/memory refresh` command. If set to `true`, `QWEN.md` files should be loaded from all directories that are added. If set to `false`, `QWEN.md` should only be loaded from the current directory.                                                                                                                                        | `false`     |
-| `context.fileFiltering.respectGitIgnore`          | boolean                    | Respect .gitignore files when searching.                                                                                                                                                                                                                                                                                                                              | `true`      |
-| `context.fileFiltering.respectQwenIgnore`         | boolean                    | Respect .qwenignore files when searching.                                                                                                                                                                                                                                                                                                                             | `true`      |
-| `context.fileFiltering.enableRecursiveFileSearch` | boolean                    | Whether to enable searching recursively for filenames under the current tree when completing `@` prefixes in the prompt.                                                                                                                                                                                                                                              | `true`      |
-| `context.fileFiltering.enableFuzzySearch`         | boolean                    | When `true`, enables fuzzy search capabilities when searching for files. Set to `false` to improve performance on projects with a large number of files.                                                                                                                                                                                                              | `true`      |
-| `context.gapThresholdMinutes`                     | number                     | Minutes of inactivity after which retained thinking blocks are cleared to free context tokens. Aligns with typical provider prompt-cache TTL. Set higher if your provider has a longer cache TTL.                                                                                                                                                                     | `5`         |
+| Setting                                                  | Type                       | Description                                                                                                                                                                                                                                                                                                                                                           | Default     |
+| -------------------------------------------------------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
+| `context.fileName`                                       | string or array of strings | The name of the context file(s).                                                                                                                                                                                                                                                                                                                                      | `undefined` |
+| `context.importFormat`                                   | string                     | The format to use when importing memory.                                                                                                                                                                                                                                                                                                                              | `undefined` |
+| `context.includeDirectories`                             | array                      | Additional directories to include in the workspace context. Specifies an array of additional absolute or relative paths to include in the workspace context. Missing directories will be skipped with a warning by default. Paths can use `~` to refer to the user's home directory. This setting can be combined with the `--include-directories` command-line flag. | `[]`        |
+| `context.loadFromIncludeDirectories`                     | boolean                    | Controls the behavior of the `/memory refresh` command. If set to `true`, `QWEN.md` files should be loaded from all directories that are added. If set to `false`, `QWEN.md` should only be loaded from the current directory.                                                                                                                                        | `false`     |
+| `context.fileFiltering.respectGitIgnore`                 | boolean                    | Respect .gitignore files when searching.                                                                                                                                                                                                                                                                                                                              | `true`      |
+| `context.fileFiltering.respectQwenIgnore`                | boolean                    | Respect .qwenignore files when searching.                                                                                                                                                                                                                                                                                                                             | `true`      |
+| `context.fileFiltering.enableRecursiveFileSearch`        | boolean                    | Whether to enable searching recursively for filenames under the current tree when completing `@` prefixes in the prompt.                                                                                                                                                                                                                                              | `true`      |
+| `context.fileFiltering.enableFuzzySearch`                | boolean                    | When `true`, enables fuzzy search capabilities when searching for files. Set to `false` to improve performance on projects with a large number of files.                                                                                                                                                                                                              | `true`      |
+| `context.clearContextOnIdle.thinkingThresholdMinutes`    | number                     | Minutes of inactivity before clearing old thinking blocks to free context tokens. Aligns with typical provider prompt-cache TTL. Use `-1` to disable.                                                                                                                                                                                                                 | `5`         |
+| `context.clearContextOnIdle.toolResultsThresholdMinutes` | number                     | Minutes of inactivity before clearing old tool result content. Use `-1` to disable.                                                                                                                                                                                                                                                                                   | `60`        |
+| `context.clearContextOnIdle.toolResultsNumToKeep`        | number                     | Number of most-recent compactable tool results to preserve when clearing. Floor at 1.                                                                                                                                                                                                                                                                                 | `5`         |
 
 #### Troubleshooting File Search Performance
 
diff --git a/packages/cli/src/config/config.ts b/packages/cli/src/config/config.ts
@@ -1068,8 +1068,8 @@ export async function loadCliConfig(
     },
     telemetry: telemetrySettings,
     usageStatisticsEnabled: settings.privacy?.usageStatisticsEnabled ?? true,
+    clearContextOnIdle: settings.context?.clearContextOnIdle,
     fileFiltering: settings.context?.fileFiltering,
-    thinkingIdleThresholdMinutes: settings.context?.gapThresholdMinutes,
     checkpointing:
       argv.checkpointing || settings.general?.checkpointing?.enabled,
     proxy:
diff --git a/packages/cli/src/config/settingsSchema.ts b/packages/cli/src/config/settingsSchema.ts
@@ -886,6 +886,48 @@ const SETTINGS_SCHEMA = {
         description: 'Whether to load memory files from include directories.',
         showInDialog: false,
       },
+      clearContextOnIdle: {
+        type: 'object',
+        label: 'Clear Context On Idle',
+        category: 'Context',
+        requiresRestart: false,
+        default: {},
+        description:
+          'Settings for clearing stale context after idle periods. Use -1 to disable a threshold.',
+        showInDialog: false,
+        properties: {
+          thinkingThresholdMinutes: {
+            type: 'number',
+            label: 'Thinking Idle Threshold (minutes)',
+            category: 'Context',
+            requiresRestart: false,
+            default: 5 as number,
+            description:
+              'Minutes of inactivity before clearing old thinking blocks. Use -1 to disable.',
+            showInDialog: false,
+          },
+          toolResultsThresholdMinutes: {
+            type: 'number',
+            label: 'Tool Results Idle Threshold (minutes)',
+            category: 'Context',
+            requiresRestart: false,
+            default: 60 as number,
+            description:
+              'Minutes of inactivity before clearing old tool result content. Use -1 to disable.',
+            showInDialog: false,
+          },
+          toolResultsNumToKeep: {
+            type: 'number',
+            label: 'Tool Results Number To Keep',
+            category: 'Context',
+            requiresRestart: false,
+            default: 5 as number,
+            description:
+              'Number of most-recent compactable tool results to preserve when clearing. Floor at 1.',
+            showInDialog: false,
+          },
+        },
+      },
       fileFiltering: {
         type: 'object',
         label: 'File Filtering',
@@ -933,16 +975,6 @@ const SETTINGS_SCHEMA = {
           },
         },
       },
-      gapThresholdMinutes: {
-        type: 'number',
-        label: 'Thinking Block Idle Threshold (minutes)',
-        category: 'Context',
-        requiresRestart: false,
-        default: 5,
-        description:
-          'Minutes of inactivity after which retained thinking blocks are cleared to free context tokens. Aligns with provider prompt-cache TTL.',
-        showInDialog: false,
-      },
     },
   },
 
diff --git a/packages/core/src/config/config.ts b/packages/core/src/config/config.ts
@@ -209,6 +209,19 @@ export interface ChatCompressionSettings {
   contextPercentageThreshold?: number;
 }
 
+/**
+ * Settings for clearing stale context after idle periods.
+ * Threshold values of -1 mean "never clear" (disabled).
+ */
+export interface ClearContextOnIdleSettings {
+  /** Minutes idle before clearing old thinking blocks. Default 5. Use -1 to disable. */
+  thinkingThresholdMinutes?: number;
+  /** Minutes idle before clearing old tool results. Default 60. Use -1 to disable. */
+  toolResultsThresholdMinutes?: number;
+  /** Number of most-recent tool results to preserve. Default 5. */
+  toolResultsNumToKeep?: number;
+}
+
 export interface TelemetrySettings {
   enabled?: boolean;
   target?: TelemetryTarget;
@@ -371,8 +384,7 @@ export interface ConfigParameters {
   model?: string;
   outputLanguageFilePath?: string;
   maxSessionTurns?: number;
-  /** Minutes of inactivity before clearing retained thinking blocks. */
-  thinkingIdleThresholdMinutes?: number;
+  clearContextOnIdle?: ClearContextOnIdleSettings;
   sessionTokenLimit?: number;
   experimentalZedIntegration?: boolean;
   cronEnabled?: boolean;
@@ -561,7 +573,7 @@ export class Config {
   private ideMode: boolean;
 
   private readonly maxSessionTurns: number;
-  private readonly thinkingIdleThresholdMs: number;
+  private readonly clearContextOnIdle: ClearContextOnIdleSettings;
   private readonly sessionTokenLimit: number;
   private readonly listExtensions: boolean;
   private readonly overrideExtensions?: string[];
@@ -688,8 +700,14 @@ export class Config {
     this.fileDiscoveryService = params.fileDiscoveryService ?? null;
     this.bugCommand = params.bugCommand;
     this.maxSessionTurns = params.maxSessionTurns ?? -1;
-    this.thinkingIdleThresholdMs =
-      (params.thinkingIdleThresholdMinutes ?? 5) * 60 * 1000;
+    this.clearContextOnIdle = {
+      thinkingThresholdMinutes:
+        params.clearContextOnIdle?.thinkingThresholdMinutes ?? 5,
+      toolResultsThresholdMinutes:
+        params.clearContextOnIdle?.toolResultsThresholdMinutes ?? 60,
+      toolResultsNumToKeep:
+        params.clearContextOnIdle?.toolResultsNumToKeep ?? 5,
+    };
     this.sessionTokenLimit = params.sessionTokenLimit ?? -1;
     this.experimentalZedIntegration =
       params.experimentalZedIntegration ?? false;
@@ -1336,8 +1354,8 @@ export class Config {
     return this.maxSessionTurns;
   }
 
-  getThinkingIdleThresholdMs(): number {
-    return this.thinkingIdleThresholdMs;
+  getClearContextOnIdle(): ClearContextOnIdleSettings {
+    return this.clearContextOnIdle;
   }
 
   getSessionTokenLimit(): number {
diff --git a/packages/core/src/core/client.test.ts b/packages/core/src/core/client.test.ts
@@ -323,7 +323,11 @@ describe('Gemini Client (client.ts)', () => {
       getWorkingDir: vi.fn().mockReturnValue('/test/dir'),
       getFileService: vi.fn().mockReturnValue(fileService),
       getMaxSessionTurns: vi.fn().mockReturnValue(0),
-      getThinkingIdleThresholdMs: vi.fn().mockReturnValue(5 * 60 * 1000),
+      getClearContextOnIdle: vi.fn().mockReturnValue({
+        thinkingThresholdMinutes: 5,
+        toolResultsThresholdMinutes: 60,
+        toolResultsNumToKeep: 5,
+      }),
       getSessionTokenLimit: vi.fn().mockReturnValue(32000),
       getNoBrowser: vi.fn().mockReturnValue(false),
       getUsageStatisticsEnabled: vi.fn().mockReturnValue(true),
diff --git a/packages/core/src/core/client.ts b/packages/core/src/core/client.ts
@@ -16,6 +16,7 @@ import type {
 // Config
 import { ApprovalMode, type Config } from '../config/config.js';
 import { createDebugLogger } from '../utils/debugLogger.js';
+import { microcompactHistory } from '../services/microcompaction/microcompact.js';
 
 const debugLogger = createDebugLogger('CLIENT');
 
@@ -561,15 +562,16 @@ export class GeminiClient {
       // record user message for session management
       this.config.getChatRecordingService()?.recordUserMessage(request);
 
-      // Thinking block cross-turn retention with idle cleanup:
-      // - Active session (< threshold idle): keep thinking blocks for reasoning coherence
-      // - Idle > threshold: clear old thinking, keep only last 1 turn to free context
-      // - Latch: once triggered, never revert — prevents oscillation
+      // Idle cleanup: clear stale thinking blocks after idle period.
+      // Latch: once triggered, never revert — prevents oscillation.
+      const idleConfig = this.config.getClearContextOnIdle();
+      const thinkingThresholdMin = idleConfig.thinkingThresholdMinutes ?? 5;
       if (
+        thinkingThresholdMin >= 0 &&
         !this.thinkingClearLatched &&
         this.lastApiCompletionTimestamp !== null
       ) {
-        const thresholdMs = this.config.getThinkingIdleThresholdMs();
+        const thresholdMs = thinkingThresholdMin * 60 * 1000;
         const idleMs = Date.now() - this.lastApiCompletionTimestamp;
         if (idleMs > thresholdMs) {
           this.thinkingClearLatched = true;
@@ -582,6 +584,25 @@ export class GeminiClient {
         this.getChat().stripThoughtsFromHistoryKeepRecent(1);
         debugLogger.debug('Stripped old thinking blocks (keeping last 1 turn)');
       }
+
+      // Idle cleanup: clear old tool results when idle > threshold.
+      // Runs on user and cron messages (not tool result submissions or
+      // retries/hooks) so that model latency during a tool-call loop
+      // doesn't count as user idle time.
+      const mcResult = microcompactHistory(
+        this.getChat().getHistory(),
+        this.lastApiCompletionTimestamp,
+        this.config.getClearContextOnIdle(),
+      );
+      if (mcResult.meta) {
+        this.getChat().setHistory(mcResult.history);
+        const m = mcResult.meta;
+        debugLogger.debug(
+          `[TIME-BASED MC] gap ${m.gapMinutes}min > ${m.thresholdMinutes}min, ` +
+            `cleared ${m.toolsCleared} tool results (~${m.tokensSaved} tokens), ` +
+            `kept last ${m.toolsKept}`,
+        );
+      }
     }
     if (messageType !== SendMessageType.Retry) {
       this.sessionTurnCount++;
diff --git a/packages/core/src/services/microcompaction/microcompact.test.ts b/packages/core/src/services/microcompaction/microcompact.test.ts
diff --git a/packages/core/src/services/microcompaction/microcompact.ts b/packages/core/src/services/microcompaction/microcompact.ts
diff --git a/packages/vscode-ide-companion/schemas/settings.schema.json b/packages/vscode-ide-companion/schemas/settings.schema.json