
Add smallFastModel configuration for lightweight tasks #2791

@tanzhenxin

What would you like to be added?

A configuration option to specify a "small/fast" model (e.g., smallFastModel or flashModel) that Qwen Code can automatically use for lightweight, low-stakes tasks, while keeping the main model for complex reasoning and code generation.

The idea is to use a smaller, faster model for structured, low-complexity tasks like:

  • Generating commit messages or branch names
  • Creating short titles/summaries for sessions
  • Simple parsing or extraction tasks
  • Quick verification or classification
  • Tool call routing decisions

Why is this needed?

Speed and cost efficiency: Many internal operations don't require the full capability of a large model. Using a smaller, faster model for these tasks would:

  • Reduce latency for simple operations (sub-second responses vs. waiting for a large model)
  • Lower token costs for routine tasks
  • Improve overall user experience by not bottlenecking simple operations on a heavy model

Established industry pattern: Use a small, fast model for lightweight, structured, low-stakes tasks where speed and cost matter more than raw capability, and reserve the main model for complex reasoning, code generation, and multi-step problem solving.

Additional context

Current state: Qwen Code uses a single model for all tasks. There's no concept of automatic model routing based on task type. The closest existing mechanisms are:

  • Subagent model selection (manual per-agent configuration)
  • The /model command (manual user switching)

Proposed behavior:

  1. Add an optional smallFastModel setting in settings.json (if it is not set, fall back to the main model)
  2. Internally route lightweight tasks (titles, summaries, parsing, verification) to the small fast model
  3. Keep the main model for conversation, code generation, and complex reasoning
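
The three steps above could be sketched roughly as follows. This is only an illustration of the proposed routing, not existing Qwen Code code: the `TaskKind` names, the `ROUTES` table, and `selectModel` are all hypothetical, and the split between "small" and "main" tasks simply mirrors the lists in this issue.

```typescript
// Hypothetical task categories; these names are illustrative only.
type TaskKind =
  | "conversation"
  | "codeGeneration"
  | "complexReasoning"
  | "title"
  | "summary"
  | "parsing"
  | "verification";

// Mirrors the proposed settings.json shape: smallFastModel is optional.
interface ModelSettings {
  model: string;
  smallFastModel?: string;
}

// Routing table: which tier each task kind prefers.
// Record<TaskKind, ...> makes the compiler flag any task kind left unrouted.
const ROUTES: Record<TaskKind, "main" | "small"> = {
  conversation: "main",
  codeGeneration: "main",
  complexReasoning: "main",
  title: "small",
  summary: "small",
  parsing: "small",
  verification: "small",
};

// Returns the model name to use for a task.
// Falls back to the main model when smallFastModel is not configured.
function selectModel(settings: ModelSettings, task: TaskKind): string {
  if (ROUTES[task] === "small" && settings.smallFastModel) {
    return settings.smallFastModel;
  }
  return settings.model;
}
```

With the example configuration below, `selectModel(settings, "title")` would return `"qwen-turbo"` and `selectModel(settings, "codeGeneration")` would return `"qwen3.5-plus"`. Keeping the routing in a single table means existing users who never set smallFastModel see no behavior change.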

Example configuration:

{
  "model": "qwen3.5-plus",
  "smallFastModel": "qwen-turbo"
}

Or via environment variable:

export QWEN_SMALL_FAST_MODEL=qwen-turbo
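
A minimal sketch of how the two configuration sources might be resolved. The precedence is an assumption, since this issue does not specify one: here the environment variable wins over settings.json, and the main model is the final fallback. `resolveSmallFastModel` and the `Settings` interface are hypothetical names.

```typescript
// Mirrors the proposed settings.json shape.
interface Settings {
  model: string;
  smallFastModel?: string;
}

// Resolves the model to use for lightweight tasks.
// Assumed precedence (not defined in the proposal):
//   1. QWEN_SMALL_FAST_MODEL environment variable
//   2. smallFastModel in settings.json
//   3. the main model
// The environment is passed in as a plain map so the function is easy to test.
function resolveSmallFastModel(
  settings: Settings,
  env: Record<string, string | undefined>,
): string {
  // Treat empty or whitespace-only values as "not set".
  const fromEnv = env["QWEN_SMALL_FAST_MODEL"]?.trim();
  if (fromEnv) {
    return fromEnv;
  }
  const fromSettings = settings.smallFastModel?.trim();
  if (fromSettings) {
    return fromSettings;
  }
  return settings.model;
}
```

In a Node.js process this would be called as `resolveSmallFastModel(settings, process.env)`. If the project prefers settings.json to take priority over the environment, the first two checks can simply be swapped.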
