bug: search-models.ts --modality image returns vision/input-image models, not just image-generation models #14

@perry-the-pr-reviewer

Bug

search-models.ts --modality <modality> matches a model when the requested modality appears in either input_modalities or output_modalities, because the filter merges both lists before checking:

if (modality) {
  const lowerModality = modality.toLowerCase();
  models = models.filter((m: any) => {
    const inputMods: string[] = m.architecture?.input_modalities ?? [];
    const outputMods: string[] = m.architecture?.output_modalities ?? [];
    return [...inputMods, ...outputMods]
      .map((mod: string) => mod.toLowerCase())
      .includes(lowerModality);
  });
}

So --modality image returns:

  • Models that accept image input (Claude, GPT-4o with vision, etc.)
  • Models that generate image output (Gemini Flash Image, etc.)

These are completely different capabilities. The openrouter-images skill (image generation) points agents to search-models.ts --modality image to find image generation models, but the result set also includes every vision model that only accepts image input, not just generators.
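A small reproduction makes the overlap concrete. This is a sketch only: the three model records and their ids are hypothetical stand-ins, not real API output, and currentFilter is an illustrative wrapper around the same logic quoted above.

```typescript
// Hypothetical records; ids and modality lists are illustrative, not real API output.
interface ModelRecord {
  id: string;
  architecture?: {
    input_modalities?: string[];
    output_modalities?: string[];
  };
}

const sampleModels: ModelRecord[] = [
  {
    id: "example/vision-llm",
    architecture: { input_modalities: ["text", "image"], output_modalities: ["text"] },
  },
  {
    id: "example/image-generator",
    architecture: { input_modalities: ["text"], output_modalities: ["image"] },
  },
  {
    id: "example/text-only",
    architecture: { input_modalities: ["text"], output_modalities: ["text"] },
  },
];

// Same logic as the filter in the issue: input and output lists are merged
// before the membership check, so a match on either side is accepted.
function currentFilter(models: ModelRecord[], modality: string): ModelRecord[] {
  const lowerModality = modality.toLowerCase();
  return models.filter((m) => {
    const inputMods = m.architecture?.input_modalities ?? [];
    const outputMods = m.architecture?.output_modalities ?? [];
    return [...inputMods, ...outputMods]
      .map((mod) => mod.toLowerCase())
      .includes(lowerModality);
  });
}

const matchedIds = currentFilter(sampleModels, "image").map((m) => m.id);
console.log(matchedIds); // ["example/vision-llm", "example/image-generator"]
```

The vision-only model is returned alongside the generator even though it cannot produce images.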

Impact

An agent following the openrouter-images decision tree to find an image generation model will get a list polluted with vision LLMs that can't generate images at all. It may pick the wrong model and produce a confusing error.

Fix

Split the filter into --input-modality and --output-modality flags, or add a --generates-image shorthand that checks only output_modalities. Update the decision tree in openrouter-models/SKILL.md and the cross-reference in openrouter-images/SKILL.md accordingly.
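One possible shape for the split filter is sketched below. The names ModalityFilter, hasModality, and filterByModality are hypothetical and not part of search-models.ts today; CLI flag parsing is left out, and the sketch assumes each flag's value is passed in as an optional string.

```typescript
interface Model {
  id: string;
  architecture?: {
    input_modalities?: string[];
    output_modalities?: string[];
  };
}

// Hypothetical options object: one field per proposed flag.
interface ModalityFilter {
  inputModality?: string;  // value of the proposed --input-modality flag
  outputModality?: string; // value of the proposed --output-modality flag
}

// Case-insensitive membership check on a single modality list.
function hasModality(mods: string[] | undefined, wanted: string): boolean {
  const lower = wanted.toLowerCase();
  return (mods ?? []).some((mod) => mod.toLowerCase() === lower);
}

// Each side is checked independently; a constraint that is not set is skipped.
function filterByModality(models: Model[], filter: ModalityFilter): Model[] {
  return models.filter((m) => {
    if (
      filter.inputModality &&
      !hasModality(m.architecture?.input_modalities, filter.inputModality)
    ) {
      return false;
    }
    if (
      filter.outputModality &&
      !hasModality(m.architecture?.output_modalities, filter.outputModality)
    ) {
      return false;
    }
    return true;
  });
}
```

With this shape, a --generates-image shorthand would reduce to filterByModality(models, { outputModality: "image" }), and the openrouter-images skill could reference that form instead of --modality image.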

Reviewed by Perry
