[prompt-clustering] Copilot Agent Prompt Clustering — 2026-06-23 #41004

2026-06-23T11:12:41Z

github-actions[bot]
Bot Jun 23, 2026

Summary

Analysis date: 2026-06-23 · Window: 1,000 most-recent Copilot agent PRs (created 2026-06-03 → 06-23) · Clusters: 10 · Overall merge rate: 79.9%

NLP clustering (TF-IDF + K-means, title-weighted ×3) over 1,000 copilot-swe-agent PRs in github/gh-aw. Merge rate is steady (79.9% today vs 79.2% on 06-22). The one persistent drag remains the [WIP] "fix failing CI" auto-dispatch loop — 39 PRs at a 38.5% merge rate, vs 81.5% for every other task type.

Key findings

Test/CI-routing work is the largest and highest-merging bucket. The test / run / failure cluster (193 PRs, 86% merged) covers command-dispatch routing, checkout hardening, and impacted-test tooling — high volume and high success, so the agent handles this domain reliably.
The [WIP] CI-self-fixer is the chronic failure mode. 19 PRs form a dedicated wip / failing github actions cluster that merges at 68%, but across all 39 WIP-titled PRs (the other 20 are spread over other clusters) the merge rate is only 38.5% (down from 41% on 06-22). These are auto-generated "Fix failing GitHub Actions job" attempts: they iterate little (the WIP cluster averages 2.5 commits and 1.2 reviews) and are closed unmerged more than half the time.
Version-bump / AWF-sync PRs are the riskiest non-WIP work. The awf / version / cli cluster (131 PRs, 75% — lowest of the real-work clusters) carries by far the largest changesets (avg 92 files, vs 35 repo-wide) because lock-artifact regeneration touches every compiled .lock.yml. Big mechanical diffs → more review churn and more abandonment (32/131 closed unmerged).
Cost/credits work splits into two healthy clusters. credits / ai credits (85 PRs, 83%) and aic / usage / token telemetry (90 PRs, 76%) together are ~18% of all activity — AI-cost engineering is now a dominant, mostly-successful theme.

Cluster analysis

Cluster (top terms)	PRs	Merge %	Avg commits	Avg files	Avg reviews
test / run / failure	193	86%	4.0	17	2.8
safe / safe outputs / outputs	132	78%	3.8	21	2.3
awf / version / cli	131	75%	4.5	92	2.9
prompt / md / ambient context	116	78%	2.9	24	1.5
aic / usage / token telemetry	90	76%	4.3	47	2.2
docs / model / models	86	81%	3.3	9	2.2
credits / ai credits / max	85	83%	3.7	51	2.3
pkg / linters / refactor	76	78%	3.3	35	2.0
sdk / driver / permission	72	77%	4.1	23	2.0
wip / failing github actions	19	68%	2.5	18	1.2
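As a quick consistency check on the table above, the cluster sizes and merge rates can be recombined into an overall figure. The sketch below is illustrative only: it uses the whole-number percentages exactly as printed, so its result can differ slightly from the reported 79.9% because of rounding. The names CLUSTERS, total_prs, and weighted_merge_rate are hypothetical and not part of the analysis tooling.

```python
# (cluster label, PR count, merge % as printed in the table)
CLUSTERS = [
    ("test / run / failure", 193, 86),
    ("safe / safe outputs / outputs", 132, 78),
    ("awf / version / cli", 131, 75),
    ("prompt / md / ambient context", 116, 78),
    ("aic / usage / token telemetry", 90, 76),
    ("docs / model / models", 86, 81),
    ("credits / ai credits / max", 85, 83),
    ("pkg / linters / refactor", 76, 78),
    ("sdk / driver / permission", 72, 77),
    ("wip / failing github actions", 19, 68),
]


def total_prs(rows):
    """Sum the PR counts over all clusters."""
    return sum(count for _, count, _ in rows)


def weighted_merge_rate(rows):
    """Size-weighted average of the per-cluster merge percentages."""
    return sum(count * pct for _, count, pct in rows) / total_prs(rows)


if __name__ == "__main__":
    print(total_prs(CLUSTERS), round(weighted_merge_rate(CLUSTERS), 1))
```

On these inputs the sizes sum to 1,000 and the size-weighted rate comes out near 79.4%, about half a point under the headline 79.9%. That gap is plausibly an effect of rounding each cluster's rate to a whole percent.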

Cluster detail, representative PRs & methodology

Representative tasks per cluster

test / run / failure (193, 86%): #37187 Route centralized command dispatches to the triggering PR branch and harden PR checkout runtime · #38154 Compile: move checkout-manifest generation to github-script to unblock dynamic checkout.repository expressions · #37952 Add coverage-aware impacted-unit-test Makefile targets and run them in CGO/CJS workflows
safe / safe outputs / outputs (132, 78%): #37469 Enforce required temporary_id for create-issue/create-pull-request via frontmatter and MCP validation · #39300 Generalize early wildcard-target validation across safe-outputs MCP tools
awf / version / cli (131, 75%): #37885 Bump default MCP Gateway to v0.3.25 and firewall to v0.25.67, merge main, and regenerate pinned lock artifacts · #40132 Bump firewall to v0.27.6 and mcpg to v0.3.27
prompt / md / ambient (116, 78%): #37393 Reduce ambient context in high-cost workflows via prompt deduplication and lazy subagent loading · #39157 Reduce ambient-context payload in daily/PR workflows and shared prompt imports
aic / usage / token (90, 76%): #38506 feat: aggregate AI credits from aggregated usage JSONL files in conclusion post-step · #40786 fix: restore AIC data in usage-only log collection
docs / model / models (86, 81%): #36995 Switch model multipliers source of truth to GitHub Copilot pricing docs · #36927 docs: expand cost management page with token reduction tips
credits / ai credits (85, 83%): #37581 Suppress false ET rate-limit failures with max-ai-credits budget reconciliation · #37506 fix: add max-ai-credits: 1500 to safe-output-health workflow
pkg / linters / refactor (76, 78%): #40534 refactor: consolidate 6 identical hasStringKey copies into pkg/setutil.Contains · #39954 perf: replace map[string]bool sets with map[string]struct{} (187 instances)
sdk / driver / permission (72, 77%): #37322 Fix Copilot SDK headless auth/driver path and tool-permission denials in daily workflows · #37391 refactor: split copilot_sdk_driver.cjs into reusable permission and session modules
wip / failing github actions (19, 68%): #40471 [WIP] Fix failing GitHub Actions job Integration: Workflow Features · #40239 [WIP] Fix failing GitHub Actions job for integration add · #40007 [WIP] Fix failing GitHub Actions job Integration: Workflow Features (recurring auto-dispatch)

Most-iterated PRs (highest review activity)

PR	Cluster	Reviews	Commits	Title
#40476	test/run	21	16	Fix `/help` routing fallthrough, error handling, reaction
#39810	test/run	20	7	add-wizard: detect org Copilot billing, pre-select/disable
#38456	credits	19	7	Add threat-detection `max-ai-credits` with 400 default
#40669	test/run	16	10	migrate `assignAgentToIssue` to REST, retain GraphQL fallback
#38237	safe-outputs	16	5	Add PR-targeting support to `create_check_run`

Methodology & caveats

Source signal: Copilot PR bodies are agent-authored change summaries, not the original task prompt (only ~8/1000 retain a START COPILOT marker). The PR title is the cleanest task descriptor, so it is weighted ×3 in the TF-IDF document; blockquote threat-banners, code fences, HTML comments, and URLs are stripped.
Vectorization: TF-IDF, 1–3-grams, min_df=3, max_df=0.5, sublinear TF, 600 features, domain stopwords (fix/add/workflow/gh/aw/...) removed so clusters key on topic not verb.
k selection: silhouette-swept k∈[6,12] → k=10 (silhouette 0.043). Separation is intentionally weak — this is one fast-moving codebase, so clusters are soft thematic regions, not hard partitions. Treat sizes/rates as directional.
Outcome (merged/closed) is taken from today's fresh search; interaction metrics (commits/files/reviews) from cached full-PR data. WIP detection = title starts with [wip.
1,000/1,000 PRs had usable text; none dropped.
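The methodology above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the analysis job's actual code: the report names no library, so scikit-learn is an assumed choice, and every function and constant name here is hypothetical. Weighting the title by repeating it three times, treating the WIP check as case-insensitive, and picking k by the highest silhouette score are also assumptions. Only the five stopwords the report names are included; the rest of its list is elided.

```python
import re

TITLE_WEIGHT = 3  # the report weights the PR title x3
# Only the stopwords the report names; its full list is elided ("...").
DOMAIN_STOPWORDS = ("fix", "add", "workflow", "gh", "aw")

_FENCE = re.compile(r"`{3}.*?`{3}", re.DOTALL)          # fenced code blocks
_HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)    # HTML comments
_BLOCKQUOTE = re.compile(r"^[ \t]*>.*$", re.MULTILINE)  # quoted banner lines
_URL = re.compile(r"https?://\S+")


def clean_body(body):
    """Strip code fences, HTML comments, blockquote lines, and URLs."""
    for pattern in (_FENCE, _HTML_COMMENT, _BLOCKQUOTE, _URL):
        body = pattern.sub(" ", body)
    return " ".join(body.split())


def build_document(title, body):
    """Repeat the title TITLE_WEIGHT times, then append the cleaned body."""
    return " ".join([title] * TITLE_WEIGHT + [clean_body(body)])


def is_wip(title):
    """WIP rule from the report; case-insensitivity is an assumption."""
    return title.strip().lower().startswith("[wip")


def sweep_k(docs, k_values=range(6, 13), seed=0):
    """Return (best_k, scores), choosing the highest silhouette (assumed rule)."""
    # Imported lazily so the helpers above work without scikit-learn.
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics import silhouette_score

    vectorizer = TfidfVectorizer(
        ngram_range=(1, 3), min_df=3, max_df=0.5,
        sublinear_tf=True, max_features=600,
        stop_words=list(DOMAIN_STOPWORDS),
    )
    matrix = vectorizer.fit_transform(docs)
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(matrix)
        scores[k] = silhouette_score(matrix, labels)
    return max(scores, key=scores.get), scores
```

In use, each PR would be turned into one string with build_document(title, body) and the resulting list passed to sweep_k. With sublinear TF, repeating the title raises its terms' weight logarithmically rather than threefold, which is one reason the repetition approach is only a guess at how the ×3 weighting was applied.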

Recommendations

Gate or throttle the [WIP] Fix failing GitHub Actions job auto-dispatcher. At a 38.5% merge rate, its 39 PRs leave roughly 24 abandoned per window (39 × 61.5% ≈ 24), the highest abandonment rate of any task type. A human-triage gate before dispatch (see in-flight attempts in the style of #40239, "[WIP] Fix failing GitHub Actions job for integration add") or a dedup guard on identical failing-job titles would recover most of this.
Split mechanical version bumps from logic changes. The awf/version cluster's 92-file lock-regeneration diffs depress its merge rate; isolating the artifact regeneration into a separate auto-merge-eligible PR would lift both review speed and success.
Lean into the strengths. Test-routing (86%) and credits / ai credits (83%) work is both high-volume and high-yield, so it is safe to keep auto-dispatching. No prompt-engineering changes are needed there.
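The dedup guard suggested in the first recommendation could look something like the sketch below. It is a hypothetical illustration, not an existing gh-aw component: OpenPR, normalize_title, and should_dispatch are invented names, and it assumes the dispatcher can obtain the list of currently open agent PRs from somewhere (that lookup is not shown).

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class OpenPR:
    """Hypothetical record for a PR that is still open."""
    number: int
    title: str


_WIP_PREFIX = re.compile(r"^\s*\[wip\]\s*", re.IGNORECASE)


def normalize_title(title: str) -> str:
    """Drop a leading [WIP] tag, lowercase, and collapse whitespace."""
    return " ".join(_WIP_PREFIX.sub("", title).lower().split())


def should_dispatch(new_title: str, open_prs: List[OpenPR]) -> Tuple[bool, Optional[int]]:
    """Return (False, existing_number) if an open PR has the same normalized title."""
    key = normalize_title(new_title)
    for pr in open_prs:
        if normalize_title(pr.title) == key:
            return False, pr.number
    return True, None
```

For example, if a PR titled "[WIP] Fix failing GitHub Actions job Integration: Workflow Features" were still open as #40007, a new dispatch with the same title would be skipped and pointed at #40007, while "[WIP] Fix failing GitHub Actions job for integration add" would still go through. Treating #40007 as open here is purely illustrative. Exact-title matching is deliberately conservative; matching on the failing job name instead would be a stricter variant.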

References: §28020644493

Generated by 📊 Copilot Agent Prompt Clustering Analysis · 226.6 AIC · ⌖ 18.6 AIC · ⊞ 13K · ◷

expires on Jun 24, 2026, 3:12 AM UTC-08:00

2026-06-24T11:02:45Z

github-actions[bot]
Bot Jun 24, 2026
Author

This discussion has been marked as outdated by Copilot Agent Prompt Clustering Analysis.

A newer discussion is available at Discussion #41210.
