[prompt-clustering] Copilot Agent Prompt Clustering — 2026-06-23 #41004
Closed
This discussion has been marked as outdated by Copilot Agent Prompt Clustering Analysis. A newer discussion is available at Discussion #41210.
Summary
Analysis date: 2026-06-23 · Window: 1,000 most-recent Copilot agent PRs (created 2026-06-03 → 06-23) · Clusters: 10 · Overall merge rate: 79.9%
NLP clustering (TF-IDF + K-means, title-weighted ×3) over 1,000 copilot-swe-agent PRs in github/gh-aw. Merge rate is steady (79.9% today vs 79.2% on 06-22). The one persistent drag remains the [WIP] "fix failing CI" auto-dispatch loop: 39 PRs at a 38.5% merge rate, vs 81.5% for every other task type.
Key findings
The test / run / failure cluster (193 PRs, 86% merged) covers command-dispatch routing, checkout hardening, and impacted-test tooling. Volume and success are both high, so the agent handles this domain reliably.
The [WIP] CI-self-fixer is the chronic failure mode. 19 PRs land in a pure wip failing / github actions cluster at 68%, and across all 39 WIP PRs the merge rate is 38.5% (down from 41% on 06-22). These are auto-generated "Fix failing GitHub Actions job" attempts: they iterate little (avg 2.5 commits, 1.2 reviews) and are abandoned more than half the time.
The awf / version / cli cluster (131 PRs, 75%, the lowest of the real-work clusters) carries by far the largest changesets (avg 92 files, vs 35 repo-wide) because lock-artifact regeneration touches every compiled .lock.yml. Big mechanical diffs → more review churn and more abandonment (32/131 closed unmerged).
credits / ai credits (85 PRs, 83%) and aic / usage / token telemetry (90 PRs, 76%) together are ~18% of all activity. AI-cost engineering is now a dominant, mostly successful theme.
Cluster analysis
Cluster detail, representative PRs & methodology
Representative tasks per cluster
temporary_id for create-issue/create-pull-request via frontmatter and MCP validation #37469 enforce required temporary_id for create-issue/PR · Generalize early wildcard-target validation across safe-outputs MCP tools #39300 generalize wildcard-target validation across safe-output MCP tools
max-ai-credits budget reconciliation #37581 suppress false ET rate-limit failures via max-ai-credits reconciliation · fix: add max-ai-credits: 1500 to safe-output-health workflow #37506 add max-ai-credits: 1500 to safe-output-health
hasStringKey copies into pkg/setutil.Contains · perf: replace map[string]bool sets with map[string]struct{} (187 instances) #39954 replace map[string]bool with map[string]struct{} (187 instances)
copilot_sdk_driver.cjs into reusable modules
[WIP] Fix failing GitHub Actions job Integration: Workflow Features (recurring auto-dispatch)
Most-iterated PRs (highest review activity)
/help routing fallthrough, error handling, reaction
max-ai-credits with 400 default
assignAgentToIssue to REST, retain GraphQL fallback
create_check_run
Methodology & caveats
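The document-construction step described in the methodology notes (title weighted ×3; blockquote banners, code fences, HTML comments, and URLs stripped) can be sketched as follows. This is a minimal illustration, not the analysis's actual code: the function name `build_document`, the regular expressions, and the choice to repeat the title as the weighting mechanism are all assumptions. If scikit-learn were used for the vectorization, the listed settings would plausibly correspond to `TfidfVectorizer(min_df=3, max_df=0.5, sublinear_tf=True, max_features=600)`, but the source does not name the library.

```python
import re

TITLE_WEIGHT = 3  # the report states the title is weighted x3

# Hypothetical patterns for the four kinds of stripped content.
_CODE_FENCE = re.compile(r"```.*?```", re.DOTALL)
_HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
_BLOCKQUOTE = re.compile(r"^[ \t]*>.*$", re.MULTILINE)
_URL = re.compile(r"https?://\S+")


def build_document(title: str, body: str) -> str:
    """Return the text for one PR: the title repeated, then the cleaned body."""
    cleaned = _CODE_FENCE.sub(" ", body)       # drop fenced code first
    cleaned = _HTML_COMMENT.sub(" ", cleaned)  # drop <!-- ... --> comments
    cleaned = _BLOCKQUOTE.sub(" ", cleaned)    # drop "> ..." banner lines
    cleaned = _URL.sub(" ", cleaned)           # drop bare URLs
    cleaned = " ".join(cleaned.split())        # collapse whitespace
    weighted_title = " ".join([title.strip()] * TITLE_WEIGHT)
    return f"{weighted_title} {cleaned}".strip()
```

Repeating the title is one simple way to raise its term frequencies before TF-IDF; an alternative would be to vectorize title and body separately and scale the title matrix.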
START COPILOT marker). The PR title is the cleanest task descriptor, so it is weighted ×3 in the TF-IDF document; blockquote threat-banners, code fences, HTML comments, and URLs are stripped.
min_df=3, max_df=0.5, sublinear TF, 600 features, domain stopwords (fix/add/workflow/gh/aw/...) removed so clusters key on topic, not verb.
[wip.
Recommendations
[WIP] Fix failing GitHub Actions job auto-dispatcher. At 38.5% merge it generates ~24 abandoned PRs per window, the single biggest source of wasted agent runs. A human-triage gate before dispatch (see the in-flight [WIP] Fix failing GitHub Actions job for integration add #40239-style attempts) or a dedup guard on identical failing-job titles would recover most of this.
awf/version cluster's 92-file lock-regeneration diffs depress its merge rate; isolating the artifact regeneration into a separate auto-merge-eligible PR would lift both review speed and success.
References: §28020644493
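The dedup guard suggested in the recommendations could look roughly like the sketch below. It assumes the dispatcher can obtain the titles of currently open agent PRs as plain strings; the function names `normalize_title` and `should_dispatch` are hypothetical, and "identical" is interpreted here as equal after removing a leading `[WIP]` tag and normalizing case and whitespace.

```python
import re

# Matches a leading "[WIP]" tag in any letter case, plus surrounding spaces.
_WIP_PREFIX = re.compile(r"^\s*\[wip\]\s*", re.IGNORECASE)


def normalize_title(title: str) -> str:
    """Strip the [WIP] tag, lowercase, and collapse runs of whitespace."""
    stripped = _WIP_PREFIX.sub("", title)
    return " ".join(stripped.lower().split())


def should_dispatch(candidate_title: str, open_pr_titles: list[str]) -> bool:
    """Return False when an open PR already has the same normalized title."""
    key = normalize_title(candidate_title)
    return all(normalize_title(t) != key for t in open_pr_titles)
```

For example, with an open PR titled `[WIP] Fix failing GitHub Actions job Integration: Workflow Features`, a second dispatch for the same failing job would be skipped, while a dispatch for a different job name would still go through.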