Add implementation failure summary to remote CI#9160
Conversation
|
Plan Queued for Implementation submission-queuedstatus: queued
queued_at: '2026-03-10T13:14:00.482042+00:00'
submitted_by: schrockn
validation_results:
pr_is_open: true
has_erk_pr_title: true
expected_workflow: plan-implement
trigger_mechanism: label-based-webhook
pr_number: 9160
Plan submitted by schrockn at 2026-03-10T13:14:00.482042+00:00. The Workflow run: https://github.com/dagster-io/erk/actions/runs/22904218520 |
|
⚙️ GitHub Action Started workflow-startedstatus: started
started_at: '2026-03-10T13:14:39Z'
workflow_run_id: '22904218520'
workflow_run_url: https://github.com/dagster-io/erk/actions/runs/22904218520
branch_name: plnd/add-impl-failure-summary-03-10-0912
pr_number: 9160
Setup completed successfully. Branch: |
When `claude --print` exits with code 1 during remote implementation, surface a human-readable failure diagnosis in the PR as a comment and GitHub Actions job summary. Reads the session JSONL tail, sends it to Haiku for diagnosis, and posts the markdown summary. ## Key Changes - New `erk exec summarize-impl-failure` command to analyze session logs and generate failure diagnostics - Added `.github/prompts/impl-failure-summarize.md` template for Haiku prompt - Integrated failure summarization into `plan-implement.yml` workflow to run on implementation failures - Pure functions for session extraction, prompt building, and comment formatting with comprehensive unit tests <details> <summary>Files Changed</summary> ### Added (3 files) - `src/erk/cli/commands/exec/scripts/summarize_impl_failure.py` - Implementation failure diagnosis using Haiku with session JSONL parsing and PR commenting - `tests/unit/cli/commands/exec/scripts/test_summarize_impl_failure.py` - Unit tests for session extraction, prompt building, and comment body formatting - `.github/prompts/impl-failure-summarize.md` - Haiku prompt template for failure analysis ### Modified (4 files) - `src/erk/cli/commands/exec/group.py` - Register new summarize-impl-failure command - `.github/workflows/plan-implement.yml` - Add workflow step to summarize failures and write to GITHUB_STEP_SUMMARY - `.claude/skills/erk-exec/reference.md` - Document new exec command and options - `docs/learned/integrations/github-review-decision.md`, `docs/learned/tui/status-indicators.md` - Minor formatting fixes </details>
67009c0 to
924b233
Compare
✅ Dignified Code Simplifier ReviewLast updated: 2026-03-10 10:42:54 PT Found 0 violations. All code follows dignified-python standards. DetailsPatterns Checked✅ LBYL over EAFP - Compliant Files Reviewed
Activity Log
|
✅ Test Coverage ReviewLast updated: 2026-03-10 10:43:04 PT Found 0 violations. All new source files have test coverage. DetailsPatterns Checked✅ All new source files have test coverage Source Files
Files Reviewed
Activity Log
|
✅ Dignified Python ReviewLast updated: 2026-03-10 10:42:54 PT Found 0 violations across 3 Python files. All code complies with dignified-python standards. DetailsPatterns Checked✅ LBYL over EAFP - None found Files Reviewed
Activity Log
|
✅ Tripwires ReviewLast updated: 2026-03-10 07:39:40 PT No tripwire violations detected. Code follows all established patterns. DetailsTripwires TriggeredArchitecture:
CLI:
Testing:
CI:
Tier 1 Pattern Matches (Mechanical)✅ Tier 2 Semantic Matches (LLM-derived)✅ Using bare subprocess calls - None found Violations SummaryNo violations found. Files Reviewed
Activity Log
|
- Simplify construct-and-append loop to list comprehension - Remove default=None from --exit-code Click option Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When
claude --printexits with code 1 during remote implementation, surface a human-readable failure diagnosis in the PR as a comment and GitHub Actions job summary. Reads the session JSONL tail, sends it to Haiku for diagnosis, and posts the markdown summary.Key Changes
erk exec summarize-impl-failurecommand to analyze session logs and generate failure diagnostics.github/prompts/impl-failure-summarize.mdtemplate for Haiku promptplan-implement.ymlworkflow to run on implementation failuresFiles Changed
Added (3 files)
src/erk/cli/commands/exec/scripts/summarize_impl_failure.py- Implementation failure diagnosis using Haiku with session JSONL parsing and PR commentingtests/unit/cli/commands/exec/scripts/test_summarize_impl_failure.py- Unit tests for session extraction, prompt building, and comment body formatting.github/prompts/impl-failure-summarize.md- Haiku prompt template for failure analysisModified (2 files)
src/erk/cli/commands/exec/group.py- Register new summarize-impl-failure command.github/workflows/plan-implement.yml- Add workflow step to summarize failures and write to GITHUB_STEP_SUMMARY.claude/skills/erk-exec/reference.md- Document new exec command and optionsoriginal-plan
Plan: Add Implementation Failure Summary to Remote CI
Context
When
claude --printexits with code 1 during remote implementation inplan-implement.yml, the only error message is "Implementation failed with exit code: 1". This is useless for debugging. The session JSONL is already captured and pushed toplanned-pr-context, but nobody analyzes it. We want to use Haiku to produce a human-readable failure summary and surface it in two places: as a PR comment on the plan PR, and as a GitHub Actions job summary (visible on the Actions run page).The existing
ci-generate-summariespattern (fetch logs -> truncate -> prompt Haiku -> post PR comment) provides the exact template to follow.Approach
Create a new exec command
erk exec summarize-impl-failurethat reads the raw session JSONL, extracts the tail, sends it to Haiku for diagnosis, posts the summary as a PR comment, and outputs it for use as a job summary. Add a workflow step that calls this on failure and writes the output to$GITHUB_STEP_SUMMARY.Files to Create
1.
.github/prompts/impl-failure-summarize.mdHaiku prompt template. Focused on: what was the agent doing when it stopped, did it encounter an error or just stop, what files/operations were involved.
2.
src/erk/cli/commands/exec/scripts/summarize_impl_failure.pyNew exec script following the
ci_generate_summaries.pypattern.CLI:
erk exec summarize-impl-failure --session-file <path> --pr-number <N> [--exit-code <N>]Key functions:
_extract_session_tail(session_file: Path, *, max_entries: int) -> SessionTail | None— Read JSONL, take last N entries (50), convert to compressed XML viagenerate_compressed_xml()frompreprocess_session.py. Return dataclass withtotal_events,last_entries_xml,has_result_event._build_failure_prompt(*, session_tail: SessionTail, exit_code: int | None, prompts_dir: Path) -> str— Load template, substitute variables. Follow_build_summary_prompt()pattern fromci_generate_summaries.pyusingget_bundled_github_dir()._post_failure_comment(*, pr_number: int, comment_body: str, cwd: Path) -> None— Post viarun_subprocess_with_context+gh pr comment.Flow:
require_cwd(ctx),require_prompt_executor(ctx)executor.execute_prompt()## Implementation Failure Summaryheader$GITHUB_STEP_SUMMARY)Reuse:
generate_compressed_xml()frompreprocess_session.py— converts JSONL entries to compact XMLget_bundled_github_dir()fromerk.artifacts.paths— locates prompt templatesrequire_prompt_executor()fromerk_shared.context.helpers— Haiku accessrun_subprocess_with_context()fromerk_shared.subprocess_utils—gh pr comment3.
tests/unit/cli/commands/exec/scripts/test_summarize_impl_failure.pyTest pure functions:
_extract_session_tail(empty, small, normal, large sessions),_build_failure_prompt(template substitution, fallback), comment body formatting.Files to Modify
4.
src/erk/cli/commands/exec/group.pyAdd import and registration:
5.
.github/workflows/plan-implement.ymlInsert new step AFTER "Update plan header" (line ~313) and BEFORE "Handle implementation outcome" (line ~315):
Key:
continue-on-error: trueso failures here don't block the workflow. The script posts a PR comment internally and prints the summary to stdout, which the workflow captures and writes to$GITHUB_STEP_SUMMARYfor the Actions run page.Verification
pytest tests/unit/cli/commands/exec/scripts/test_summarize_impl_failure.pyerk exec summarize-impl-failure --session-file /tmp/test.jsonl --pr-number 9999 --exit-code 1(will fail at Haiku call without API key but validates parsing)plan-header
To replicate this PR locally, run: