feat(jira): add summarize_issue.py and enrich gathered data with author names#484

Open: celebdor wants to merge 1 commit into openshift-eng:main from celebdor:summarize-issue-script
Conversation

@celebdor
Contributor

@celebdor celebdor commented May 19, 2026

What this PR does / why we need it:

Adds a summarize_issue.py script that extracts structured, human-readable summaries from pre-gathered JSON files, avoiding context window overflow when processing large issue data during weekly status updates. Also enriches gather_status_data.py to capture author_name (display names) for PR reviews, commits, and review comments alongside login handles, and collects issue labels. Updates the update-weekly-status command to use the summarizer instead of reading raw JSON directly.
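
A minimal sketch of the author enrichment described above, for illustration only. The helper name `author_fields` and the node shape `{"author": {"login": ..., "name": ...}}` are assumptions, not taken from gather_status_data.py; the real script's field names may differ.

```python
from typing import Any, Optional


def author_fields(node: dict[str, Any]) -> dict[str, Optional[str]]:
    """Return the login handle and display name for a review-like node.

    Hypothetical helper: the node shape is assumed, not confirmed by the PR.
    """
    author = node.get("author") or {}  # the author may be null, e.g. for deleted accounts
    return {
        "author": author.get("login"),
        "author_name": author.get("name") or None,  # treat an empty name as missing
    }


review = {"author": {"login": "example-dev", "name": "Example Developer"}}
print(author_fields(review))
```

Keeping both keys lets a downstream summary prefer the display name and fall back to the login when no name is set.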

Which issue(s) this PR fixes:

Special notes for your reviewer:

Split out from a larger branch that also adds /jira:generate-feature-updates — this PR contains only the general-purpose improvements to the status analysis pipeline.

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

    • Clarified the issue summarization step for the weekly status command.
  • New Features

    • Added a CLI report that produces structured, human-readable issue status analyses (headers, descendants, changelog, comments, PR categorization, significance filtering).
  • Improvements

    • Collected richer author identity details from Git hosting data and included Jira issue labels in gathered outputs.
  • Chores

    • Bumped Jira plugin version to 0.4.6.

@openshift-ci openshift-ci Bot requested review from bentito and stbenjam May 19, 2026 15:36
@coderabbitai
Contributor

coderabbitai Bot commented May 19, 2026

Walkthrough

Adds a CLI summarizer for pre-gathered per-issue JSON, enriches GitHub GraphQL PR data and Jira issue output with additional author metadata and labels, and updates docs plus plugin/version metadata to reference the new summarization step.

Changes

Issue Summarization Tool with Enriched Data Gathering

| Layer / File(s) | Summary |
| --- | --- |
| Plugin and docs version bumps: .claude-plugin/marketplace.json, docs/data.json, plugins/jira/.claude-plugin/plugin.json | Increment Jira plugin and docs top-level version from 0.4.5 to 0.4.6. |
| Enhanced data gathering with author metadata and labels: plugins/jira/skills/status-analysis/scripts/gather_status_data.py | GraphQL PR query now requests richer author fields for reviews, review-thread comments, and commits. _filter_pr_to_range records both author and author_name in reviews_in_range, commits_in_range, and review_comments_in_range. Per-issue output includes a string-filtered labels list under issue.issue.labels. |
| Issue summarization script implementation: plugins/jira/skills/status-analysis/scripts/summarize_issue.py | New CLI script with pr_author and is_significant helpers; summarize() prints structured reports (header, descendants, changelog, non-bot comments, categorized PR lists). |
| CLI helpers and control flow: plugins/jira/skills/status-analysis/scripts/summarize_issue.py | Adds resolve_path, extract_date_start, parse_flag, parse_option, and main argv-driven control flow to expand globs, infer date_start, filter by significance, handle --label and --only-significant, and iterate inputs. |
| Documentation update for summarization workflow: plugins/jira/commands/update-weekly-status.md | Step 7a now instructs running summarize_issue.py against pre-gathered per-issue JSON and documents the script’s structured summary sections; “Related” links to the summarizer script. |
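
As a rough illustration of the argv-driven helpers named above, the sketch below shows one way `parse_flag` and `parse_option` could behave. Only the function names and the `--label` / `--only-significant` flags come from this PR; the signatures, return shapes, and error handling are assumptions and may not match summarize_issue.py.

```python
from typing import Optional


def parse_flag(args: list[str], flag: str) -> tuple[bool, list[str]]:
    """Return (present, remaining_args) for a boolean flag (assumed signature)."""
    present = flag in args
    return present, [arg for arg in args if arg != flag]


def parse_option(args: list[str], option: str) -> tuple[Optional[str], list[str]]:
    """Return (value, remaining_args) for an option taking one value (assumed signature)."""
    if option not in args:
        return None, list(args)
    idx = args.index(option)
    if idx + 1 >= len(args):
        raise ValueError(f"{option} requires a value")
    # Drop both the option and its value from the remaining arguments.
    return args[idx + 1], args[:idx] + args[idx + 2:]


argv = ["issues/", "--label", "team-a", "--only-significant"]
only_significant, rest = parse_flag(argv, "--only-significant")
label, inputs = parse_option(rest, "--label")
print(only_significant, label, inputs)
```

Returning the remaining arguments from each helper leaves the positional inputs (issue keys, files, or directories) for the main loop to iterate.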

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • openshift-eng/ai-helpers#165: Prior work on the shared status-analysis pipeline that this PR extends (gather_status_data and the update-weekly-status flow).

Suggested labels

approved, ok-to-test, lgtm

Suggested reviewers

  • stbenjam
  • bentito
  • rvanderp3

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| No Real People Names In Style References | ❌ Error | Plugin documentation uses real person "Antoni Segura" (commit author) in example prompts for user filter parameters, violating policy against real people names in plugin documentation. | Replace real names with generic placeholders or fictional examples to avoid references to actual individuals in plugin command examples. |
✅ Passed checks (9 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding a new summarize_issue.py script and enriching gathered data with author names. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| No Assumed Git Remote Names | ✅ Passed | No hardcoded git remote names (origin/upstream) found in any modified files. The PR adds new scripts and documentation that use only GitHub GraphQL API and CLI tools, not hardcoded git commands. |
| Git Push Safety Rules | ✅ Passed | No git push commands, force push operations, or main/master branch pushes found. Scripts are utilities without git operations. |
| No Untrusted Mcp Servers | ✅ Passed | PR adds Python scripts with only standard library imports and documentation. No new MCP servers from untrusted sources detected. References existing official Jira MCP tools. |
| Ai-Helpers Overlap Detection | ✅ Passed | PR adds summarize_issue.py utility to jira plugin for per-issue JSON summarization. No overlap with existing status-analysis tools (different plugins, scopes, and purposes). |
Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/jira/skills/status-analysis/scripts/gather_status_data.py`:
- Line 1225: The list comprehension creating labels uses a single-letter loop
variable `l` which triggers Ruff E741; change the loop variable in that
comprehension to a descriptive name like `label` (e.g., labels = [label for
label in fields.get("labels", []) if isinstance(label, str)]) so the
comprehension in gather_status_data.py that references fields.get("labels", [])
uses a non-ambiguous identifier.

In `@plugins/jira/skills/status-analysis/scripts/summarize_issue.py`:
- Line 240: Replace unnecessary f-strings that have no interpolation with plain
string literals in summarize_issue.py: locate the print calls using f-strings
(e.g., the print(f"Examples:", file=sys.stderr) and the similar print at the
later occurrence) and change them to use normal strings (print("Examples:",
file=sys.stderr)). Update any other non-interpolated f-strings in the same file
to avoid Ruff F541 warnings, keeping the rest of the print arguments (like
file=sys.stderr) unchanged.
- Around line 67-73: The code checks color changes using
item.get("fromString")/item.get("toString") but the gathered changelog stores
values under "from"/"to", causing summaries to show "? -> ?"; update both places
(the block computing from_has/to_has around variables field/field_id and the
similar block at lines 113-117) to read item.get("from") and item.get("to")
(falling back to fromString/toString only if absent), convert them to strings
before searching for "Red"/"Yellow"/"Green", and use those values when building
the changelog summary so the output shows the actual from -> to values instead
of "? -> ?".
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0269f937-2556-4d1f-a224-6bb003cff6ea

📥 Commits

Reviewing files that changed from the base of the PR and between a6f6680 and fa42fdf.

📒 Files selected for processing (3)
  • plugins/jira/commands/update-weekly-status.md
  • plugins/jira/skills/status-analysis/scripts/gather_status_data.py
  • plugins/jira/skills/status-analysis/scripts/summarize_issue.py


# Build issue data
assignee = fields.get("assignee") or {}
labels = [l for l in fields.get("labels", []) if isinstance(l, str)]
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Rename the one-letter loop variable to avoid Ruff E741.

Line 1225 uses l, which is flagged as ambiguous and may fail lint checks.

Suggested fix
-            labels = [l for l in fields.get("labels", []) if isinstance(l, str)]
+            labels = [label for label in fields.get("labels", []) if isinstance(label, str)]
🧰 Tools
🪛 Ruff (0.15.13)

[error] 1225-1225: Ambiguous variable name: l

(E741)
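
A small sketch of the renamed comprehension in isolation, to show what the string filter keeps. The wrapper `string_labels` and the sample data are hypothetical. The `or []` guard is an addition not present in the PR: it covers the case where the `labels` key exists but holds `None`, which `fields.get("labels", [])` alone would not handle. Whether gathered data can actually contain a null `labels` value is not established here.

```python
from typing import Any


def string_labels(fields: dict[str, Any]) -> list[str]:
    """Keep only string labels, using a descriptive loop variable (avoids Ruff E741)."""
    # `or []` is an assumed extra guard for a present-but-null "labels" value.
    return [label for label in (fields.get("labels") or []) if isinstance(label, str)]


print(string_labels({"labels": ["team-a", None, 42, "release-blocker"]}))
print(string_labels({"labels": None}))
print(string_labels({}))
```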


Comment on lines +67 to +73
        field = item.get("field", "")
        field_id = str(item.get("fieldId", ""))
        if "Status Summary" in field or "customfield_12320841" in field_id:
            for color in ("Red", "Yellow", "Green"):
                from_has = color in (item.get("fromString") or "")
                to_has = color in (item.get("toString") or "")
                if from_has != to_has:
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the gathered changelog keys (from/to) when summarizing.

The summarizer reads fromString/toString, but gathered issue JSON stores from/to. This makes changelog output degrade to ? -> ? and can miss color-change signals.

Suggested fix
-                    from_has = color in (item.get("fromString") or "")
-                    to_has = color in (item.get("toString") or "")
+                    before = item.get("from", item.get("fromString")) or ""
+                    after = item.get("to", item.get("toString")) or ""
+                    from_has = color in before
+                    to_has = color in after
@@
-            f"{i.get('field', '?')}: {i.get('fromString', '?')} -> {i.get('toString', '?')}"
+            f"{i.get('field', '?')}: "
+            f"{i.get('from', i.get('fromString', '?'))} -> "
+            f"{i.get('to', i.get('toString', '?'))}"

Also applies to: 113-117
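
The fallback described in this comment can be sketched as a standalone helper. The names `changelog_values` and `color_changes` and the sample items are hypothetical; the key names (`from`/`to`, `fromString`/`toString`) and the color list come from the review. Values are converted with `str()` so a non-string value does not break the substring check.

```python
from typing import Any

COLORS = ("Red", "Yellow", "Green")


def changelog_values(item: dict[str, Any]) -> tuple[str, str]:
    """Read from/to, falling back to fromString/toString, always as strings."""
    before = item.get("from") or item.get("fromString") or ""
    after = item.get("to") or item.get("toString") or ""
    return str(before), str(after)


def color_changes(item: dict[str, Any]) -> list[str]:
    """List the colors that appear on exactly one side of the change."""
    before, after = changelog_values(item)
    return [color for color in COLORS if (color in before) != (color in after)]


gathered = {"field": "Status Summary", "from": "Color: Green", "to": "Color: Yellow"}
raw = {"field": "Status Summary", "fromString": "Red", "toString": "Green"}
print(color_changes(gathered))
print(color_changes(raw))
```

Using one helper for both the color check and the changelog line would keep the two call sites mentioned in the review consistent.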



    if not args:
        print(f"Usage: {sys.argv[0]} <ISSUE-KEY|issue.json|issues-dir/> [...] [--date-dir DIR] [--only-significant] [--label LABEL]", file=sys.stderr)
        print(f"Examples:", file=sys.stderr)
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove unnecessary f-strings to satisfy Ruff F541.

These strings have no interpolation and should be plain string literals.

Suggested fix
-        print(f"Examples:", file=sys.stderr)
+        print("Examples:", file=sys.stderr)
@@
-            print(f"Confirm with the user whether to include them in the report.\n")
+            print("Confirm with the user whether to include them in the report.\n")

Also applies to: 306-306

🧰 Tools
🪛 Ruff (0.15.13)

[error] 240-240: f-string without any placeholders

Remove extraneous f prefix

(F541)


@brandisher

/assign

…or names

Add a summarize_issue.py script that extracts structured, human-readable
summaries from pre-gathered JSON files, avoiding context window overflow
when processing large issue data. Update update-weekly-status to use it.

Enrich gather_status_data.py to capture author_name (display names) for
PR reviews, commits, and review comments alongside login handles, and
collect issue labels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@celebdor celebdor force-pushed the summarize-issue-script branch from fa42fdf to 9408802 Compare May 20, 2026 13:04
@openshift-ci

openshift-ci Bot commented May 20, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: celebdor
Once this PR has been reviewed and has the lgtm label, please ask for approval from brandisher. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
plugins/jira/skills/status-analysis/scripts/summarize_issue.py (1)

67-73: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use gathered changelog keys (from/to) instead of only fromString/toString.

The summarizer is still reading the wrong keys for this pipeline contract, so color transitions and changelog lines can be incorrect.

Proposed fix
-                    from_has = color in (item.get("fromString") or "")
-                    to_has = color in (item.get("toString") or "")
+                    before = str(item.get("from", item.get("fromString", "")) or "")
+                    after = str(item.get("to", item.get("toString", "")) or "")
+                    from_has = color in before
+                    to_has = color in after
@@
-            f"{i.get('field', '?')}: {i.get('fromString', '?')} -> {i.get('toString', '?')}"
+            f"{i.get('field', '?')}: "
+            f"{i.get('from', i.get('fromString', '?'))} -> "
+            f"{i.get('to', i.get('toString', '?'))}"

Also applies to: 113-115

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/jira/skills/status-analysis/scripts/summarize_issue.py` around lines
67 - 73, The code is checking changelog color transitions using only
item.get("fromString")/item.get("toString") which misses cases where the
changelog uses keys "from"/"to"; update the color-detection logic in the loop
that inspects item (the block using field, field_id, color, from_has, to_has) to
check item.get("from") and item.get("to") (falling back to fromString/toString
if needed) — i.e. compute from_val = (item.get("from") or item.get("fromString")
or "") and to_val = (item.get("to") or item.get("toString") or "") and then use
color in from_val / to_val for from_has/to_has; make the same change for the
other occurrence around the lines referenced (the second block using
fromString/toString).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5d84fa42-1a2d-4869-a5e3-292efbf7820a

📥 Commits

Reviewing files that changed from the base of the PR and between fa42fdf and 9408802.

📒 Files selected for processing (6)
  • .claude-plugin/marketplace.json
  • docs/data.json
  • plugins/jira/.claude-plugin/plugin.json
  • plugins/jira/commands/update-weekly-status.md
  • plugins/jira/skills/status-analysis/scripts/gather_status_data.py
  • plugins/jira/skills/status-analysis/scripts/summarize_issue.py
✅ Files skipped from review due to trivial changes (3)
  • plugins/jira/.claude-plugin/plugin.json
  • .claude-plugin/marketplace.json
  • docs/data.json

@openshift-ci openshift-ci Bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on May 21, 2026
@openshift-ci

openshift-ci Bot commented May 21, 2026

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
