Add ai-docs telemetry analysis to metrics plugin by Prashanth684 · Pull Request #450 · openshift-eng/ai-helpers

Prashanth684 · 2026-05-06T23:27:32Z

Follow up for: #437

Add /metrics:ai-docs-telemetry command to analyze Claude Code session logs for agentic documentation usage patterns. Tracks which ai-docs files are accessed, entry points (AGENTS.md vs direct search), and navigation patterns.

Usage:

Scan all recent sessions (last 7 days)

/metrics:ai-docs-telemetry -scan

Scan specific project

/metrics:ai-docs-telemetry -scan -project enhancements

Analyze specific session

/metrics:ai-docs-telemetry -session ~/.claude/projects/<project>/<session-id>.jsonl

Pipe to jq for analysis

/metrics:ai-docs-telemetry -scan | jq -r '.[] | .documentation.entry_point' | sort | uniq -c
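For readers without jq, the same tally can be sketched in Python. This is a minimal sketch that assumes only what the jq expression above implies: the `-scan` output is a JSON array whose items carry a `documentation.entry_point` field. The helper name `count_entry_points` and the sample values are hypothetical and are not part of the PR.

```python
import json
from collections import Counter


def count_entry_points(scan_output: str) -> Counter:
    """Tally documentation.entry_point across the events in a -scan JSON array."""
    events = json.loads(scan_output)
    return Counter(
        event.get("documentation", {}).get("entry_point", "unknown")
        for event in events
    )


# Hypothetical sample; real events are expected to carry more fields.
sample = json.dumps([
    {"documentation": {"entry_point": "AGENTS.md"}},
    {"documentation": {"entry_point": "direct_search"}},
    {"documentation": {"entry_point": "AGENTS.md"}},
])

# Print counts in the same "count value" layout that `uniq -c` produces.
for entry_point, count in count_entry_points(sample).most_common():
    print(f"{count:7d} {entry_point}")
```

Reading the JSON from a saved file instead of a string only requires passing the file's contents to the helper.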

Summary by CodeRabbit

New Features
- Added AI documentation usage telemetry command to track when users access AI-related documentation files
- Enables scanning of recent sessions with optional project filtering
- Supports analyzing individual session files and exporting results in JSON format for further analysis

Add /metrics:ai-docs-telemetry command to analyze Claude Code session logs for agentic documentation usage patterns. Tracks which ai-docs files are accessed, entry points (AGENTS.md vs direct search), and navigation patterns.

Usage:

    # Scan all recent sessions (last 7 days)
    /metrics:ai-docs-telemetry -scan

    # Scan specific project
    /metrics:ai-docs-telemetry -scan -project enhancements

    # Analyze specific session
    /metrics:ai-docs-telemetry -session ~/.claude/projects/<project>/<session-id>.jsonl

    # Pipe to jq for analysis
    /metrics:ai-docs-telemetry -scan | jq -r '.[] | .documentation.entry_point' | sort | uniq -c

openshift-ci · 2026-05-06T23:27:40Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Prashanth684

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [Prashanth684]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-05-06T23:27:44Z

Walkthrough

This pull request adds AI Docs telemetry capabilities to the metrics plugin. A new command scans Claude Code session JSONL logs to detect Read tool calls to ai-docs paths, emitting structured telemetry events as JSON output. Documentation and Python implementation support session scanning with project filtering or single-session analysis.
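The detection step described in the walkthrough can be sketched as follows. This is not the PR's `process_session()`; it is a minimal illustration under stated assumptions: each log line is a JSON object with a top-level `timestamp` and a `message.content` list, and a Read call appears there as a `tool_use` block with `input.file_path`. The function name `iter_ai_docs_reads` and the marker tuple are hypothetical, with the markers taken from the paths discussed in this review.

```python
import json
from typing import Iterator, Tuple

# Assumed markers for agentic documentation paths (from the review discussion).
AI_DOCS_MARKERS = ("ai-docs/", "AGENTS.md", "CLAUDE.md")


def iter_ai_docs_reads(jsonl_text: str) -> Iterator[Tuple[str, str]]:
    """Yield (file_path, timestamp) for each Read tool call on an ai-docs path."""
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") != "tool_use" or block.get("name") != "Read":
                continue
            tool_input = block.get("input")
            if not isinstance(tool_input, dict):
                continue
            file_path = str(tool_input.get("file_path", ""))
            if any(marker in file_path for marker in AI_DOCS_MARKERS):
                yield file_path, str(entry.get("timestamp", ""))


def _read_entry(timestamp: str, file_path: str) -> str:
    """Build one hypothetical log line containing a single Read tool call."""
    return json.dumps({
        "timestamp": timestamp,
        "message": {"content": [
            {"type": "tool_use", "name": "Read", "input": {"file_path": file_path}},
        ]},
    })


sample_log = "\n".join([
    _read_entry("2026-05-06T23:00:00Z", "/repo/AGENTS.md"),
    _read_entry("2026-05-06T23:00:05Z", "/repo/ai-docs/testing.md"),
    _read_entry("2026-05-06T23:00:09Z", "/repo/main.go"),
    "not json",
])

reads = list(iter_ai_docs_reads(sample_log))
print(reads)
```

Only the first two sample entries are yielded; the unrelated source file and the corrupt line are skipped.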

Changes

AI Docs Telemetry Feature

| Layer / File(s) | Summary |
| --- | --- |
| Data Models & Core Processing `plugins/metrics/scripts/ai_docs_telemetry.py` (lines 1–171) | Defines `FileAccess`, `PlatformInfo`, `RepositoryInfo`, `DocumentationInfo`, and `TelemetryEvent` dataclasses. Implements `extract_repo_info()`, `detect_entry_point()`, and `process_session()` to parse JSONL logs, filter for Read tool calls to ai-docs paths, and emit structured telemetry. |
| Session Scanning & CLI `plugins/metrics/scripts/ai_docs_telemetry.py` (lines 174–248) | Implements `scan_recent_sessions()` to traverse recent `~/.claude/projects/*/.jsonl` files (7-day window) with optional project substring filtering. Adds `main()` entry point supporting `-scan`, `-project`, and `-session` CLI modes. Outputs JSON to stdout and summary to stderr. |
| Command Documentation `plugins/metrics/commands/ai-docs-telemetry.md` | Documents command metadata, synopsis, description, implementation details, return values, and usage examples including scanning recent sessions, filtering by project, analyzing single files, and piping to `jq`. |
| Plugin Integration `plugins/metrics/README.md` | Updates overview and Commands section to introduce `/metrics:ai-docs-telemetry` with links to command documentation and quick-start examples. Expands Source Code section to list the new telemetry script and commands directory. |
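The session-scanning row can be illustrated with a short sketch of a 7-day window plus project substring filter. This is not the PR's `scan_recent_sessions()`; the helper name `find_recent_sessions`, the one-directory-per-project layout with `.jsonl` files inside, and the use of file modification time to decide recency are all assumptions made for illustration.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional

SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60


def find_recent_sessions(root: Path, project: Optional[str] = None,
                         now: Optional[float] = None) -> List[Path]:
    """Return .jsonl files one level below root that were modified in the last 7 days."""
    now = time.time() if now is None else now
    cutoff = now - SEVEN_DAYS_SECONDS
    recent = []
    for session_file in sorted(root.glob("*/*.jsonl")):
        # Optional substring match against the project directory name.
        if project and project not in session_file.parent.name:
            continue
        if session_file.stat().st_mtime >= cutoff:
            recent.append(session_file)
    return recent


# Demonstration against a throwaway directory with hypothetical project names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fresh = root / "-home-user-enhancements" / "a.jsonl"
    stale = root / "-home-user-enhancements" / "b.jsonl"
    other = root / "-home-user-installer" / "c.jsonl"
    for path in (fresh, stale, other):
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("{}\n")
    eight_days_ago = time.time() - 8 * 24 * 60 * 60
    os.utime(stale, (eight_days_ago, eight_days_ago))  # push b.jsonl outside the window
    found = find_recent_sessions(root, project="enhancements")

print([p.name for p in found])
```

Passing `now` explicitly keeps the window testable without touching file timestamps.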

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 10

✅ Passed checks (10 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding ai-docs telemetry analysis capability to the metrics plugin, which aligns with all three file changes (README updates, command documentation, and new Python script). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| No Real People Names In Style References | ✅ Passed | No references to real people by name found in plugin commands, documentation, or example prompts. All content is technical and functional, with no style references using real person names. |
| No Assumed Git Remote Names | ✅ Passed | No git remote operations or hardcoded remote names found. PR adds ai-docs telemetry tracking, not involving git operations. |
| Git Push Safety Rules | ✅ Passed | The PR adds telemetry analysis for ai-docs usage. No git push operations, force pushes, or autonomous push workflows are present in any of the new files. |
| No Untrusted Mcp Servers | ✅ Passed | No MCP server installations detected in PR. Adds only documentation and Python script with standard libraries. |
| Ai-Helpers Overlap Detection | ✅ Passed | No overlapping functionality detected. PR adds unique ai-docs telemetry command to metrics plugin. No existing commands track documentation usage. Command name is unique. |


coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/metrics/commands/ai-docs-telemetry.md`:
- Around line 10-13: Fenced code blocks in the ai-docs-telemetry markdown are
missing language identifiers and trigger markdownlint MD040; update each
triple-backtick block that contains CLI examples (e.g. blocks containing
"/metrics:ai-docs-telemetry -scan [-project <name>]",
"/metrics:ai-docs-telemetry -session <path-to-session.jsonl>",
"/metrics:ai-docs-telemetry -scan", "/metrics:ai-docs-telemetry -scan -project
enhancements", "/metrics:ai-docs-telemetry -scan -project
machine-config-operator", and the session path example like
"~/.claude/projects/<project>/<session-id>.jsonl") to include a language tag
(use bash) immediately after the opening `` ``` `` so each block starts with `` ```bash ``.

In `@plugins/metrics/scripts/ai_docs_telemetry.py`:
- Around line 144-148: The telemetry currently appends the raw file_path into
ai_docs_files (via FileAccess) which can leak local identifiers; add a sanitizer
function (e.g., redact_documentation_path) outside this block and call it before
creating FileAccess so that you store a redacted path instead of the raw
file_path; update the code that constructs FileAccess (the
ai_docs_files.append(...) call) to pass redact_documentation_path(file_path) for
the path field and keep sequence and time unchanged to preserve ordering and
timestamps.
- Around line 236-241: The -scan branch currently only prints JSON when events
is truthy; change it so it always emits a JSON array (possibly empty) from
scan_recent_sessions(args.project) — call scan_recent_sessions into events and
unconditionally print json.dumps([asdict(e) for e in events], indent=2) even if
events is empty, ensuring downstream jq pipelines always receive valid JSON;
update the block around args.scan, scan_recent_sessions, events and the asdict
conversion accordingly.
- Around line 204-209: The pre-filter in the loop around
session_file.read_text() wrongly only checks for "ai-docs/" or "AGENTS.md" and
thus skips valid sessions that reference "CLAUDE.md"; also it swallows read
exceptions silently. Update the predicate to include "CLAUDE.md" (e.g., check
for "ai-docs/" or "AGENTS.md" or "CLAUDE.md") so those sessions are not skipped,
and change the except Exception block in the same scope (around
session_file.read_text()) to log the exception (using the existing logger) with
context about the file instead of silently continuing so read errors are visible
for telemetry counting. Ensure you modify the checks and the error handling
where session_file.read_text() is called and the surrounding try/except.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c80a3b1f-782d-4b98-84df-34ba44fb0ab3

📥 Commits

Reviewing files that changed from the base of the PR and between d2de5a1 and db733b1.

📒 Files selected for processing (3)

plugins/metrics/README.md
plugins/metrics/commands/ai-docs-telemetry.md
plugins/metrics/scripts/ai_docs_telemetry.py

coderabbitai · 2026-05-06T23:31:16Z

+```
+/metrics:ai-docs-telemetry -scan [-project <name>]
+/metrics:ai-docs-telemetry -session <path-to-session.jsonl>
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifiers to fenced code blocks.

Several fenced blocks are missing a language tag, which triggers markdownlint MD040 and can fail/pollute docs CI.

🛠️ Suggested fix

    -```
    +```bash
     /metrics:ai-docs-telemetry -scan [-project <name>]
     /metrics:ai-docs-telemetry -session <path-to-session.jsonl>

/metrics:ai-docs-telemetry -scan

/metrics:ai-docs-telemetry -scan -project enhancements

/metrics:ai-docs-telemetry -scan -project machine-config-operator

/metrics:ai-docs-telemetry -session ~/.claude/projects/<project>/<session-id>.jsonl

Also applies to: 45-47, 64-66, 69-71, 74-76

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 10-10: Fenced code blocks should have a language specified (MD040, fenced-code-language)

coderabbitai · 2026-05-06T23:31:16Z

+                    ai_docs_files.append(FileAccess(
+                        path=file_path,
+                        sequence=len(ai_docs_files) + 1,
+                        time=timestamp
+                    ))


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Raw file_path in telemetry can leak local identifiers.

Line 145 stores the full tool input path. Absolute paths can expose usernames or sensitive local structure, conflicting with anonymous telemetry goals.

🔒 Suggested fix

     ai_docs_files.append(FileAccess(
    -    path=file_path,
    +    path=redact_documentation_path(file_path),
         sequence=len(ai_docs_files) + 1,
         time=timestamp
     ))

Add a small sanitizer helper (outside this range), for example:

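    import pathlib

    def redact_documentation_path(file_path: str) -> str:
        # Normalize Windows separators so the substring check works on any platform.
        normalized = file_path.replace("\\", "/")
        if "ai-docs/" in normalized:
            # Keep only the part of the path from ai-docs/ onward.
            return "ai-docs/" + normalized.split("ai-docs/", 1)[1]
        return pathlib.PurePath(normalized).name  # AGENTS.md / CLAUDE.md fallback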
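To make the effect of the suggested helper concrete, the sketch below repeats it and runs it on a few paths. The example paths and the user name in them are hypothetical.

```python
import pathlib


def redact_documentation_path(file_path: str) -> str:
    """Keep only the ai-docs-relative part of a path, or the bare file name."""
    normalized = file_path.replace("\\", "/")
    if "ai-docs/" in normalized:
        return "ai-docs/" + normalized.split("ai-docs/", 1)[1]
    return pathlib.PurePath(normalized).name  # AGENTS.md / CLAUDE.md fallback


# Hypothetical absolute paths that would otherwise expose a local user name.
examples = [
    "/home/alice/src/enhancements/ai-docs/testing/e2e.md",
    "C:\\Users\\alice\\repo\\ai-docs\\overview.md",
    "/home/alice/src/enhancements/AGENTS.md",
]
redacted = [redact_documentation_path(raw) for raw in examples]
for value in redacted:
    print(value)
```

None of the redacted values contains the home directory or user name, while the position of each file inside `ai-docs/` is preserved for analysis.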

coderabbitai · 2026-05-06T23:31:16Z

+        try:
+            content = session_file.read_text()
+            if not ("ai-docs/" in content or "AGENTS.md" in content):
+                continue
+        except Exception:
+            continue


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pre-filter drops valid CLAUDE.md sessions (and silently hides read errors).

Line 206 pre-filtering omits CLAUDE.md, so sessions that only touch that entry point are skipped before parsing. Also, read failures are swallowed, which masks undercounting.

✅ Suggested fix

     try:
         content = session_file.read_text()
    -    if not ("ai-docs/" in content or "AGENTS.md" in content):
    +    if not ("ai-docs/" in content or "AGENTS.md" in content or "CLAUDE.md" in content):
             continue
    -except Exception:
    +except (OSError, UnicodeError) as e:
    +    print(f"Skipping unreadable session {session_file}: {e}", file=sys.stderr)
         continue

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

    -try:
    -    content = session_file.read_text()
    -    if not ("ai-docs/" in content or "AGENTS.md" in content):
    -        continue
    -except Exception:
    -    continue
    +try:
    +    content = session_file.read_text()
    +    if not ("ai-docs/" in content or "AGENTS.md" in content or "CLAUDE.md" in content):
    +        continue
    +except (OSError, UnicodeError) as e:
    +    print(f"Skipping unreadable session {session_file}: {e}", file=sys.stderr)
    +    continue

🧰 Tools

🪛 Ruff (0.15.12)

[error] 208-209: try-except-continue detected, consider logging the exception

(S112)

[warning] 208-208: Do not catch blind exception: Exception

(BLE001)
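The pre-filter behavior requested in this comment can be sketched as a standalone function. This is an illustration rather than the PR's code: the name `mentions_ai_docs` and the marker tuple are hypothetical, and the function returns a boolean instead of using `continue` so that it can be exercised outside the scan loop.

```python
import sys
import tempfile
from pathlib import Path

# Entry points and paths named in the review comment.
ENTRY_MARKERS = ("ai-docs/", "AGENTS.md", "CLAUDE.md")


def mentions_ai_docs(session_file: Path) -> bool:
    """Cheap substring pre-filter to run before full JSONL parsing."""
    try:
        content = session_file.read_text()
    except (OSError, UnicodeError) as exc:
        # Report the failure so skipped sessions are visible, then move on.
        print(f"Skipping unreadable session {session_file}: {exc}", file=sys.stderr)
        return False
    return any(marker in content for marker in ENTRY_MARKERS)


with tempfile.TemporaryDirectory() as tmp:
    claude_only = Path(tmp) / "claude-only.jsonl"
    claude_only.write_text('{"file_path": "/repo/CLAUDE.md"}\n')
    unrelated = Path(tmp) / "unrelated.jsonl"
    unrelated.write_text('{"file_path": "/repo/main.go"}\n')
    missing = Path(tmp) / "missing.jsonl"  # never created, so reading it fails
    results = {p.name: mentions_ai_docs(p) for p in (claude_only, unrelated, missing)}

print(results)
```

A session that only touches `CLAUDE.md` now passes the filter, and the unreadable file produces a message on stderr instead of disappearing silently.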

coderabbitai · 2026-05-06T23:31:16Z

+    if args.scan:
+        events = scan_recent_sessions(args.project)
+        if events:
+            # Output as JSON array
+            print(json.dumps([asdict(e) for e in events], indent=2))
+    elif args.session:


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Always emit JSON for -scan (including empty results).

Current behavior prints nothing when no events are found. That breaks JSON-contract expectations and makes downstream jq pipelines brittle.

🧩 Suggested fix

     if args.scan:
         events = scan_recent_sessions(args.project)
    -    if events:
    -        # Output as JSON array
    -        print(json.dumps([asdict(e) for e in events], indent=2))
    +    # Always output JSON array (possibly empty)
    +    print(json.dumps([asdict(e) for e in events], indent=2))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

    -if args.scan:
    -    events = scan_recent_sessions(args.project)
    -    if events:
    -        # Output as JSON array
    -        print(json.dumps([asdict(e) for e in events], indent=2))
    -elif args.session:
    +if args.scan:
    +    events = scan_recent_sessions(args.project)
    +    # Always output JSON array (possibly empty)
    +    print(json.dumps([asdict(e) for e in events], indent=2))
    +elif args.session:
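A small sketch shows why unconditional output keeps the JSON contract intact. The `Event` dataclass is a hypothetical stand-in for the PR's `TelemetryEvent`, whose real fields are not shown in this review, and `render_scan_output` is an illustrative helper name.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Event:
    """Hypothetical stand-in for the PR's TelemetryEvent."""
    session_id: str
    entry_point: str


def render_scan_output(events: List[Event]) -> str:
    """Serialize events as a JSON array, including when the list is empty."""
    return json.dumps([asdict(e) for e in events], indent=2)


empty_output = render_scan_output([])
print(empty_output)

# A consumer can parse the result without special-casing "no events".
parsed = json.loads(empty_output)
print(len(parsed))
```

With the original `if events:` guard, stdout would be empty in this case and a consumer calling `json.loads("")` would raise `json.JSONDecodeError`.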

Added session_scraper.py following PR openshift-eng#450 pattern to extract file access patterns from Claude Code JSONL session logs.

Features:
- Scrapes ~/.claude/projects/**/*.jsonl files
- Extracts file access patterns, navigation sequences, timing data
- Identifies entry points (AGENTS.md vs direct search)
- Aggregates metrics across multiple sessions
- Exports structured JSON for analysis

Implementation:
- lib/metrics/session_scraper.py (417 lines)
- SessionScraper class with session file parsing
- FileAccess, NavigationSequence, SessionTelemetry dataclasses
- Aggregate metrics calculation
- JSON export functionality

Testing:
- tests/test_session_scraper.py (6 tests, all passing)
  - test_is_agentic_doc_path
  - test_extract_file_access
  - test_scrape_session_file
  - test_navigation_sequences
  - test_aggregate_metrics
  - test_export_to_json

Documentation:
- Updated README.md with session scraping usage examples
- Updated TEST_REPORT.md to mark enhancement as complete

This completes the optional enhancement from REFACTOR_MAY_8.md Task 5.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

openshift-ci · 2026-05-18T09:15:40Z

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci Bot requested review from bryan-cox and enxebre May 6, 2026 23:27

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 6, 2026

coderabbitai Bot reviewed May 6, 2026

kenjpais added a commit to kenjpais/ai-helpers that referenced this pull request May 14, 2026

Add metrics plugin from PR openshift-eng#450 for session telemetry

3bc9046

openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 18, 2026

Prashanth684 wants to merge 1 commit into openshift-eng:main from Prashanth684:agentic-docs-metrics
