Skip to content

Conversation

@TechNickAI
Copy link
Owner

Summary

  • empathy-reviewer: UX review agent that sits in the user's chair, anchored in wisdom from Don Norman, Steve Krug, Dieter Rams, and Kathy Sierra. Reviews for task completion, simplicity, delight, flow state protection, and accessibility.
  • robustness-reviewer: Production readiness agent that asks "Will this code survive contact with production?" Reviews for type safety, error handling, data integrity, graceful degradation, and observability.

Both agents extracted from carmenta and genericized - no project-specific references remain.

Test plan

  • Verify agents appear in Claude Code agent list
  • Test semantic triggering with prompts like "review this for UX" and "is this production ready"
  • Confirm no carmenta-specific references in either file

🤖 Generated with Claude Code

Nick Sullivan and others added 2 commits January 10, 2026 13:53
Sits in the user's chair to evaluate interfaces through the lens of task
completion. Anchored in UX wisdom from Don Norman, Steve Krug, Dieter Rams,
and Kathy Sierra.

Reviews for: task completion, simplicity, user perspective, delight, error
recovery, flow state protection, and accessibility.

Extracted from carmenta, genericized for marketplace distribution.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Reviews through one lens: Will this code survive contact with production?

Evaluates: type safety, error handling, abstraction health, data integrity,
infrastructure independence, resource management, graceful degradation, and
observability.

Extracted from carmenta, genericized for marketplace distribution.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@claude
Copy link

claude bot commented Jan 10, 2026

Pull Request Review: empathy-reviewer and robustness-reviewer Agents

I've completed a comprehensive review of this PR. Overall, this is excellent work that adds significant value to the agent ecosystem. The agents are well-crafted, thoughtfully designed, and follow repository conventions closely. Here's my detailed analysis:


✅ Strengths

1. Outstanding Documentation Quality

Both agents demonstrate exceptional clarity and depth:

  • Mission statements establish clear identity and purpose
  • Review dimensions are comprehensive yet focused
  • Examples are practical and teach the right patterns
  • Severity guides provide clear decision frameworks

The empathy-reviewer's anchoring in UX wisdom (Norman, Krug, Rams, Sierra) provides philosophical grounding that elevates the agent beyond a checklist reviewer.

2. Strong Alignment with Repository Conventions

YAML Frontmatter (plugins/core/agents/empathy-reviewer.md:1-8, plugins/core/agents/robustness-reviewer.md:1-8):

  • ✅ Proper # prettier-ignore before descriptions
  • ✅ "Use when..." format for semantic triggering
  • ✅ Appropriate color codes (purple for UX, orange for robustness)
  • ✅ Model specification (opus for complex reviews)
  • ✅ Skill references follow ai-coding-config:skill-name format

Structural Patterns:

  • Both agents use XML-style tags consistently (<mission>, <review-dimensions>, <severity-guide>)
  • Clear separation between primary and secondary concerns
  • Practical code examples throughout

3. Complementary Agent Design

The two agents have excellent division of responsibilities:

  • empathy-reviewer: User-facing concerns (UX, accessibility, delight, flow state)
  • robustness-reviewer: System concerns (reliability, error handling, production readiness)

They reference each other's domain boundaries without overlap, creating a cohesive review ecosystem.

4. Pattern Consistency

Both agents follow consistent internal structure:

  1. Mission statement
  2. Primary review dimensions with examples
  3. Secondary concerns
  4. Review approach
  5. Severity guide

This consistency makes them easy to understand and use.


🔍 Issues & Suggestions

Critical Issues

None identified. The code is production-ready.

High Priority Suggestions

1. Skill References May Not Match Available Skills

Location: robustness-reviewer.md:8

skills: ai-coding-config:systematic-debugging, ai-coding-config:research

Issue: The frontmatter references skills, but I should verify these skills exist in plugins/core/skills/. If they don't exist yet, the agent won't be able to invoke them.

Recommendation: Verify that systematic-debugging.md and research.md exist in plugins/core/skills/. If not, either:

  • Remove the references until the skills are available
  • Add placeholder skill files
  • Update to reference existing skills

Medium Priority Suggestions

2. Version Numbering Strategy

Location: Both agents use version: 1.0.0

Observation: Other agents in the repo use versions like 1.1.0 (error-handling-reviewer, design-reviewer, observability-reviewer). This suggests they've been updated since initial creation.

Suggestion: Consider whether these should start at 1.0.0 (new agents) or match the ecosystem version pattern. Consistency helps with versioning expectations.

3. Example Code Comments Use // in Non-JS/TS Context

Location: empathy-reviewer.md (multiple locations)

Observation: Examples show JSX/TypeScript code, which is appropriate. However, some examples show comments like:

// Complex: timezone selector with 400 options
// Simple: detect timezone, show "9am your time (PST)"

These aren't code examples but comparative demonstrations. Consider formatting these differently to distinguish them from actual code:

<comparison>
Complex approach: timezone selector with 400 options
Simple approach: detect timezone, show "9am your time (PST)"
</comparison>

Impact: Minor - current format is clear enough, but could be more semantically structured.

4. Robust Example Formatting Could Be More Consistent

Location: robustness-reviewer.md (multiple code examples)

Observation: Some examples use <robust-example> tags, which is great. However, several examples in the review-dimensions section don't use consistent wrapper tags.

Suggestion: Ensure all code examples use the <robust-example> wrapper for consistency with the stated pattern.

Low Priority Polish

5. Consider Adding Triggers Array

Location: Both agent frontmatter sections

Observation: Per CLAUDE.md, skills have a triggers array for natural language activation:

triggers:
  - "debug"
  - "investigate"

Suggestion: Consider adding triggers to agent frontmatter if the plugin system supports it. This could improve semantic matching for agent invocation:

# empathy-reviewer.md
triggers:
  - "ux review"
  - "user experience"
  - "accessibility"
  - "is this usable"
  - "user perspective"

# robustness-reviewer.md  
triggers:
  - "production ready"
  - "will this break"
  - "error handling"
  - "reliability"
  - "robust"

6. Minor Typo in Robustness Reviewer

Location: robustness-reviewer.md:130-131

Current formatting has a line break issue:

try { return await fetch(url, { signal: controller.signal }); } finally {
clearTimeout(timeout); } </robust-example>

Should be:

try {
  return await fetch(url, { signal: controller.signal });
} finally {
  clearTimeout(timeout);
}
</robust-example>

🔬 Alignment with Repository Standards

Git Interaction Standards (rules/git-interaction.mdc)

✅ PR follows proper format:

  • Emoji prefix: ✨
  • Clear title: "Add empathy-reviewer and robustness-reviewer agents"
  • Proper summary with bullet points
  • Test plan included

Prompt Engineering (rules/prompt-engineering.mdc)

✅ Both agents follow LLM-to-LLM communication best practices:

  • Clear XML-style structural delimiters
  • Goal-focused rather than over-prescriptive
  • Positive pattern examples (no anti-patterns shown)
  • Semantic tag names (<mission>, not <section-1>)
  • Consistent terminology throughout

Project Architecture

✅ Proper plugin-first architecture:

  • Files in plugins/core/agents/
  • Would be automatically discovered by Claude Code
  • Follow established naming conventions

🎯 Test Plan Verification

The PR description includes a test plan:

  • Verify agents appear in Claude Code agent list
  • Test semantic triggering with prompts like "review this for UX" and "is this production ready"
  • Confirm no carmenta-specific references in either file

Status:
✅ No carmenta-specific references found in either file
✅ Semantic descriptions should trigger properly
⚠️ Can only verify appearance in agent list after merge


📊 Metrics

  • Lines Added: 485 (260 empathy, 225 robustness)
  • Files Changed: 2 (both new)
  • Documentation Quality: Excellent
  • Code Examples: 15+ practical examples
  • Convention Adherence: 95%+

🎬 Recommendation

APPROVE with minor suggestions

This PR is ready to merge. The agents are well-designed, thoroughly documented, and follow repository conventions. The suggestions above are enhancements, not blockers.

The only item worth checking before merge is verifying that the skill references in robustness-reviewer.md point to existing skills. Everything else is optional polish.

These agents will be valuable additions to the AI coding toolkit. The empathy-reviewer brings much-needed UX perspective to code reviews, while robustness-reviewer adds a critical production-readiness lens.

Great work! 🎉


Next Steps After Merge

  1. Update .claude-plugin/marketplace.json version if needed
  2. Test agent triggering with real prompts
  3. Consider adding the triggers array if supported
  4. Monitor agent usage to refine descriptions for better semantic matching

Redesigned agent listing for better developer experience:
- Task-oriented grouping: "When building features", "When debugging", etc
- Updated count from 22 → 24 (adds empathy-reviewer, robustness-reviewer)
- Added emoji visual markers for quick scanning
- More discoverable: users find agents by task rather than category

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@claude
Copy link

claude bot commented Jan 10, 2026

PR Review: ✨ Add empathy-reviewer and robustness-reviewer agents

Summary

Excellent addition of two high-value agents with well-crafted prompts. Both agents follow established patterns and add meaningful coverage to the agent suite. The README reorganization significantly improves discoverability.

Code Quality ✅

Strengths

  1. Exceptional prompt engineering: Both agents follow prompt-engineering.mdc principles perfectly

    • Goal-focused instructions rather than prescriptive steps
    • Rich examples showing correct patterns (no anti-patterns)
    • Clear XML structure with semantic tags
    • Front-loaded critical information
  2. Consistent with codebase conventions:

    • Proper YAML frontmatter with # prettier-ignore for descriptions
    • "Use when..." description format for semantic matching
    • Correct color assignments (purple for UX, orange for correctness/bugs)
    • Version 1.0.0 for new agents
  3. Strong conceptual clarity:

    • empathy-reviewer: Anchored in design wisdom (Norman, Krug, Rams, Sierra) - makes the philosophy concrete
    • robustness-reviewer: "Fail Loud, Recover at Boundaries" principle is a clear mental model
    • Both have well-defined missions and review dimensions
  4. Excellent README organization: The new categorization (🚀 building, 🐛 debugging, ✅ correctness, ⚡ optimizing, 🎨 UX, 📐 architecture, 📝 polishing) makes 24 agents discoverable

Technical Observations

empathy-reviewer.md:

  • Model: opus - appropriate for nuanced UX evaluation
  • Skills: ai-coding-config:research - good for looking up UX patterns
  • Color: purple - correct per color scheme (Design/UX)
  • Review dimensions are comprehensive and actionable

robustness-reviewer.md:

  • Model: opus - appropriate for complex production readiness analysis
  • Skills: Both systematic-debugging and research - excellent pairing
  • Color: orange - correct per color scheme (Bugs/correctness)
  • "Fail Loud, Recover at Boundaries" is a powerful organizing principle

Potential Issues 🟡

Medium: Missing Triggers Field

Neither agent includes a triggers array in frontmatter. Per .claude/CLAUDE.md:

Skills have a triggers array for natural language phrases that activate them

While the description is semantic-rich, explicit triggers improve activation reliability. Consider adding:

# empathy-reviewer.md
triggers:
  - "review UX"
  - "user experience"
  - "is this user friendly"
  - "accessibility"
  - "how does this feel"
  - "flow state"
# robustness-reviewer.md
triggers:
  - "production ready"
  - "will this break"
  - "error handling"
  - "resilience"
  - "fragile code"
  - "graceful degradation"

Low: Severity Guide Formatting

Both agents have severity guides, but formatting varies slightly from other agents. Compare:

These agents:

critical: Will cause outages...
high: Likely to cause bugs...

Other agents (logic-reviewer, security-reviewer):

Severity guide format shows bullet points or more structure

Not a functional issue, just a minor consistency observation.

Performance Considerations ✅

Both agents use model: opus, which is appropriate given:

  • Complex judgment calls (UX nuance, production readiness)
  • Need for deep reasoning about user impact and failure modes
  • These are review agents, not high-frequency operations

No performance concerns.

Security Concerns ✅

No security issues. The agents review code but don't execute it or interact with external systems.

Test Coverage 📋

Test plan in PR description is solid:

  • ✅ Verify agents appear in Claude Code agent list
  • ✅ Test semantic triggering
  • ✅ Confirm no project-specific references

Suggestion: Add example prompts to test plan:

  • empathy-reviewer: "Review this form for UX" or "Is this user-friendly?"
  • robustness-reviewer: "Is this production ready?" or "Will this code break in prod?"

Architecture & Design ✨

Excellent Prompt Design Patterns

empathy-reviewer demonstrates masterful use of empathy in technical review:

  • <ux-wisdom> section establishes philosophical grounding
  • <empathetic-example> tags frame examples in user context
  • "Becoming the user" is concrete and actionable

robustness-reviewer demonstrates clear architectural thinking:

  • "Fail Loud, Recover at Boundaries" creates a unified mental model
  • Specific patterns for when try/catch is allowed
  • <robust-example> tags reinforce correct patterns

README Organization

The new categorical organization (24 agents in 7 categories) is a massive UX improvement. Before: alphabetical list. After: organized by workflow phase.

Recommendations

Required for Merge

None - PR is merge-ready as-is.

Suggested Improvements (Follow-up)

  1. Add triggers arrays to both agents for more reliable activation
  2. Consider minor severity guide formatting for consistency
  3. Add example test prompts to test plan to make validation easier

Final Assessment

Merge recommendation: ✅ APPROVED

This PR adds significant value:

  • Two high-quality agents with excellent prompt engineering
  • Strong conceptual frameworks (design wisdom, fail-loud principle)
  • Dramatically improved README organization
  • No bugs, security issues, or performance concerns
  • Follows all codebase conventions

The agents are production-ready and will provide immediate value. The optional improvements (triggers, formatting) can be addressed in follow-up PRs without blocking this one.

Great work! The prompt engineering here is exemplary - particularly the grounding in design philosophy (empathy-reviewer) and the clear "fail loud" principle (robustness-reviewer).


Review generated by Claude Code with empathy-reviewer and robustness-reviewer themselves in mind 😊

@TechNickAI TechNickAI merged commit 29d0362 into main Jan 10, 2026
2 checks passed
@TechNickAI TechNickAI deleted the add-empathy-robustness-agents branch January 10, 2026 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants