Skip to content

Latest commit

 

History

History
281 lines (201 loc) · 10.2 KB

File metadata and controls

281 lines (201 loc) · 10.2 KB

goodboy vs. The World

How Does This Compare to Existing Tools?


vs. Traditional BDD Tools

Cucumber/SpecFlow/Behave goodboy
Who writes specs Developers (in practice) Anyone
Language Gherkin DSL Natural conversation
Learning curve Learn syntax, structure, keywords Just describe what should happen
Maintenance Manual step definitions Auto-generated
Verification Run tests manually Automatic on every behavior
Non-technical friendly "Supposed to be" Actually is
Output Terminal text Visual HTML diagrams

Bottom line: BDD promised business-readable specs. But non-technical people never actually wrote Cucumber scenarios because you still need to understand Given/When/Then, step definitions, and test frameworks. We deliver on the promise using AI as the translator.


vs. AI Agent Frameworks

vs. obra/superpowers

Superpowers goodboy
Target user Developers Anyone (especially non-technical)
Output format Code, plans, specs Behavioral maps, visual diagrams
Enforcement Prompt-based (HARD-GATE) Hook-based (mechanical blocking)
Artifacts DESIGN.md, PLAN.md .feature files (executable)
Skill focus Code quality, TDD, architecture Behavioral verification, stakeholder communication
Relationship Inspiration Standalone plugin

Comparison: goodboy is a standalone Claude Code plugin. It adds a "stakeholder track" to any project. Both tracks use the same codebase and test suite, just different interfaces.

vs. AutoGen/CrewAI/LangChain

Multi-Agent Frameworks goodboy
Focus Agent orchestration Human-agent communication
Abstraction Agents with roles/tools Behavioral verification layer
Output Whatever agents produce Strictly behavioral
Testing Not built-in Core feature
Non-technical use Not addressed Primary use case

Potential integration: goodboy's enforcement patterns could be ported to these frameworks. The behavioral verification layer is framework-agnostic.


vs. Visual Agent Tools

vs. nicobailon/visual-explainer

Visual Explainer goodboy
Purpose Make output prettier Enforce behavioral thinking
Timing After agent responds Before agent responds
Verification None Core feature
Accumulation One-shot HTML Living .feature files
Relationship We use its patterns We extend the concept

We adopted: HTML generation, Mermaid diagrams, auto-trigger on complexity
We added: Behavioral mapping, verification loop, persistent specs

vs. Rivet (Ironclad)

Rivet goodboy
Interface Visual node editor Natural conversation
Audience Technical (understands flows) Non-technical
Output Agent workflows Behavioral specs
Testing Not the focus Core feature

Different tools for different jobs. Rivet lets you build agent flows. We let non-technical people describe system behaviors without seeing flows.


vs. Traditional Requirements Tools

vs. Jira/Linear/Asana

Project Management Tools goodboy
Spec format Free-form tickets Structured behavioral maps
Verification Manual testing Automatic
Executable No Yes (.feature files)
Ambiguity Common (different interpretations) Reduced (agent asks clarifying questions)
Developer handoff "Read this ticket" "Here's a failing test"

Common workflow today:

  1. PM writes ticket in Jira
  2. Developer reads ticket
  3. Developer misunderstands
  4. Developer builds wrong thing
  5. PM says "that's not what I meant"
  6. Loop back to step 1

With goodboy:

  1. PM describes behavior conversationally
  2. Agent maps it (visual)
  3. PM confirms or corrects
  4. Behavior becomes failing test
  5. Developer fixes to make test pass
  6. PM sees "behavior now passing ✓"

vs. Confluence/Notion Documentation

Documentation Tools goodboy
Format Docs/wiki pages Behavioral flows
Truth Gets out of date Always in sync (tests fail if not)
Executable No Yes
Discovery Search/browse Natural language queries

The problem with docs: They rot. Code changes, docs don't.
Our solution: The .feature file IS the code contract. If they diverge, tests fail.


vs. Test Generation Tools

vs. GitHub Copilot / Cursor (Test Generation)

AI Test Generators goodboy
Who uses it Developers Anyone
Input Code (generate tests from code) Behavior (generate tests from description)
Output Test code Behavioral verification + test code
Direction Code → Tests Behavior → Tests → Code
Non-technical No Yes

Different direction of flow.

  • Copilot: "Here's my code, write tests for it"
  • goodboy: "Here's what should happen, verify it does"

vs. Claude Projects / Custom GPTs

Custom Instructions goodboy
Enforcement Prompt-based (can be ignored) Hook-based (mechanically enforced)
Structure Unstructured conversation Behavioral mapping framework
Accumulation Manual (copy/paste) Automatic (.feature files)
Testing Not integrated Built-in
Portability Platform-specific Works across agents

You could approximate this with clever prompts. But without hook enforcement, the agent will eventually slip into showing code. And without the accumulation layer, you're not building a persistent spec.


vs. Fabric (Daniel Miessler)

Fabric goodboy
Focus Personal workflow automation System behavior specification
Patterns Reusable prompt templates Reusable behavioral verifications
Audience Technical individuals Teams (technical + non-technical)
Testing Not the focus Core feature

Similar philosophy (structured patterns), different application.
Fabric: "Here are patterns for common AI tasks"
goodboy: "Here's a pattern for behavior-first development"


vs. MCP (Model Context Protocol)

MCP goodboy
Layer Tool integration standard Communication standard
Problem How agents connect to tools How agents talk to non-technical users
Relationship Complementary Could use MCP for tool integrations

Not competing, different layers. MCP standardizes how agents connect to Slack, GitHub, databases. goodboy standardizes how agents communicate with stakeholders.


Key Differentiators

1. Mandatory Verification

Most tools suggest or encourage behavioral thinking. We enforce it.
The agent literally cannot respond without completing a valid behavioral map.

2. Mechanical Enforcement

Not "please use behavioral language" (prompt).
But "you cannot output code" (hook blocks it at the system level).

3. Non-Technical First

Every other framework assumes technical literacy.
We assume nothing. If you can describe what should happen, you can use this.

4. Living Executable Specs

Not docs that get stale. Not tests divorced from requirements.
The behavioral spec IS the test suite. They can't diverge.

5. Verification as Communication

Other tools: agent does work, shows you result
goodboy: agent must prove its thinking behaviorally before responding
The behavioral map catches flawed reasoning, not just flawed output.


When NOT to Use goodboy

You're writing infrastructure code

If there's no user-visible behavior (building a database migration, optimizing queries, refactoring internal APIs), this tool isn't the right fit. Use regular Claude Code skills.

You prefer seeing code

Some developers think better reading code than behavioral descriptions. That's fine! This is an additional interface, not a replacement.

You're prototyping rapidly

The behavioral verification adds overhead (10-30% slower responses). If you're in early exploration mode and just want fast answers, disable behavior-first mode until you're ready to solidify specs.

Your codebase has no tests

goodboy generates tests and runs them. If your project doesn't have a test setup, you'll need to bootstrap that first (though the agent can help with that too).


When TO Use goodboy

✅ Cross-functional collaboration

PMs, designers, QA, engineers all need to agree on behavior

✅ Customer-facing features

Where user experience is critical and needs to be specified precisely

✅ Regulatory/compliance work

Where behavioral specs need to be documented and verified

✅ Onboarding new team members

The .feature files explain what the system does without code knowledge

✅ Legacy system documentation

Describe behaviors conversationally, agent generates specs + tests

✅ Stakeholder demos

Show behavioral maps, not code. Everyone understands.


The Unique Position

                    Technical Literacy Required
                              ↑
                              |
         AutoGen, CrewAI ●    |    ● Superpowers
         LangChain        ●   |    ● GitHub Copilot
                              |
    ----------------------- Developer Line -----------------------
                              |
                              |    ● goodboy
                              |      (You are here)
                              |
    Jira, Notion ●           |
    Confluence   ●           |
                              ↓
                    No Technical Knowledge Needed

Everyone else targets the top half. We're the only tool designed for the bottom half while maintaining the same code quality and test coverage as developer tools.


The Ecosystem Fit

goodboy isn't replacing anything. It's the missing piece:

  • Requirements tools (Jira) → goodboyDevelopment tools (Claude Code)
  • Stakeholder languagegoodboyCode
  • Behavioral specsgoodboyTests

It's the bridge between "what we want" and "what we built."


Choose the right tool for the job. For behavior specification with non-technical stakeholders, we think we're the right tool.