| | Cucumber/SpecFlow/Behave | goodboy |
|---|---|---|
| Who writes specs | Developers (in practice) | Anyone |
| Language | Gherkin DSL | Natural conversation |
| Learning curve | Learn syntax, structure, keywords | Just describe what should happen |
| Maintenance | Manual step definitions | Auto-generated |
| Verification | Run tests manually | Automatic on every behavior |
| Non-technical friendly | "Supposed to be" | Actually is |
| Output | Terminal text | Visual HTML diagrams |
Bottom line: BDD promised business-readable specs. But non-technical people never actually wrote Cucumber scenarios because you still need to understand Given/When/Then, step definitions, and test frameworks. We deliver on the promise using AI as the translator.
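To make that learning curve concrete, here is a minimal sketch of what one scenario and its step definitions involve. It is plain Python that imitates the step-matching idea behind Cucumber-style tools; it does not use the real Behave or Cucumber APIs, and the `step` decorator, the `run_scenario` helper, and the cart example are all hypothetical.

```python
import re

# Hypothetical registry of (compiled pattern, handler) pairs.
STEPS = []

def step(pattern):
    """Register a handler for scenario lines whose text matches `pattern`."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator

@step(r"an empty cart")
def empty_cart(ctx):
    ctx["cart"] = []

@step(r'the user adds "(.+)"')
def add_item(ctx, item):
    ctx["cart"].append(item)

@step(r"the cart contains (\d+) items?")
def check_count(ctx, count):
    assert len(ctx["cart"]) == int(count), ctx["cart"]

SCENARIO = """
Given an empty cart
When the user adds "apple"
And the user adds "pear"
Then the cart contains 2 items
"""

def run_scenario(text):
    """Run each line against the registry; raise if a step is undefined."""
    ctx = {}
    for line in text.strip().splitlines():
        # Drop the leading Gherkin keyword (Given/When/Then/And/But).
        _keyword, _, body = line.strip().partition(" ")
        for pattern, func in STEPS:
            match = pattern.fullmatch(body)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"No step definition for: {body!r}")
    return ctx

result = run_scenario(SCENARIO)
```

Even in this toy version, the scenario text only works because someone wrote a matching pattern and handler for every line. That glue code is the part non-technical authors rarely take on.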
| | Superpowers | goodboy |
|---|---|---|
| Target user | Developers | Anyone (especially non-technical) |
| Output format | Code, plans, specs | Behavioral maps, visual diagrams |
| Enforcement | Prompt-based (HARD-GATE) | Hook-based (mechanical blocking) |
| Artifacts | DESIGN.md, PLAN.md | .feature files (executable) |
| Skill focus | Code quality, TDD, architecture | Behavioral verification, stakeholder communication |
| Relationship | Inspiration | Standalone plugin |
Comparison: goodboy is a standalone Claude Code plugin. It adds a "stakeholder track" alongside the developer track in any project. Both tracks use the same codebase and test suite, just through different interfaces.
| | Multi-Agent Frameworks | goodboy |
|---|---|---|
| Focus | Agent orchestration | Human-agent communication |
| Abstraction | Agents with roles/tools | Behavioral verification layer |
| Output | Whatever agents produce | Strictly behavioral |
| Testing | Not built-in | Core feature |
| Non-technical use | Not addressed | Primary use case |
Potential integration: goodboy's enforcement patterns could be ported to these frameworks. The behavioral verification layer is framework-agnostic.
| | Visual Explainer | goodboy |
|---|---|---|
| Purpose | Make output prettier | Enforce behavioral thinking |
| Timing | After agent responds | Before agent responds |
| Verification | None | Core feature |
| Accumulation | One-shot HTML | Living .feature files |
| Relationship | We use its patterns | We extend the concept |
- We adopted: HTML generation, Mermaid diagrams, auto-trigger on complexity
- We added: Behavioral mapping, verification loop, persistent specs
| | Rivet | goodboy |
|---|---|---|
| Interface | Visual node editor | Natural conversation |
| Audience | Technical (understands flows) | Non-technical |
| Output | Agent workflows | Behavioral specs |
| Testing | Not the focus | Core feature |
Different tools for different jobs. Rivet lets you build agent flows. We let non-technical people describe system behaviors without seeing flows.
| | Project Management Tools | goodboy |
|---|---|---|
| Spec format | Free-form tickets | Structured behavioral maps |
| Verification | Manual testing | Automatic |
| Executable | No | Yes (.feature files) |
| Ambiguity | Common (different interpretations) | Reduced (agent asks clarifying questions) |
| Developer handoff | "Read this ticket" | "Here's a failing test" |
Common workflow today:
1. PM writes ticket in Jira
2. Developer reads ticket
3. Developer misunderstands
4. Developer builds wrong thing
5. PM says "that's not what I meant"
6. Loop back to step 1
With goodboy:
1. PM describes behavior conversationally
2. Agent maps it (visual)
3. PM confirms or corrects
4. Behavior becomes failing test
5. Developer fixes to make test pass
6. PM sees "behavior now passing ✓"
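The loop above can be sketched as a small state model. This is an illustration under stated assumptions, not goodboy's implementation: the `Behavior` class, the `app` dictionary, and the password-reset example are hypothetical names invented here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Behavior:
    """A described behavior moving through the loop (hypothetical model)."""
    description: str
    check: Optional[Callable[[], bool]] = None  # attached once the PM confirms

    def confirm(self, check: Callable[[], bool]) -> None:
        """The PM confirmed the map, so the behavior becomes an executable test."""
        self.check = check

    def status(self) -> str:
        if self.check is None:
            return "awaiting confirmation"
        return "passing ✓" if self.check() else "failing"

# Hypothetical application: the requested feature does not exist yet.
app = {}

def reset_email_sent() -> bool:
    """Check for the behavior; it fails until a handler is registered."""
    handler = app.get("request_password_reset")
    return handler is not None and handler("pm@example.com") == "email sent"

behavior = Behavior("Requesting a password reset sends an email")
history = [behavior.status()]        # the PM has only described it so far

behavior.confirm(reset_email_sent)
history.append(behavior.status())    # confirmed, but the code is missing

app["request_password_reset"] = lambda email: "email sent"  # developer's fix
history.append(behavior.status())

print(history)
```

The point of the model is the middle state: the developer's first contact with the requirement is a failing check rather than a ticket to interpret.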
| | Documentation Tools | goodboy |
|---|---|---|
| Format | Docs/wiki pages | Behavioral flows |
| Truth | Gets out of date | Always in sync (tests fail if not) |
| Executable | No | Yes |
| Discovery | Search/browse | Natural language queries |
The problem with docs: They rot. Code changes, docs don't.
Our solution: The .feature file IS the code contract. If they diverge, tests fail.
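One way to picture "if they diverge, tests fail" is a check that reads its inputs and expectations directly from the spec text, so there is no second copy to drift. The sketch below is an assumption-laden illustration: the `FEATURE` text, the `discount_percent` function, and the `check_against_spec` helper are hypothetical, and real tools parse Gherkin far more generally than two regular expressions.

```python
import re

# Hypothetical spec text, standing in for a .feature file on disk.
FEATURE = """
Feature: Loyalty discount
  Scenario: Returning customer
    Given a customer with 3 previous orders
    Then the discount is 10 percent
"""

def discount_percent(previous_orders):
    """Hypothetical implementation under test."""
    return 10 if previous_orders >= 3 else 0

def check_against_spec(feature_text, implementation):
    """Read the input and expectation from the spec, then compare with the code."""
    orders = int(re.search(r"customer with (\d+) previous orders", feature_text).group(1))
    expected = int(re.search(r"discount is (\d+) percent", feature_text).group(1))
    actual = implementation(orders)
    return actual == expected, expected, actual

in_sync = check_against_spec(FEATURE, discount_percent)

# Simulate the code changing without the spec being updated.
drifted = check_against_spec(FEATURE, lambda previous_orders: 15)
```

Because the expected value lives only in the spec, changing either side without the other turns the check red instead of leaving a stale document behind.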
| | AI Test Generators | goodboy |
|---|---|---|
| Who uses it | Developers | Anyone |
| Input | Code (generate tests from code) | Behavior (generate tests from description) |
| Output | Test code | Behavioral verification + test code |
| Direction | Code → Tests | Behavior → Tests → Code |
| Non-technical | No | Yes |
The direction of flow is different:
- Copilot: "Here's my code, write tests for it"
- goodboy: "Here's what should happen, verify it does"
| | Custom Instructions | goodboy |
|---|---|---|
| Enforcement | Prompt-based (can be ignored) | Hook-based (mechanically enforced) |
| Structure | Unstructured conversation | Behavioral mapping framework |
| Accumulation | Manual (copy/paste) | Automatic (.feature files) |
| Testing | Not integrated | Built-in |
| Portability | Platform-specific | Works across agents |
You could approximate this with clever prompts. But without hook enforcement, the agent will eventually slip into showing code. And without the accumulation layer, you're not building a persistent spec.
| | Fabric | goodboy |
|---|---|---|
| Focus | Personal workflow automation | System behavior specification |
| Patterns | Reusable prompt templates | Reusable behavioral verifications |
| Audience | Technical individuals | Teams (technical + non-technical) |
| Testing | Not the focus | Core feature |
Similar philosophy (structured patterns), different application:
- Fabric: "Here are patterns for common AI tasks"
- goodboy: "Here's a pattern for behavior-first development"
| | MCP | goodboy |
|---|---|---|
| Layer | Tool integration standard | Communication standard |
| Problem | How agents connect to tools | How agents talk to non-technical users |
| Relationship | Complementary | Could use MCP for tool integrations |
Not competing, different layers. MCP standardizes how agents connect to Slack, GitHub, databases. goodboy standardizes how agents communicate with stakeholders.
Most tools suggest or encourage behavioral thinking. We enforce it.
The agent literally cannot respond without completing a valid behavioral map.
The rule is not "please use behavioral language" (a prompt the agent can ignore) but "you cannot output code" (a hook blocks it at the system level).
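The difference between asking and blocking can be sketched as a gate that inspects a response before it is shown. This is not goodboy's actual hook: the `gate` function and its patterns are hypothetical heuristics, and the wiring that would make an agent runtime call such a check is not shown.

```python
import re

FENCE = "`" * 3  # Markdown code fence marker, built this way to keep the example readable

# Hypothetical heuristics for "this line looks like code".
CODE_PATTERNS = [
    re.compile(r"^\s*(def|class)\s+\w+\s*[\(:]", re.MULTILINE),
    re.compile(r"^\s*(const|let|var)\s+\w+\s*=", re.MULTILINE),
    re.compile(r"[{};]\s*$", re.MULTILINE),
]

def gate(response):
    """Return (allowed, reason); responses that contain code are blocked."""
    if FENCE in response:
        return False, "blocked: response contains a fenced code block"
    for pattern in CODE_PATTERNS:
        if pattern.search(response):
            return False, "blocked: response contains code-like lines"
    return True, "ok"

behavioral = gate("When the user submits the form, they see a confirmation.")
with_code = gate("def send_email(user):\n    return True")
with_fence = gate("Here it is:\n" + FENCE + "\nx = 1\n" + FENCE)
```

A prompt relies on the model choosing to comply. A gate like this runs outside the model, so a non-compliant response never reaches the user regardless of what the model chose.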
Every other framework assumes technical literacy.
We assume nothing. If you can describe what should happen, you can use this.
Not docs that get stale. Not tests divorced from requirements.
The behavioral spec IS the test suite. They can't diverge.
- Other tools: agent does work, shows you result
- goodboy: agent must prove its thinking behaviorally before responding
The behavioral map catches flawed reasoning, not just flawed output.
If there's no user-visible behavior (building a database migration, optimizing queries, refactoring internal APIs), this tool isn't the right fit. Use regular Claude Code skills.
Some developers think better reading code than behavioral descriptions. That's fine! This is an additional interface, not a replacement.
The behavioral verification adds overhead (10-30% slower responses). If you're in early exploration mode and just want fast answers, disable behavior-first mode until you're ready to solidify specs.
goodboy generates tests and runs them. If your project doesn't have a test setup, you'll need to bootstrap that first (though the agent can help with that too).
goodboy fits best when:
- PMs, designers, QA, and engineers all need to agree on behavior.
- User experience is critical and needs to be specified precisely.
- Behavioral specs need to be documented and verified.
- People need to understand what the system does without code knowledge (the .feature files explain it).
- You want to describe behaviors conversationally and have the agent generate specs + tests.
- You want to show behavioral maps, not code, so everyone understands.
                      Technical Literacy Required
                                  ↑
                                  |
          AutoGen, CrewAI ●       |       ● Superpowers
                LangChain ●       |       ● GitHub Copilot
                                  |
      ----------------------- Developer Line -----------------------
                                  |
                                  |       ● goodboy
                                  |         (You are here)
                                  |
             Jira, Notion ●       |
               Confluence ●       |
                                  ↓
                    No Technical Knowledge Needed
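Every other agent and coding tool on this map targets the top half, and the tools in the bottom half (Jira, Notion, Confluence) produce specs that aren't executable. We're the only tool designed for the bottom half while maintaining the same code quality and test coverage as developer tools.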
goodboy isn't replacing anything. It's the missing piece:
- Requirements tools (Jira) → goodboy → Development tools (Claude Code)
- Stakeholder language → goodboy → Code
- Behavioral specs → goodboy → Tests
It's the bridge between "what we want" and "what we built."
Choose the right tool for the job. For behavior specification with non-technical stakeholders, we think we're the right tool.