OpenCodeIntel/saar


getsaar.com  ·  Docs  ·  PyPI  ·  Issues  ·  OCI



What is saar?

saar is a CLI that analyzes your codebase and writes an AGENTS.md: a precise context file that every AI coding tool reads automatically.

One command. Claude Code, Cursor, claude.ai, Copilot, Gemini CLI. They all stop guessing and start knowing.


The problem

I asked Claude to install a package. It said npm install. My project uses bun. The build broke. I spent 20 minutes confused.

This happens to every developer using AI tools. Every week.

  • AI writes npm install in your bun repo
  • AI invents a new exception class when you already have 218
  • AI uses the wrong auth decorator from 10 available options
  • AI uses import logging when your team standardized on structlog
  • Every session starts from zero. No memory of how your project actually works.

The fix exists: a context file that tells the AI exactly how your codebase works. But writing one well is hard, these files go stale fast, and nobody maintains them.

saar automates the hard part.


Quick start

pip install saar
cd your-project
saar extract .

Done. AGENTS.md is in your project root. Every AI tool picks it up automatically.

What you see when it runs:

saar analyzing your-project...

  Backend     FastAPI  Python (47 files)
  Frontend    React  TypeScript  Vite  bun
  Auth        get_current_active_superuser  (from app.api.deps)
  Logging     structlog
  Exceptions  APIError, AuthenticationError, LimitCheckError (+6 more)
  Scale       694 functions  276 files  96% typed

  wrote AGENTS.md  (72 lines)
  Claude knows your project.

saar found your auth pattern, your logging library, your exception classes, and your package manager. You didn't tell it any of that.
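How can a tool infer the package manager without being told? One common approach is to look for lockfiles. The sketch below illustrates that idea only; it is not saar's actual implementation. The function names `detect_package_manager` and `detect_in_repo`, and the priority order of the lockfiles, are assumptions made for this example.

```python
from pathlib import Path
from typing import Iterable, Optional

# Lockfile name -> package manager. The order is an assumption: first match wins.
LOCKFILES = [
    ("bun.lockb", "bun"),
    ("bun.lock", "bun"),
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("package-lock.json", "npm"),
]


def detect_package_manager(filenames: Iterable[str]) -> Optional[str]:
    """Return the package manager implied by the given file names, if any."""
    present = set(filenames)
    for lockfile, manager in LOCKFILES:
        if lockfile in present:
            return manager
    return None


def detect_in_repo(repo_root: str) -> Optional[str]:
    """Apply the same check to the top-level entries of a directory."""
    return detect_package_manager(p.name for p in Path(repo_root).iterdir())
```

Calling `detect_in_repo(".")` in a repo that contains a `bun.lockb` would return `"bun"`, which is the kind of fact that ends up in the context file.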


How it works

your repo
    |
    v
saar extract .
    |
    +-- static analysis ------- detects stack, auth, logging, naming, exceptions
    |
    +-- guided interview ------- 5 questions for tribal knowledge:
    |                             off-limits files, domain terms, team gotchas
    |
    +-- AGENTS.md -------------- ~100 lines, picked up automatically by:
                                  Claude Code, Cursor, claude.ai, Copilot, Gemini CLI

saar generates short, precise files. Not 300-line dumps. ETH Zurich (Feb 2026, arxiv:2602.11988) showed that long LLM-generated context files reduce task success and increase costs by 20% or more. saar's default is 100 lines. Focused. Nothing wasted.
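To make the static analysis step more concrete, here is a minimal sketch of one possible signal: counting how many files import each logging library, using Python's standard `ast` module. This is an illustration under assumptions, not saar's own code. The names `LOGGING_LIBS`, `logging_imports`, and `count_logging_imports` are hypothetical, and the list of candidate libraries is chosen for the example.

```python
import ast
from collections import Counter

# Candidate logging libraries to look for (an assumed, illustrative list).
LOGGING_LIBS = {"logging", "structlog", "loguru"}


def logging_imports(source: str) -> set:
    """Return the logging libraries imported by one Python source string."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]  # "structlog.stdlib" -> "structlog"
            if top_level in LOGGING_LIBS:
                found.add(top_level)
    return found


def count_logging_imports(sources: list) -> Counter:
    """Count, per library, how many source files import it."""
    counts = Counter()
    for source in sources:
        counts.update(logging_imports(source))
    return counts
```

A raw count is only one signal. The most imported library is not always the one the team has standardized on, so a real tool would likely combine counts with other evidence such as existing team rules.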


Before / After

Without saar (claude.ai, no context):

Q: Add debug logging to the Python endpoint.

import logging
logger = logging.getLogger(__name__)

Wrong. This codebase uses structlog.

With saar (same question, AGENTS.md loaded):

Q: Add debug logging to the Python endpoint.

import structlog
logger = structlog.get_logger(__name__)
# structlog: structured JSON output, standard for this project

Right. First try. No back-and-forth.

This is a real test result from a controlled eval on the PostHog codebase. 174 Python files use import logging. Claude follows the majority without context. AGENTS.md overrides it.


When your codebase changes, saar tells you

saar diff .
saar checking your-project for changes...

  AGENTS.md last generated: 14 days ago

  Changed since last extract:
  ~ Package manager changed: npm -> bun
  + New exception class: RateLimitError
  + New auth pattern detected

  Run saar extract . to update.

Your AGENTS.md was telling Claude to use npm. saar caught it before you committed broken code.
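Staleness detection of this kind can be thought of as comparing two snapshots of detected facts. The sketch below assumes a snapshot is a plain dict of scalars and lists; that layout and the function name `diff_snapshots` are hypothetical and do not describe saar's internal format.

```python
def diff_snapshots(old: dict, new: dict) -> list:
    """Describe what changed between two snapshots of detected facts.

    Scalar values are reported as '~ key: before -> after'.
    List values are compared as sets and reported as '+' or '-' items.
    """
    lines = []
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if isinstance(before, list) or isinstance(after, list):
            before_set, after_set = set(before or []), set(after or [])
            lines += [f"+ {key}: {item}" for item in sorted(after_set - before_set)]
            lines += [f"- {key}: {item}" for item in sorted(before_set - after_set)]
        elif before != after:
            lines.append(f"~ {key}: {before} -> {after}")
    return lines


old = {"package_manager": "npm", "exceptions": ["APIError"]}
new = {"package_manager": "bun", "exceptions": ["APIError", "RateLimitError"]}
for line in diff_snapshots(old, new):
    print(line)
```

An empty result means nothing changed, so the context file can be treated as fresh.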


Keep corrections over time

AI gets something wrong? Add it once. Never see that mistake again.

saar add "Never use npm, this project uses bun"
saar add --off-limits "billing/ -- legacy Stripe integration, frozen until Q3"
saar add --domain "Workspace = tenant, not a directory"
saar add --verify "source venv/bin/activate && pytest tests/ -v"

No re-analysis. Each correction is appended to .saar/config.json and is included the next time you run saar extract.
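As a rough picture of what appending a correction to a JSON config could look like, here is a small sketch. The schema (a dict mapping a kind such as "rules" to a list of strings), the duplicate check, and the function names are all assumptions for illustration; the real layout of .saar/config.json is not documented here.

```python
import json
from pathlib import Path


def add_correction(config: dict, kind: str, text: str) -> dict:
    """Return a copy of config with text appended under kind, skipping duplicates."""
    updated = {key: list(values) for key, values in config.items()}
    entries = updated.setdefault(kind, [])
    if text not in entries:
        entries.append(text)
    return updated


def save_correction(path: Path, kind: str, text: str) -> None:
    """Load the JSON file at path (if present), append the correction, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_correction(config, kind, text), indent=2))
```

Keeping the update as a pure function makes it easy to test without touching the filesystem; only `save_correction` reads and writes the file.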


saar vs everything else

Feature                           saar          /init (Claude Code)   manual
Detects package manager           yes           basic                 you write it
Detects logging library           yes                                 you write it
Detects auth patterns             yes           basic                 you write it
Detects exception classes         yes                                 you write it
Tribal knowledge interview        yes                                 you know it
Output size                       ~100 lines    300+ lines            up to you
Staleness detection (saar diff)   yes
Quality linting (saar lint)       yes
Works with all AI tools           yes           Claude only
Free + fully local                yes

Claude Code's /init is useful. But it generates bloated files that ETH Zurich showed hurt performance. saar generates focused files and keeps them honest over time.


All commands

# Generate
saar extract .                          # AGENTS.md (default, ~100 lines)
saar extract . --format claude          # CLAUDE.md
saar extract . --format cursorrules     # .cursorrules
saar extract . --format all             # all formats at once
saar extract . --no-interview           # skip questions, use cached answers
saar extract . --verbose                # remove 100-line cap, full output
saar extract . --include packages/api   # monorepo subset

# Maintain
saar diff .                             # detect what changed since last extract
saar add "rule"                         # add correction without re-running
saar add --off-limits "path/"           # mark file/dir as off-limits for AI
saar add --domain "term = definition"   # add domain vocabulary
saar add --verify "command"             # set the verification workflow

# Quality
saar lint .                             # check AGENTS.md for SA001-SA005 violations
saar stats .                            # score your AGENTS.md (0-100)
saar check .                            # CI primitive: exits 1 if stale or incomplete

# AI enrichment (requires ANTHROPIC_API_KEY)
saar enrich                             # use Claude to sharpen raw interview answers

# OCI integration
saar extract . --index                  # generate AGENTS.md + index into OCI

saar lint

saar lint .

  AGENTS.md:5:1:  SA004  Generic filler: 'Write clean code' -- AI already knows this
  AGENTS.md:12:1: SA001  Duplicate rule: already appears on line 3

  Found 2 violations.  Run saar stats . for a full quality score.

Like ruff, but for your context file. Catches:

  • SA001 duplicate rules
  • SA002 orphaned section headers
  • SA003 vague rules under 6 words
  • SA004 generic filler (write clean code, follow best practices)
  • SA005 emojis that waste instruction budget
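To show how checks like these can work, here is a minimal sketch covering three of the rules (SA001, SA003, SA004). It is an illustration, not saar's linter: it assumes a rule is any line starting with "- ", treats duplicates case-insensitively, and uses a two-phrase filler list. The name `lint_rules` is hypothetical.

```python
GENERIC_FILLER = ("write clean code", "follow best practices")


def lint_rules(text: str) -> list:
    """Return (line_number, code, message) for each violation found in bullet rules."""
    violations = []
    first_seen = {}  # normalized rule text -> line number of first occurrence
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("- "):
            continue  # only bullet lines count as rules in this sketch
        rule = stripped[2:].strip()
        key = rule.lower()
        if key in first_seen:
            message = f"Duplicate rule: already appears on line {first_seen[key]}"
            violations.append((lineno, "SA001", message))
            continue
        first_seen[key] = lineno
        word_count = len(rule.split())
        if any(phrase in key for phrase in GENERIC_FILLER):
            violations.append((lineno, "SA004", f"Generic filler: '{rule}'"))
        elif word_count < 6:
            violations.append((lineno, "SA003", f"Vague rule: only {word_count} words"))
    return violations
```

Each tuple carries enough information to print a ruff-style line such as `AGENTS.md:<line>:1: <code> <message>`.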

saar check (CI)

# .github/workflows/ci.yml
- run: saar check .

Exits 0 if AGENTS.md is fresh and complete. Exits 1 with a specific message if not. Never let a stale context file slip into production.


OCI — semantic search via MCP

saar generates your AGENTS.md. OpenCodeIntel (OCI) indexes your codebase for per-task context via MCP.

saar extract . --index

Once indexed, Claude Desktop and Claude Code get a new tool:

codeintel:get_context_for_task("add rate limiting to the settings endpoints")

Returns:
  - backend/routes/settings.py (94% relevance)
  - backend/middleware/auth.py (81% relevance)
  - Rule: use LimitCheckError, not a new exception
  - Rule: require_auth on all user endpoints

Instead of exploring 30k tokens of files, Claude gets just the files and rules that matter for the task (here, 2 files and 2 rules).

opencodeintel.com · MCP setup


What saar detects

Python — FastAPI / Flask / Django, auth middleware and decorators, logging library, exception class hierarchy, ORM patterns, naming conventions

TypeScript/JS — React / Next.js / Express, package manager (bun / pnpm / npm / yarn), TanStack Query / SWR patterns, component library, custom hooks, common imports

Both — critical files (most depended-on), circular dependencies, canonical examples per category, existing team rules (reads CLAUDE.md, .cursorrules, CONVENTIONS.md)


Installation

# Recommended
pipx install saar

# Standard
pip install saar

# With AI enrichment
pip install "saar[enrich]"
export ANTHROPIC_API_KEY=sk-ant-...

Requires Python 3.10+. No account. No API key for core features. Runs entirely on your machine.


Contributing

saar is MIT licensed. Everything is public: commits, decisions, benchmarks.

git clone https://github.com/OpenCodeIntel/saar.git
cd saar
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"

pytest tests/ -v        # 548 tests
ruff check saar/ tests/ # lint

# verify saar on itself
saar extract . --no-interview
saar lint .
saar stats .

Good first issues: look for the good first issue label in Issues.

If you're building a feature, open an issue first. Saves everyone time.


RL Module — Adaptive Profile Learning

saar includes a self-contained reinforcement learning layer that learns which extraction profile best fits each codebase type — entirely offline, no external dependencies beyond numpy.

Install

pip install "saar[rl]"   # adds numpy>=1.24.0

Quick start

# 1. Train both agents offline (500 synthetic episodes each, ~0.2s)
saar rl train --agent both

# 2. Check training results
saar rl status

# 3. Run extraction with RL profile selection + online update
saar extract . --rl

# 4. Give explicit feedback to improve the policy
saar rate good   # or: saar rate bad

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     saar RL Layer                           │
│                                                             │
│  CodebaseDNA ──► StateEncoder (20-D) ──► EnsembleAgent     │
│                                               │             │
│                              ┌────────────────┴──────────┐  │
│                              │  Thompson Sampling Meta    │  │
│                              │  Beta(α,β) per sub-agent   │  │
│                              └──────┬──────────┬──────────┘  │
│                                     │          │             │
│                            UCBBandit│    REINFORCE│          │
│                            6-context│    20→32→8 │          │
│                            UCB1     │    MLP+ReLU│          │
│                                     │          │             │
│                              ◄──────┴──────────┘            │
│                           action (profile 0–7)              │
│                                     │                       │
│            PROFILES[action] ──► RewardEngine                │
│            (depth multipliers)   (section coverage ×        │
│                                   multipliers → reward)     │
└─────────────────────────────────────────────────────────────┘
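The meta-agent at the top of the diagram routes between the two sub-agents with Thompson Sampling over Beta(α,β) posteriors. The sketch below shows that general technique in isolation, using only the standard library. The class name `ThompsonRouter`, the uniform Beta(1, 1) prior, and the binary success signal are assumptions for the example, not details taken from saar's code.

```python
import random


class ThompsonRouter:
    """Choose between sub-agents by sampling from a Beta(alpha, beta) per agent."""

    def __init__(self, agents, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) is a uniform prior: no preference before any feedback.
        self.params = {name: [1.0, 1.0] for name in agents}

    def select(self):
        """Draw one sample per agent and return the agent with the largest draw."""
        samples = {
            name: self.rng.betavariate(alpha, beta)
            for name, (alpha, beta) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, name, success):
        """A success increments alpha; a failure increments beta."""
        self.params[name][0 if success else 1] += 1.0
```

Because each choice is a random draw from the posterior, an agent with little evidence still gets picked occasionally, while an agent that keeps succeeding is picked more and more often.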

The 8 profiles

#   Name                  Prioritises
0   Python backend        auth, database, services, middleware
1   TypeScript / React    frontend, naming, imports
2   Full-stack            balanced api, frontend
3   Small script          naming, imports
4   Monorepo              services, tests, config
5   API microservice      api, auth, middleware, errors
6   Data / ML             imports, naming, config, logging
7   Legacy / mixed        errors, logging, database

How the RL loop closes

  1. StateEncoder maps CodebaseDNA → 20-D feature vector (language mix, framework flags, scale, tribal richness)
  2. EnsembleAgent selects a profile via Thompson Sampling
  3. RewardEngine scores the DNA weighted by that profile's depth multipliers — so a Data/ML profile scores higher on import-rich codebases than on auth-heavy ones
  4. The selected sub-agent and the meta-agent update online
  5. Policy persists to ~/.saar/rl/ for the next run
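The select-then-update loop in the steps above can be sketched with a plain UCB1 bandit over the 8 profiles. This is a simplified, non-contextual illustration: the README describes a 6-context UCB bandit, and the class name `UCB1`, the reward range of 0 to 1, and the exploration constant are assumptions here rather than saar's actual settings.

```python
import math


class UCB1:
    """UCB1 bandit over a fixed number of arms (here, extraction profiles)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        # Play every arm once before applying the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # Incremental running mean of the rewards observed for this arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


bandit = UCB1(8)
order = []
for _ in range(500):
    arm = bandit.select()
    order.append(arm)
    # Toy reward: pretend profile 5 fits this codebase best.
    bandit.update(arm, 1.0 if arm == 5 else 0.2)
```

In this toy run the bandit tries each profile once, then concentrates most of its pulls on the profile with the highest observed reward while still revisiting the others from time to time.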

Offline evaluation

python experiments/train_ucb.py        # 500 episodes, saves learning curve
python experiments/train_reinforce.py  # 500 episodes, saves baseline curve
python experiments/eval_comparison.py  # 95% bootstrap CI + Welch t-test

Results: UCB and REINFORCE each select the oracle-optimal profile in at least 50% of episodes, versus 10% for random selection (p < 0.05, Welch t-test). The Ensemble reaches the highest mean reward by dynamically routing between them.
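For readers unfamiliar with the two statistics named above, here is a minimal standard-library sketch of a percentile bootstrap confidence interval and Welch's t statistic. It is not the code in experiments/eval_comparison.py; the function names, the number of resamples, and the fixed seed are assumptions for the example, and turning the t statistic into a p-value would additionally need the Welch-Satterthwaite degrees of freedom.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lower = means[int(tail * n_resamples)]
    upper = means[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / standard_error
```

The interval describes uncertainty in one agent's mean reward, while the t statistic compares two agents, for example a trained policy against random selection.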


Why I built this

I'm Devanshu, MS Software Engineering at Northeastern, solo founder building this in the open.

I got tired of AI tools that sounded smart but didn't know my project. Every session: wrong package manager, wrong exception class, wrong import. The fix was obvious: give the AI a context file. The problem was nobody maintained those files, they went stale, and most were full of generic filler the AI already knew.

So I built saar. It generates the file, keeps it short, tells you when it's stale, and lints it for quality. Runs locally. Costs nothing. Works with every AI tool you already use.

The code is all here. The benchmarks are all here. Nothing hidden.


License

MIT. Free forever. Do whatever you want with it.


getsaar.com  ·  PyPI  ·  MIT License

If saar saved you time, a star on GitHub helps others find it.
