⚠️ OpenClaw required. This tool analyzes OpenClaw session logs specifically (~/.openclaw/agents/*/sessions/*.jsonl). Other platforms are not supported yet.
Your AI agent reviews its own conversation logs and proposes how to improve, every week, automatically.
Honest disclaimer: This is not AGI. It's a weekly log review with pattern matching. It finds things you'd find yourself, if you had time to read 500 conversation logs.
AI agents make the same mistakes repeatedly. Nobody has time to manually review thousands of conversation logs. The mistakes keep accumulating, silently.
Self-Evolving automates the review and brings you a short list of what to fix.
Session Logs (7 days)
  ↓ Analyzer (bash + Python, no API calls)
  ↓ Detected Patterns (JSON)
  ↓ Proposal Generator (template-based, 6 pattern types)
  ↓ Discord / Telegram Report
  ↓ You approve or reject (emoji reactions)
      ├─ Approved: auto-apply to AGENTS.md + git commit
      └─ Rejected: reason stored → fed into next week's analysis
No LLM calls during analysis. No API fees. Pure local log processing.
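To make "pure local log processing" concrete, here is an illustrative sketch, not the project's actual `analyze-behavior.sh`: counting how often the same error message recurs in a session JSONL file needs nothing beyond `grep`, `sort`, and `awk`. The `"error"` field name is an assumption for this example.

```shell
#!/usr/bin/env bash
# Illustrative sketch only -- not the project's analyze-behavior.sh.
# Tally repeated error messages in a session JSONL file using standard
# tools; no network or API calls are involved.
count_repeated_errors() {
  local logfile="$1" threshold="${2:-5}"
  # Extract the "error" field from each JSONL line, count duplicates,
  # and print only messages that recur at or above the threshold.
  grep -o '"error":"[^"]*"' "$logfile" \
    | sort | uniq -c \
    | awk -v t="$threshold" '$1 >= t { $1=""; sub(/^ /,""); print }'
}
```

Running it over a week of logs with the default threshold of 5 surfaces exactly the "repeating errors" pattern described below, at grep speed.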
# Install via ClawHub
clawhub install openclaw-self-evolving
# Run setup wizard (registers weekly cron)
bash scripts/setup-wizard.sh

Manual install:
git clone https://github.com/Ramsbaby/openclaw-self-evolving.git
cd openclaw-self-evolving
cp config.yaml.example config.yaml
# Edit config.yaml: set agents_dir, logs_dir, agents_md
bash scripts/setup-wizard.sh

1. Tool retry loops: the same tool called 5+ times consecutively. An agent-confusion signal.
2. Repeating errors: the same error 5+ times across sessions. An unfixed bug, not a fluke.
3. User frustration: keywords like "you said this already" and "why again" (plus their Korean equivalents), with context filtering to reduce false positives.
4. AGENTS.md violations: rules broken in actual exec tool calls (not conversation text), cross-referenced against your current AGENTS.md.
5. Heavy sessions: sessions hitting >85% of the context window. Tasks that should be sub-agents.
6. Unresolved learnings: high-priority items in .learnings/ not yet promoted to AGENTS.md.
Full details: docs/DETECTION-PATTERNS.md
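For a sense of how cheap these checks are, here is a hypothetical sketch of pattern 1 (tool retry loops); the real detector lives in `analyze-behavior.sh` and is documented in docs/DETECTION-PATTERNS.md. It assumes the tool names have already been extracted from the session JSONL, one per line.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the tool-retry-loop check, not the real detector.
# Reads one tool name per line on stdin and flags any run of >= threshold
# consecutive identical calls.
flag_retry_loops() {
  local threshold="${1:-5}"
  awk -v t="$threshold" '
    $0 == prev { run++ }                                          # same tool again
    $0 != prev { if (run >= t) print prev " x" run; prev = $0; run = 1 }
    END        { if (run >= t) print prev " x" run }              # flush last run
  '
}
```

For example, `printf 'exec\nexec\nexec\nexec\nexec\nread\n' | flag_retry_loops 5` flags `exec x5`, while alternating calls produce no output.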
Proposals are template-based, not LLM-generated. Each detected pattern maps to a structured template with:
- Evidence: exact log excerpts, occurrence counts, affected sessions
- Before: current state in AGENTS.md (or "no rule exists")
- After: a concrete diff of what to add or change
- Section: which AGENTS.md section to update
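Because proposals are templates rather than LLM output, generation reduces to string substitution. A hypothetical miniature follows (the real builder is `generate-proposal.sh`; the function name and parameters here are illustrative):

```shell
#!/usr/bin/env bash
# Illustrative miniature of template-based proposal rendering; the real
# builder is generate-proposal.sh. All names here are made up.
render_proposal() {
  local id="$1" severity="$2" title="$3" evidence="$4" before="$5" after="$6"
  # A heredoc acts as the structured template: every detected pattern
  # supplies the same fixed fields, so output is fully deterministic.
  cat <<EOF
[PROPOSAL #${id} | ${severity}] ${title}
Evidence:
${evidence}
Before:
${before}
After (diff):
${after}
EOF
}
```

The same evidence always produces the same proposal text, which is what makes the pipeline auditable and free of API calls.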
Example output for a detected violation:
[PROPOSAL #1 | HIGH] Direct git commands: 4 violations detected
Evidence:
- Session #325: exec "git commit -m 'fix'" → violates AGENTS.md rule
- Session #331: exec "git add -A && git commit"
- Total: 4 violations in 3 weeks
Before:
No direct git commands.
After (diff):
+ ⚠️ CRITICAL: NEVER run git directly. Violated 4× in 3 weeks.
No direct git commands (git add / git commit / git push all included).
Report to the owner on any conflict.
React ✅ to apply | ❌ to reject (add reason)
After 4 weeks running on a real OpenClaw setup:
- 85 frustration patterns detected across 30 sessions
- 4 proposals generated per week on average
- 13 AGENTS.md violations caught and corrected
- False positive rate: ~8% (v5.0, down from 15% in v4)
Your mileage will vary. These numbers are from one production instance.
Raw pattern found in logs:
[Session #312] User: "why are you calling git directly again?? I told you to use git-sync.sh"
[Session #318] User: "you did it again, direct git command"
[Session #325] exec: git commit -m "fix" → AGENTS.md violation flagged
[Session #331] User: "stop using git directly!!!"
After the proposal is approved:
## Git Sync
+ ⚠️ CRITICAL: NEVER run git directly. Violated 4× in 3 weeks.
When modifying files, always use: `bash ~/openclaw/scripts/git-sync.sh`
- No direct git commands.
+ No direct git commands (git add / git commit / git push all included).
+ Report to the owner on any conflict.

After analysis, a report is posted to your configured channel. React to approve or reject:
- ✅ Approve all → auto-apply to AGENTS.md + git commit
- 1️⃣–5️⃣ Approve only that numbered proposal
- ❌ Reject all (add a comment with the reason; it feeds back into the next analysis)
- 🔁 Request revision (describe what you want changed)
Rejected proposal IDs are stored in data/rejected-proposals.json and excluded from future analyses.
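Conceptually, the exclusion step is a set difference against the stored IDs. A hedged sketch, assuming (for illustration only; the real schema of data/rejected-proposals.json may differ) that the file stores each ID as `"id": "<value>"`:

```shell
#!/usr/bin/env bash
# Hedged sketch: drop candidate proposal IDs that appear in the rejection
# history. The "id" field layout is an assumption for this example.
filter_rejected() {
  local rejected_file="$1"; shift
  local id
  for id in "$@"; do
    # Keep the candidate only if its ID is absent from the history file.
    if ! grep -q "\"id\": \"${id}\"" "$rejected_file" 2>/dev/null; then
      printf '%s\n' "$id"
    fi
  done
}
```

If the history file is missing, nothing is filtered and every candidate passes through, which matches the "reset rejection history" troubleshooting step below.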
openclaw-self-healing: crash recovery + auto-repair.
Self-healing fires on crash. Self-evolving runs weekly to fix what causes the crashes, including promoting self-healing error patterns directly into AGENTS.md rules.
Integration: set SEA_LEARNINGS_PATHS to include your self-healing .learnings/ directory. Detected errors automatically surface as self-evolving proposals.
session-logger.sh is a companion script that standardizes session events into JSONL format, enabling precise analysis beyond raw log parsing.
Usage (source as library):
source scripts/session-logger.sh
log_session_start "$SESSION_ID" "$MODEL" "$TASK"
log_session_end "$SESSION_ID" "$EXIT_CODE" "$DURATION" "$TOKENS_IN" "$TOKENS_OUT"
log_error "$SESSION_ID" "TypeError" "Cannot read property" true
log_recovery "$SESSION_ID" "crash_loop" "tmux_ai" true

Usage (standalone CLI):
session-logger.sh log session_start '{"session_id":"abc","model":"claude-opus-4-5"}'

Each line written to ~/.openclaw/logs/sessions.jsonl:
{"ts":"2026-03-11T08:00:00Z","event":"session_start","data":{"session_id":"abc","model":"claude-opus-4-5","task":"standup"}}

analyze-behavior.sh v3.1 automatically reads sessions.jsonl if present and adds structured metrics (jsonl_summary) to its JSON output: top tools by call volume, recent errors with full metadata.
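The JSONL contract above is easy to reproduce. A minimal stand-in for illustration, not the real session-logger.sh (which has more fields and validation); `SESSION_LOG_FILE` is a hypothetical override introduced here for testability:

```shell
#!/usr/bin/env bash
# Minimal illustrative JSONL event writer -- not the real session-logger.sh.
# SESSION_LOG_FILE is a hypothetical override of the default log path.
log_event() {
  local event="$1" data="$2"
  local logfile="${SESSION_LOG_FILE:-$HOME/.openclaw/logs/sessions.jsonl}"
  # One self-contained JSON object per line: UTC timestamp, event name,
  # and the caller-supplied data payload (assumed to be valid JSON).
  printf '{"ts":"%s","event":"%s","data":%s}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$event" "$data" >> "$logfile"
}
```

Appending one object per line is what keeps downstream analysis a matter of line-oriented tools rather than a JSON parser.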
Capability Evolver was recently suspended from ClawHub. If you're looking for an alternative:
| Feature | Capability Evolver | Self-Evolving |
|---|---|---|
| Silent modification | Yes | ❌ Never |
| Human approval | Optional (off by default) | Required. Always. |
| API calls per run | Multiple LLM calls | Zero |
| Transparency | Closed analysis | Full audit log |
| Rejection memory | None | Stored + fed back |
| False positive rate | ~22% (self-reported) | ~8% (v5, measured) |
# config.yaml
analysis_days: 7 # Days of logs to scan
max_sessions: 50 # Max session files to analyze
verbose: true
# Paths (auto-detected for standard OpenClaw layout)
agents_dir: ~/.openclaw/agents
logs_dir: ~/.openclaw/logs
agents_md: ~/openclaw/AGENTS.md
# Notifications
notify:
  discord_channel: ""    # Discord channel ID
  telegram_chat_id: ""   # Optional

# Detection thresholds
thresholds:
  tool_retry: 5          # Consecutive calls to flag
  error_repeat: 5        # Error occurrences to flag
  heavy_session: 85      # Context % threshold

Weekly cron (Sunday 22:00): bash scripts/setup-wizard.sh sets this up automatically.
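For reference, the wizard's weekly schedule corresponds to a crontab entry along these lines (the exact path is an assumption; it depends on where you cloned the repo):

```
# min hour dom mon dow  command
0 22 * * 0 bash $HOME/openclaw-self-evolving/scripts/generate-proposal.sh
```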
# Run analysis without modifying anything
bash scripts/generate-proposal.sh --dry-run
# Scan more history
ANALYSIS_DAYS=14 bash scripts/generate-proposal.sh
# Reset rejection history
rm data/rejected-proposals.json

openclaw-self-evolving/
├── scripts/
│   ├── analyze-behavior.sh       # Log analysis engine (v3.1), JSONL-aware
│   ├── session-logger.sh         # Structured JSONL event logger (dual-mode: library + CLI)
│   ├── generate-proposal.sh      # Pipeline orchestrator + proposal builder (705 lines)
│   ├── setup-wizard.sh           # Interactive setup + cron registration
│   └── lib/config-loader.sh      # Config loader (sourced by scripts)
├── docs/
│   ├── ARCHITECTURE.md
│   ├── DETECTION-PATTERNS.md
│   └── QUICKSTART.md
├── test/
│   └── fixtures/                 # Sample session JSONL for testing / contributing
├── data/
│   ├── proposals/                # Saved proposal JSON files
│   └── rejected-proposals.json   # Rejection history
└── config.yaml.example
See CONTRIBUTING.md. PRs welcome, especially:
- New detection patterns for analyze-behavior.sh
- Better false-positive filtering
- Support for other platforms (currently OpenClaw-specific; a log-format abstraction layer is planned)
- Test fixtures in test/fixtures/ (sample .jsonl files to enable contributor testing without real logs)
MIT: do whatever you want, just don't remove the "human approval required" part. That part matters.