
claude-knowledge-graph

Long-term memory for Claude Code. A plugin that gives Claude persistent knowledge across sessions — entities, relationships, semantic search, keyword search, and automatic knowledge extraction.

Rolling context = short-term memory (compression). This = long-term memory (knowledge graph).

Install

Option 1: Claude Code Plugin (recommended)

```bash
/plugin marketplace add https://github.com/NodeNestor/nestor-plugins
/plugin install claude-knowledge-graph
```

Option 2: Manual install

```bash
git clone https://github.com/NodeNestor/claude-knowledge-graph.git
cd claude-knowledge-graph

# macOS / Linux
chmod +x install.sh && ./install.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File install.ps1
```

Requires Python 3.10+. Installs 2 pip packages (fastembed, sqlite-vec) into an isolated venv (~253MB).

What it does

Once installed, the plugin runs automatically via 7 hook events:

| Hook | Event | What happens |
|------|-------|--------------|
| SessionStart | New session opens | Injects project entities, recent memories, and relationship context |
| UserPromptSubmit | Every prompt you type | Hybrid search (semantic + keyword) finds relevant memories and injects them; buffers the prompt; triggers periodic background extraction every ~20 prompts |
| PreToolUse | Claude reads/searches files (Read, Glob, Grep) | Hybrid search on the search query injects relevant memories as additional context |
| PostToolUse | Claude edits files (Edit, Write) | Buffers the change (file path, old/new content) for later extraction |
| PostToolUse | Claude runs commands (Bash) | Buffers the command and output for later extraction |
| Stop | Claude finishes responding | LLM extraction: entities, relationships, and memories extracted from the session buffer |
| SubagentStart | Subagent spawned | Injects knowledge graph context (entities, memories) into the subagent |
| SessionEnd | Session closes | Cleanup: flushes remaining buffers |
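Each hook is just a small script: Claude Code passes the event as JSON on stdin and reads JSON back. A minimal sketch of a UserPromptSubmit handler, where the `hookSpecificOutput`/`additionalContext` fields follow Claude Code's hook output format and `search_memories` is a hypothetical stand-in for the plugin's hybrid search:

```python
import json


def handle_user_prompt(event: dict, search_memories) -> dict:
    """Build the hook's JSON response, injecting matching memories.

    `search_memories` is a hypothetical callable mapping the prompt text
    to a list of memory strings; the real plugin runs hybrid search here.
    """
    memories = search_memories(event.get("prompt", ""))
    context = "\n".join(f"- {m}" for m in memories[:5])  # INJECT_MAX default
    return {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": f"Relevant memories:\n{context}",
        }
    }

# In the actual hook script, the event would arrive on stdin, roughly:
#   event = json.load(sys.stdin)
#   print(json.dumps(handle_user_prompt(event, hybrid_search)))
```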

Under the hood

  • Hybrid search — every injection combines semantic similarity (fastembed cosine) with keyword matching (FTS5 BM25), merges and deduplicates results
  • 3-tier project awareness — exact project matches rank highest, then global memories, then cross-project. Composite scoring: similarity * 0.7 + importance * 0.2 + recency * 0.1
  • Event buffering — prompts, file reads, edits, and commands are logged to per-session buffer files. Extraction runs periodically (every ~20 prompts) in the background without blocking
  • Deduplication — before storing a new memory, checks cosine similarity against existing memories (threshold 0.92) to prevent duplicates
  • Async extraction — LLM extraction spawns as a background process so it never blocks your workflow
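The composite score and the dedup check above can be sketched in a few lines. The weights (0.7 / 0.2 / 0.1) and the 0.92 threshold come from the list above; the recency decay curve is our assumption, not necessarily the plugin's:

```python
def composite_score(similarity: float, importance: int, age_days: float) -> float:
    """Rank a memory: similarity dominates, importance and recency break ties."""
    recency = 1.0 / (1.0 + age_days / 30.0)  # assumed decay: ~half weight after a month
    return similarity * 0.7 + (importance / 10.0) * 0.2 + recency * 0.1


def is_duplicate(new_vec, existing_vecs, threshold: float = 0.92) -> bool:
    """Cosine-similarity dedup check against already-stored embeddings."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    return any(cosine(new_vec, v) >= threshold for v in existing_vecs)
```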

13 MCP Tools

Core Memory

| Tool | Description |
|------|-------------|
| remember | Store a memory with optional entity links and importance (1-10) |
| recall | Semantic search with 3-tier project awareness |
| keyword_search | Full-text keyword search (BM25) with AND/OR/NOT support |
| forget | Delete a memory by ID |

Knowledge Graph

| Tool | Description |
|------|-------------|
| add_entity | Create an entity (person, project, technology, concept, file, org, hardware) |
| add_relationship | Link entities (uses, maintains, depends_on, works_with, contains, created, part_of) |
| get_entity | Full entity detail with linked memories and relationships |
| search_entities | Semantic entity search |
| graph_traverse | Walk the graph from an entity with configurable depth |
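graph_traverse can be pictured as a bounded breadth-first walk over the relationship edges. A sketch with plain dicts standing in for the plugin's SQLite relationship table (the storage layout here is illustrative):

```python
from collections import deque


def traverse(start: str, edges: dict, max_depth: int = 2) -> list:
    """Walk outward from `start`, returning (entity, relation, depth) tuples.

    `edges` maps an entity name to a list of (relation, target) pairs.
    """
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth bound reached; don't expand further
        for relation, target in edges.get(node, []):
            if target not in seen:
                seen.add(target)
                results.append((target, relation, depth + 1))
                queue.append((target, depth + 1))
    return results
```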

Extraction & System

| Tool | Description |
|------|-------------|
| extract | LLM-powered extraction from text into entities, relationships, and memories |
| stats | Memory/entity/relationship counts and DB size |
| export | Full graph as JSON |
| projects | List all tracked projects with counts |

How search works

Three search modes that work together:

  1. Semantic search — fastembed (all-MiniLM-L6-v2, 384-dim) + sqlite-vec KNN. Composite scoring: similarity * 0.7 + importance * 0.2 + recency * 0.1
  2. Keyword search — SQLite FTS5 with BM25 ranking. Supports AND, OR, NOT, and phrase queries
  3. Graph traversal — walk entity relationships to find connected knowledge

Hooks use hybrid search (semantic + keyword combined) to maximize recall.
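The keyword side can be reproduced with nothing but the standard library, assuming your Python's SQLite was compiled with FTS5 (most builds are); the table and column names here are illustrative, not the plugin's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO memories_fts (content) VALUES (?)",
    [("project uses fastembed for embeddings",),
     ("deployment runs on sqlite-vec",),
     ("user prefers dark mode",)],
)

# FTS5 MATCH supports AND / OR / NOT and "phrase queries"; bm25() scores
# matches (lower is better, so we ORDER BY it ascending).
rows = conn.execute(
    "SELECT content, bm25(memories_fts) FROM memories_fts "
    "WHERE memories_fts MATCH ? ORDER BY bm25(memories_fts)",
    ("fastembed OR sqlite",),
).fetchall()
```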

LLM extraction

Knowledge extraction uses a claude -p subprocess by default, which inherits your existing Claude Code login. No API key needed.

Fallback chain:

  1. claude -p subprocess (default, zero config)
  2. ANTHROPIC_API_KEY or KNOWLEDGE_GRAPH_LLM_API_KEY env var
  3. OpenAI-compatible local model (Ollama, LM Studio, vLLM)
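The selection logic amounts to checking each option in order. A sketch (the function and its return labels are ours; `has_claude_cli` stands in for a `shutil.which("claude")` check):

```python
def pick_provider(env: dict, has_claude_cli: bool) -> str:
    """Choose the extraction backend in the documented fallback order."""
    if has_claude_cli:
        return "claude-cli"          # 1. claude -p subprocess, zero config
    if env.get("ANTHROPIC_API_KEY") or env.get("KNOWLEDGE_GRAPH_LLM_API_KEY"):
        return "anthropic-api"       # 2. direct API with a key
    if env.get("KNOWLEDGE_GRAPH_LLM_BASE_URL"):
        return "openai-compatible"   # 3. Ollama / LM Studio / vLLM endpoint
    return "disabled"
```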

Configuration

All optional environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| KNOWLEDGE_GRAPH_DB | ~/.claude/knowledge-graph/kg.db | Database path |
| KNOWLEDGE_GRAPH_EXTRACT_INTERVAL | 20 | Prompts between extractions |
| KNOWLEDGE_GRAPH_LLM_PROVIDER | auto | claude, anthropic, or openai-compatible |
| KNOWLEDGE_GRAPH_LLM_BASE_URL | (unset) | Base URL for local models |
| KNOWLEDGE_GRAPH_LLM_MODEL | claude-haiku-4-5-20251001 | Model for extraction |
| KNOWLEDGE_GRAPH_INJECT_MAX | 5 | Max memories injected per prompt |
| KNOWLEDGE_GRAPH_INJECT_MIN_SIMILARITY | 0.3 | Min cosine similarity threshold |
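Resolving these with their defaults is straightforward; a sketch (the defaults mirror the table above, the helper name is ours):

```python
import os


def load_config(env: dict) -> dict:
    """Resolve the plugin's settings, falling back to the documented defaults."""
    home = os.path.expanduser("~")
    return {
        "db_path": env.get(
            "KNOWLEDGE_GRAPH_DB",
            os.path.join(home, ".claude", "knowledge-graph", "kg.db"),
        ),
        "extract_interval": int(env.get("KNOWLEDGE_GRAPH_EXTRACT_INTERVAL", "20")),
        "inject_max": int(env.get("KNOWLEDGE_GRAPH_INJECT_MAX", "5")),
        "min_similarity": float(env.get("KNOWLEDGE_GRAPH_INJECT_MIN_SIMILARITY", "0.3")),
    }
```

Pass `dict(os.environ)` in the real hook process; passing an explicit dict keeps the helper easy to test.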

Data

All data stored in ~/.claude/knowledge-graph/:

  • kg.db — SQLite database (memories, entities, relationships, embeddings)
  • buffers/ — session event buffers (temporary, auto-cleaned)

Uninstall

```bash
# macOS / Linux
./uninstall.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File uninstall.ps1
```

Tech stack

  • 2 pip deps: fastembed (ONNX embeddings), sqlite-vec (vector search)
  • Hand-rolled MCP server (~80 lines, no framework)
  • SQLite for everything: vectors, FTS5, graph, metadata
  • claude -p for LLM calls (zero config auth)
