Long-term memory for Claude Code. A plugin that gives Claude persistent knowledge across sessions — entities, relationships, semantic search, keyword search, and automatic knowledge extraction.
Rolling context = short-term memory (compression). This = long-term memory (knowledge graph).
```
/plugin marketplace add https://github.com/NodeNestor/nestor-plugins
/plugin install claude-knowledge-graph
```
```shell
git clone https://github.com/NodeNestor/claude-knowledge-graph.git
cd claude-knowledge-graph

# macOS / Linux
chmod +x install.sh && ./install.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File install.ps1
```

Requires Python 3.10+. Installs 2 pip packages (fastembed, sqlite-vec) into an isolated venv (~253MB).
Once installed, the plugin runs automatically via 7 hook events:
| Hook | Event | What happens |
|---|---|---|
| SessionStart | New session opens | Injects project entities, recent memories, and relationship context |
| UserPromptSubmit | Every prompt you type | Hybrid search (semantic + keyword) finds relevant memories and injects them. Buffers the prompt. Triggers periodic background extraction every ~20 prompts |
| PreToolUse | Claude reads/searches files (Read, Glob, Grep) | Hybrid search on the search query — injects relevant memories as additional context |
| PostToolUse | Claude edits files (Edit, Write) | Buffers the change (file path, old/new content) for later extraction |
| PostToolUse | Claude runs commands (Bash) | Buffers the command and output for later extraction |
| Stop | Claude finishes responding | LLM extraction — entities, relationships, and memories extracted from the session buffer |
| SubagentStart | Subagent spawned | Injects knowledge graph context (entities, memories) into the subagent |
| SessionEnd | Session closes | Cleanup — flushes remaining buffers |
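The buffering side of these hooks can be sketched as a simple append-only session log. This is an illustrative sketch, assuming a JSONL layout and per-session file naming — not the plugin's actual on-disk format:

```python
import json
import tempfile
from pathlib import Path

def buffer_event(buffer_dir: Path, session_id: str, event: dict) -> None:
    """Append one hook event to the session's buffer file (JSON Lines)."""
    path = buffer_dir / f"{session_id}.jsonl"
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

# Hypothetical events from two different hooks in one session
buffer_dir = Path(tempfile.mkdtemp())
buffer_event(buffer_dir, "abc123", {"hook": "UserPromptSubmit", "prompt": "fix the tests"})
buffer_event(buffer_dir, "abc123", {"hook": "PostToolUse", "tool": "Edit", "file": "main.py"})
```

Appending rather than rewriting keeps each hook invocation cheap, which matters because hooks run on every prompt and tool call.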
- Hybrid search — every injection combines semantic similarity (fastembed cosine) with keyword matching (FTS5 BM25), merges and deduplicates results
- 3-tier project awareness — exact project matches rank highest, then global memories, then cross-project. Composite scoring: `similarity * 0.7 + importance * 0.2 + recency * 0.1`
- Event buffering — prompts, file reads, edits, and commands are logged to per-session buffer files. Extraction runs periodically (every ~20 prompts) in the background without blocking
- Deduplication — before storing a new memory, checks cosine similarity against existing memories (threshold 0.92) to prevent duplicates
- Async extraction — LLM extraction spawns as a background process so it never blocks your workflow
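The scoring and dedup logic above can be sketched in a few lines. The weights and the 0.92 threshold come from this README; the importance normalization and recency decay are assumptions for illustration:

```python
import math

def composite_score(similarity: float, importance: int, age_days: float) -> float:
    """similarity * 0.7 + importance * 0.2 + recency * 0.1, all scaled to 0..1."""
    importance_norm = importance / 10.0        # importance is rated 1-10
    recency = math.exp(-age_days / 30.0)       # assumed decay: newer -> closer to 1
    return similarity * 0.7 + importance_norm * 0.2 + recency * 0.1

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def is_duplicate(new_vec, existing_vecs, threshold: float = 0.92) -> bool:
    """Skip storing a memory whose embedding is near-identical to an existing one."""
    return any(cosine(new_vec, v) >= threshold for v in existing_vecs)
```

A perfectly similar, maximally important, brand-new memory scores exactly 1.0; lowering any of the three terms lowers the rank proportionally to its weight.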
| Tool | Description |
|---|---|
| `remember` | Store a memory with optional entity links and importance (1-10) |
| `recall` | Semantic search with 3-tier project awareness |
| `keyword_search` | Full-text keyword search (BM25) with AND/OR/NOT support |
| `forget` | Delete a memory by ID |
| Tool | Description |
|---|---|
| `add_entity` | Create an entity (person, project, technology, concept, file, org, hardware) |
| `add_relationship` | Link entities (uses, maintains, depends_on, works_with, contains, created, part_of) |
| `get_entity` | Full entity detail with linked memories and relationships |
| `search_entities` | Semantic entity search |
| `graph_traverse` | Walk the graph from an entity with configurable depth |
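What `graph_traverse` does can be illustrated with a depth-limited breadth-first walk. The adjacency-list representation and function signature here are assumptions for the sketch, not the plugin's actual schema:

```python
from collections import deque

def traverse(graph: dict[str, list[tuple[str, str]]], start: str, max_depth: int = 2):
    """BFS from `start`, returning (entity, relation, neighbor, depth) up to max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # don't expand past the depth limit
        for relation, neighbor in graph.get(node, []):
            results.append((node, relation, neighbor, depth + 1))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return results

# Toy graph using relationship types from the table above
graph = {
    "claude-knowledge-graph": [("uses", "sqlite-vec"), ("uses", "fastembed")],
    "sqlite-vec": [("part_of", "SQLite")],
}
```

With `max_depth=1` only the two direct `uses` edges come back; `max_depth=2` also surfaces that sqlite-vec is part of SQLite.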
| Tool | Description |
|---|---|
| `extract` | LLM-powered extraction from text into entities, relationships, memories |
| `stats` | Memory/entity/relationship counts and DB size |
| `export` | Full graph as JSON |
| `projects` | List all tracked projects with counts |
Three search modes that work together:
- Semantic search — fastembed (all-MiniLM-L6-v2, 384-dim) + sqlite-vec KNN. Composite scoring: `similarity * 0.7 + importance * 0.2 + recency * 0.1`
- Keyword search — SQLite FTS5 with BM25 ranking. Supports `AND`, `OR`, `NOT`, and phrase queries
- Graph traversal — walk entity relationships to find connected knowledge
Hooks use hybrid search (semantic + keyword combined) to maximize recall.
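The keyword leg is plain SQLite. A minimal FTS5/BM25 sketch — the table and column names are illustrative, not the plugin's actual schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE memories USING fts5(content)")
db.executemany("INSERT INTO memories (content) VALUES (?)", [
    ("the API server depends on sqlite-vec",),
    ("fastembed produces 384-dim embeddings",),
    ("deployment notes for the staging server",),
])

# FTS5 supports AND/OR/NOT and phrase queries; bm25() ranks matches (lower = better)
rows = db.execute(
    "SELECT content, bm25(memories) FROM memories "
    "WHERE memories MATCH 'server NOT staging' ORDER BY bm25(memories)"
).fetchall()
```

Here `server NOT staging` matches only the first row: the third row contains "server" but is excluded by the `NOT` term.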
Knowledge extraction uses a `claude -p` subprocess by default — it inherits your existing Claude Code login, so no API key is needed.
Fallback chain:
1. `claude -p` subprocess (default, zero config)
2. `ANTHROPIC_API_KEY` or `KNOWLEDGE_GRAPH_LLM_API_KEY` env var
3. OpenAI-compatible local model (Ollama, LM Studio, vLLM)
All optional environment variables:
| Variable | Default | Description |
|---|---|---|
| `KNOWLEDGE_GRAPH_DB` | `~/.claude/knowledge-graph/kg.db` | Database path |
| `KNOWLEDGE_GRAPH_EXTRACT_INTERVAL` | `20` | Prompts between extractions |
| `KNOWLEDGE_GRAPH_LLM_PROVIDER` | `auto` | `claude`, `anthropic`, or `openai-compatible` |
| `KNOWLEDGE_GRAPH_LLM_BASE_URL` | — | Base URL for local models |
| `KNOWLEDGE_GRAPH_LLM_MODEL` | `claude-haiku-4-5-20251001` | Model for extraction |
| `KNOWLEDGE_GRAPH_INJECT_MAX` | `5` | Max memories injected per prompt |
| `KNOWLEDGE_GRAPH_INJECT_MIN_SIMILARITY` | `0.3` | Min cosine similarity threshold |
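Putting a few of these together, here is one way to point extraction at a local OpenAI-compatible endpoint. The URL and model name are illustrative (the port is Ollama's default), not recommended values:

```shell
# Use a local model for extraction instead of claude -p (values illustrative)
export KNOWLEDGE_GRAPH_LLM_PROVIDER=openai-compatible
export KNOWLEDGE_GRAPH_LLM_BASE_URL=http://localhost:11434/v1
export KNOWLEDGE_GRAPH_LLM_MODEL=llama3.1:8b
export KNOWLEDGE_GRAPH_INJECT_MAX=8
```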
All data is stored in `~/.claude/knowledge-graph/`:
- `kg.db` — SQLite database (memories, entities, relationships, embeddings)
- `buffers/` — session event buffers (temporary, auto-cleaned)
```shell
# macOS / Linux
./uninstall.sh

# Windows
powershell -ExecutionPolicy Bypass -File uninstall.ps1
```

- 2 pip deps: fastembed (ONNX embeddings), sqlite-vec (vector search)
- Hand-rolled MCP server (~80 lines, no framework)
- SQLite for everything: vectors, FTS5, graph, metadata
- `claude -p` for LLM calls (zero-config auth)