diff --git a/skills/memory-protocol/README.md b/skills/memory-protocol/README.md
new file mode 100644
index 0000000..2f66078
--- /dev/null
+++ b/skills/memory-protocol/README.md
@@ -0,0 +1,57 @@
+# Memory Protocol Skill
+
+Gives any AI agent a persistent, structured memory using a single Markdown file (`agent-memory.md`). Defines when to read memory, how to write new entries, rules for keeping it clean, and a `/dream` consolidation procedure for when the file grows too large.
+
+## Purpose
+
+Long-running agents lose context between sessions. This skill solves that by:
+
+- Reading memory before answering domain-specific questions
+- Writing structured entries after significant tasks
+- Replacing stale entries instead of appending duplicates
+- Consolidating the file when it exceeds 200 lines via `/dream`
+
+## When to Use This Skill
+
+Activate when the agent:
+
+- Receives a thematic question ("how does X work in this project?")
+- Completes a significant task (new feature, debug session, config change)
+- Is asked to remember something
+- Encounters `/dream` command
+
+**Activation keywords**: "remember", "what do you know about", "update memory", "/dream", any domain-specific question the agent might already know the answer to
+
+## Memory File Location
+
+Default path (adapt to your vault/project structure):
+```
+/data/obsidian-vault/ProjectName/agent-memory.md
+```
+
+## Entry Format
+
+```markdown
+### YYYY-MM-DD - Title
+- Scope: user | project | session | team
+- Content: ...
+- Source: observation
+- Expires: permanent
+```
+
+## The /dream Command
+
+When `agent-memory.md` exceeds **200 lines**:
+
+1. Read the full file
+2. Group entries by topic, remove outdated duplicates
+3. Save a snapshot of the old version to `Memory/dream-YYYY-MM-DD.md`
+4. Write a condensed `agent-memory.md` (target: 50–70 lines)
+
+This prevents context window bloat without losing historical data.
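The mechanical parts of the `/dream` procedure above (the 200-line check and the step 3 snapshot) could be sketched in Python roughly as follows. This is a minimal illustration, not part of the skill itself: the function names `needs_dream` and `snapshot_memory` are hypothetical, the caller is assumed to supply the paths, and the grouping and condensing in steps 2 and 4 are left to the agent.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional

DREAM_THRESHOLD = 200  # line limit quoted in the README; adjust if the skill changes


def needs_dream(memory_file: Path, threshold: int = DREAM_THRESHOLD) -> bool:
    """Return True when the memory file has more lines than the threshold."""
    if not memory_file.exists():
        return False
    with memory_file.open(encoding="utf-8") as fh:
        return sum(1 for _ in fh) > threshold


def snapshot_memory(memory_file: Path, snapshot_dir: Path,
                    today: Optional[date] = None) -> Path:
    """Copy the current memory file to <snapshot_dir>/dream-YYYY-MM-DD.md (step 3)."""
    today = today or date.today()
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    target = snapshot_dir / f"dream-{today.isoformat()}.md"
    shutil.copy2(memory_file, target)  # hypothetical helper: plain file copy
    return target
```

Taking the snapshot before anything is rewritten keeps the old version recoverable even if the condensed file turns out to have dropped something useful.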
+
+## Rules
+
+- **Replace, don't append**: If a newer fact supersedes an old entry on the same topic, edit the file
+- **Never store**: API keys, tokens, passwords, PII
+- **Be concrete**: Entries should be actionable in a future session, not vague notes
diff --git a/skills/memory-protocol/SKILL.md b/skills/memory-protocol/SKILL.md
new file mode 100644
index 0000000..b79db9f
--- /dev/null
+++ b/skills/memory-protocol/SKILL.md
@@ -0,0 +1,63 @@
+---
+name: memory-protocol
+description: |
+  Use this skill before thematic answers and after significant tasks.
+  Defines how to read, write, and consolidate persistent agent memory
+  stored in a structured Markdown file.
+triggers:
+  - remember
+  - what do you know about
+  - update memory
+  - /dream
+---
+
+# Memory Protocol
+
+## Read Before Answering
+
+Before any thematic answer — check if the topic is already known:
+
+```bash
+cat /data/obsidian-vault/ProjectName/agent-memory.md
+```
+
+For deep search across the full vault:
+
+```bash
+grep -ril "TOPIC" /data/obsidian-vault/ --include="*.md"
+```
+
+---
+
+## Write After Significant Tasks
+
+After any significant task — update `agent-memory.md` with new facts.
+
+**Entry format:**
+
+```markdown
+### YYYY-MM-DD - Title
+- Scope: user | project | session | team
+- Content: ...
+- Source: observation
+- Expires: permanent
+```
+
+---
+
+## Rules
+
+- **Replace stale entries** on the same topic — don't accumulate duplicates
+- **Never save**: API keys, tokens, passwords, PII
+- Entries must be concrete and actionable in a future session
+
+---
+
+## /dream — Consolidation
+
+Run when `agent-memory.md` exceeds **200 lines**:
+
+1. Read the full `agent-memory.md`
+2. Group entries by topic, remove outdated and duplicate facts
+3. Save snapshot of the old version → `Memory/dream-YYYY-MM-DD.md`
+4. Write condensed `agent-memory.md` (target: 50–70 lines)
diff --git a/skills/obsidian-vault/README.md b/skills/obsidian-vault/README.md
new file mode 100644
index 0000000..e80763b
--- /dev/null
+++ b/skills/obsidian-vault/README.md
@@ -0,0 +1,67 @@
+# Obsidian Vault Skill
+
+A two-layer knowledge architecture for Obsidian vaults used with AI agents. Enforces strict separation between raw input data and synthesized Wiki notes, preventing agents from accidentally writing to the semantic layer.
+
+## Purpose
+
+AI agents working with Obsidian tend to mix raw data and synthesized knowledge. This skill establishes:
+
+- **Raw Layer** (`Reports/`, `Тренды/`, `Sessions/`, `Knowledge/`, `Inbox/`) — source material with `type: raw_material`
+- **Wiki Layer** (`Wiki/`) — atomic synthesis notes with `type: summary`, written only by a dedicated Wiki Compiler step
+
+## When to Use This Skill
+
+Activate when the agent needs to:
+
+- Search for information in the vault
+- Save a note or report to the vault
+- Process files from an Inbox folder
+- Understand the vault architecture before writing any files
+
+**Activation keywords**: "save to obsidian", "find in notes", "search the vault", "add to vault", "organize notes"
+
+## Architecture
+
+```
+vault/
+├── Wiki/          ← SEMANTIC LAYER — synthesis notes only. Read-only for most agents.
+├── Reports/       ← Raw Layer: reports, articles, research
+├── Тренды/        ← Raw Layer: trend analysis, YouTube transcripts
+├── Sessions/      ← Raw Layer: session notes, AI chat logs
+├── Knowledge/     ← Raw Layer: instructions, guides, setup docs
+├── Inbox/         ← Raw Layer: incoming unprocessed files
+├── Memory/        ← Agent memory snapshots
+├── _INDEX.md      ← Master index, update after every file change
+└── _TEMPLATE.md   ← Frontmatter template + Master Tag Dictionary
+```
+
+## Three-Stage Pipeline
+
+For cron or manual vault maintenance:
+
+| Stage | File | Scope |
+|-------|------|-------|
+| 1. Librarian | `agents/obsidian-librarian.md` | All folders except `Wiki/` — frontmatter, tags, routing |
+| 2. Wiki Compiler | `agents/compiler-agent.md` | `Wiki/` — synthesize raw data into Sapling notes |
+| 3. Semantic Linker | `src/semantic_linker.ts` | `Wiki/` only — horizontal semantic links |
+
+## Frontmatter Format (Raw Layer)
+
+```yaml
+---
+title: "Human readable title"
+date: YYYY-MM-DD
+category: cosmetology | psychology | business | development
+tags: [tag1, tag2, tag_with_underscore]
+type: raw_material
+semantic_weights:
+---
+```
+
+## Key Rules
+
+- Wiki Layer is **read-only** for all agents except Wiki Compiler
+- Only `type: raw_material` is written to Raw Layer
+- Horizontal links exist **only** within Wiki Layer
+- Raw files are isolated — no cross-links between Raw files
+- Every file gets frontmatter; update `_INDEX.md` after any change
diff --git a/skills/obsidian-vault/SKILL.md b/skills/obsidian-vault/SKILL.md
new file mode 100644
index 0000000..601d012
--- /dev/null
+++ b/skills/obsidian-vault/SKILL.md
@@ -0,0 +1,109 @@
+---
+name: obsidian-vault
+description: |
+  Use this skill before any read or write operation on an Obsidian vault.
+  Defines the two-layer architecture (Raw vs Wiki), frontmatter rules,
+  search strategy, and the three-stage pipeline for vault maintenance.
+triggers:
+  - save to obsidian
+  - find in notes
+  - search the vault
+  - add to vault
+  - organize notes
+  - update index
+---
+
+# Obsidian Vault Architecture
+
+## Two Layers — Strict Separation
+
+### Raw Layer (your write zone)
+
+Folders: `Inbox/`, `Reports/`, `Тренды/`, `Sessions/`, `Knowledge/`
+
+- All files must have `type: raw_material` in frontmatter
+- No horizontal cross-links between Raw files
+- Source of truth for input data
+
+### Wiki Layer (read-only for most agents)
+
+Folder: `Wiki/`
+
+- Contains atomic synthesis notes (`type: summary`)
+- The semantic knowledge base — always search here first
+- Written **only** by the Wiki Compiler agent (`/compile_wiki`)
+- **NEVER write to `Wiki/` directly** — if asked, redirect to Wiki Compiler
+
+---
+
+## Search Strategy
+
+Always search Wiki first, then Raw if needed:
+
+```bash
+# Priority 1 — semantic layer
+grep -ril "TOPIC" /data/obsidian-vault/ProjectName/Wiki --include="*.md"
+
+# Priority 2 — raw sources (only if Wiki insufficient)
+grep -ril "TOPIC" /data/obsidian-vault/ProjectName/Reports --include="*.md"
+```
+
+---
+
+## Frontmatter (Raw Layer)
+
+Every Raw file must have:
+
+```yaml
+---
+title: "Human readable title"
+date: YYYY-MM-DD
+category: cosmetology | psychology | business | development
+tags: [tag1, tag2, tag_with_underscore]
+type: raw_material
+semantic_weights:
+---
+```
+
+**Tag rules:**
+- Use underscores, no spaces: `skin_care` not `skin care`
+- No pure numbers: `year2026` not `2026`
+- 3–7 tags per file
+- Check `_TEMPLATE.md` Master Tag Dictionary before creating new tags
+- Add new tags to `_TEMPLATE.md` when you create them
+
+---
+
+## Three-Stage Pipeline
+
+| Stage | Responsible | Scope |
+|-------|------------|-------|
+| 1. Librarian | `agents/obsidian-librarian.md` | Inbox/ + Raw folders: frontmatter, tags, routing |
+| 2. Wiki Compiler | `agents/compiler-agent.md` | Wiki/: synthesize → Sapling notes |
+| 3. Semantic Linker | `src/semantic_linker.ts` | Wiki/ only: horizontal semantic links |
+
+**Rule**: each stage works only in its own zone.
+
+---
+
+## File Routing (from Inbox)
+
+- Session notes, chat logs → `Sessions/`
+- YouTube research, media transcripts → `Тренды/`
+- Reports, analysis, articles → `Reports/`
+- Instructions, guides, setup docs → `Knowledge/`
+
+After routing, update `_INDEX.md`:
+```
+- [[FileName]] — brief description (date, main tags)
+```
+
+---
+
+## Absolute Prohibitions
+
+- ❌ Never create files in `Wiki/` — that's Wiki Compiler's job
+- ❌ Never use `type: summary` — only `raw_material`
+- ❌ Never add cross-links between Raw files
+- ❌ Never modify tags of already-tagged files
+- ❌ Never delete files — move to `Archive/` instead
diff --git a/skills/search-retrieval/README.md b/skills/search-retrieval/README.md
new file mode 100644
index 0000000..3fd6230
--- /dev/null
+++ b/skills/search-retrieval/README.md
@@ -0,0 +1,60 @@
+# Search & Data Retrieval Skill
+
+Intelligent decision tree for selecting the right search or data retrieval tool. Prevents common mistakes like using Firecrawl for general search (wastes quota) or Google Grounding to find YouTube links (it can't return them).
+
+## Purpose
+
+This skill teaches the agent a **4-tool decision tree** for any external data task:
+
+- **Google Grounding** — fast facts, news synthesis, date/price checks
+- **SearXNG** — structured search with URLs and snippets (YouTube, news, science, images)
+- **Gemini fetch** — read a known URL (articles, docs, blogs)
+- **Firecrawl** — scrape JS/SPA sites or extract structured tables (use sparingly)
+
+## When to Use This Skill
+
+Activate when the agent needs to:
+
+- Answer a factual question about current events
+- Find YouTube videos on a topic
+- Search for articles or papers with source links
+- Read the content of a specific URL
+- Scrape a dynamic site or extract table data
+
+**Activation keywords**: "search", "find", "look up", "open the link", "read the article", "what's on [URL]", "latest news about", "YouTube videos on"
+
+## Key Rules
+
+### Tool Selection Matrix
+
+| Task | Tool |
+|------|------|
+| Facts, news, definitions | Google Grounding |
+| YouTube links & trends | SearXNG (category: videos) |
+| Topic search with URLs | SearXNG (category: general) |
+| Scientific papers | SearXNG (category: science) |
+| Read a specific URL | Gemini fetch |
+| JS/SPA site or tables | Firecrawl |
+
+### Critical HTTP Rule
+
+Always use Python `requests` with `json=` parameter for HTTP calls — never `curl` for requests with text bodies. This prevents Cyrillic and Unicode encoding issues.
+
+### Firecrawl Conservation
+
+Firecrawl has a 500-request limit. Use it **only** when Gemini fetch returns empty results or the target is a JavaScript-rendered page.
+
+## Trend Monitoring Combo
+
+For multi-step trend research:
+1. SearXNG `videos` + `youtube` engine → get URL list
+2. Gemini fetch → read descriptions/transcripts
+3. Firecrawl → only if fetch fails
+4. Google Grounding → fact-check and add context
+
+## Anti-Hallucination Rules
+
+- **Direct links only** — never provide encoded redirect URLs
+- **YouTube format** — always `watch?v=XXXXXXXXXXX` (11-char ID)
+- **No fabrication** — if no direct link found, say "Link not found"
+- **Data freshness** — for 2024–2026 trends, rely on live search results, not internal knowledge
diff --git a/skills/search-retrieval/SKILL.md b/skills/search-retrieval/SKILL.md
new file mode 100644
index 0000000..1ae3371
--- /dev/null
+++ b/skills/search-retrieval/SKILL.md
@@ -0,0 +1,146 @@
+---
+name: search-retrieval
+description: |
+  Use this skill for any search, URL fetch, or external data retrieval task.
+  Determines the correct tool (Google Grounding, SearXNG, Gemini fetch, Firecrawl)
+  based on a decision tree, with anti-hallucination rules for links.
+triggers:
+  - search
+  - find
+  - look up
+  - open the link
+  - read the article
+  - latest news
+  - YouTube videos on
+---
+
+# Search & Data Retrieval
+
+## HTTP Requests Rule (CRITICAL)
+
+- **ALWAYS** use Python `requests` with `json=` parameter for ALL HTTP calls
+- **NEVER** use `curl` for requests with text bodies — breaks Unicode/Cyrillic encoding
+
+```python
+import requests, os
+response = requests.post(url, headers={...}, json={...}, timeout=30)
+```
+
+---
+
+## Decision Tree — Choose the Right Tool
+
+### 1. Google Grounding (built-in) — first choice for facts
+
+- Fast facts, definitions, general questions
+- Current news (synthesizes answer with sources)
+- Verifying information, prices, release dates
+- ⚠️ Does NOT return YouTube links
+- ⚠️ No filters by time or domain
+
+### 2. SearXNG — structured search with links
+
+- Endpoint: `POST $SEARXNG_WEBHOOK_URL`
+- Headers: `{"X-API-Key": "$SEARXNG_API_KEY", "Content-Type": "application/json"}`
+- Returns list of URLs + snippets, NOT a synthesized answer
+
+```python
+import requests, os
+response = requests.post(
+    os.getenv("SEARXNG_WEBHOOK_URL"),
+    headers={
+        "X-API-Key": os.getenv("SEARXNG_API_KEY"),
+        "Content-Type": "application/json"
+    },
+    json={
+        "query": "YOUR QUERY here",
+        "category": "general",
+        "time_range": "",
+        "engines": ""
+    },
+    timeout=30
+)
+results = response.json()
+```
+
+| Task | category | engines | time_range |
+|------|----------|---------|------------|
+| YouTube trends/videos | videos | youtube | month |
+| Broad video search | videos | (empty) | month |
+| News for a period | news | google news,bing news | week |
+| Scientific papers | science | google scholar,semantic scholar,pubmed | (empty) |
+| IT documentation | it | (empty) | (empty) |
+| Images | images | google images,bing images | (empty) |
+| General search (default) | general | (empty) | (empty) |
+
+### 3. Gemini fetch (built-in) — read a specific URL
+
+- Use when you have a concrete URL and need to read its content
+- Suitable for articles, blogs, docs, regular websites
+- Free, unlimited — always try BEFORE Firecrawl
+- Triggers: "read the article", "open the link", "what's written on [URL]"
+
+### 4. Firecrawl — only for complex scraping of a specific URL
+
+- Endpoint: `POST https://api.firecrawl.dev/v1/scrape`
+- Headers: `{"Authorization": "Bearer $FIRECRAWL_API_KEY"}`
+- Body: `{"url": "TARGET_URL", "formats": ["markdown"]}`
+- ⚠️ Limit: 500 requests — use sparingly
+- Use ONLY when:
+  - Gemini fetch returned empty or incomplete result
+  - JavaScript/SPA site (dynamic content)
+  - Need tables and structured data
+
+```python
+import requests, os
+
+response = requests.post(
+    "https://api.firecrawl.dev/v1/scrape",
+    headers={"Authorization": f"Bearer {os.getenv('FIRECRAWL_API_KEY')}"},
+    json={"url": url, "formats": ["markdown"]},
+    timeout=30
+)
+content = response.json().get("data", {}).get("markdown", "")
+```
+
+---
+
+## Tool Selection Matrix
+
+| Task | Tool |
+|------|------|
+| Facts, news, definitions | Google Grounding |
+| YouTube links & trends | SearXNG (videos) |
+| Topic search with links | SearXNG (general) |
+| Scientific papers | SearXNG (science) |
+| Read a specific URL | Gemini fetch |
+| JS/SPA site or tables | Firecrawl |
+
+---
+
+## Trend Monitoring Combo
+
+1. SearXNG (category: videos, engines: youtube, time_range: month) → URL list
+2. Gemini fetch → read descriptions/transcripts
+3. Firecrawl → only if fetch failed
+4. Google Grounding → verify facts and context
+
+---
+
+## Forbidden
+
+- `curl` for requests with text (breaks encoding)
+- Firecrawl for general topic search (wastes quota)
+- Firecrawl instead of Gemini fetch for regular articles
+- Google Grounding to find YouTube links (doesn't return them)
+- Parallel duplicate requests without a reason
+
+---
+
+## Anti-Hallucination & Link Rules
+
+- **Direct Links Only**: Never provide long encoded redirect URLs
+- **YouTube Format**: Always use `watch?v=XXXXXXXXXXX` (exactly 11-char ID)
+- **No Fabrication**: If no direct link in results — say "Link not found"
+- **Source Verification**: Prioritize URLs from live search results over internal knowledge
+- **Data Freshness**: For recent events (2024–2026), rely exclusively on search results
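
The link rules above (direct links only, `watch?v=` with an 11-character ID, "Link not found" otherwise) could be enforced with a small check along these lines. This is only a sketch under stated assumptions: the helper names `canonical_youtube_url` and `link_or_not_found` are hypothetical, the accepted host list is an illustrative choice, and the ID alphabet (letters, digits, `-`, `_`) is assumed from the commonly observed format rather than taken from this skill.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed ID shape: exactly 11 characters from letters, digits, "-" and "_".
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")

# Illustrative host list; extend it if other YouTube hosts must be accepted.
_WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def canonical_youtube_url(url: str) -> Optional[str]:
    """Return a direct watch?v=<id> URL, or None when no valid ID is present."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    elif host in _WATCH_HOSTS:
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        # Redirect wrappers and unknown hosts are rejected rather than decoded.
        return None
    if not _VIDEO_ID.fullmatch(candidate):
        return None
    return f"https://www.youtube.com/watch?v={candidate}"


def link_or_not_found(url: str) -> str:
    """Apply the 'No Fabrication' rule: never guess, report a missing link instead."""
    return canonical_youtube_url(url) or "Link not found"
```

Rejecting redirect wrappers outright, instead of trying to unwrap them, keeps the check conservative: a link is passed on only when it already points at a recognised YouTube host.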