Micro RAG

A minimalist command-line tool for Retrieval-Augmented Generation (RAG) chat with your documents using local LLMs via Ollama.

Features

Chat with your documents using RAG technology
Uses local LLMs via Ollama
Simple and intuitive command-line interface
Supports various document formats
Streaming responses with source citations
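
For readers new to RAG, the retrieval step behind these features can be pictured roughly as follows. This is a minimal sketch, not the tool's actual implementation: the function names, the `(source, text, vector)` index layout, the file names, and the three-dimensional toy vectors are all assumptions made for illustration. A real embedding model such as nomic-embed-text returns much longer vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, k=2):
    """Return the k index entries most similar to the query vector.

    `index` is a list of (source, text, vector) tuples; this layout is
    hypothetical and chosen only to keep the example short.
    """
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[2]), reverse=True)
    return ranked[:k]


# Toy data: hand-written vectors standing in for real embeddings.
index = [
    ("a.txt", "text about neurons", [0.9, 0.1, 0.0]),
    ("b.txt", "text about chess", [0.0, 0.2, 0.9]),
    ("c.txt", "text about perceptrons", [0.7, 0.3, 0.1]),
]
query = [1.0, 0.0, 0.0]

top = retrieve(query, index, k=2)
sources = [source for source, _, _ in top]
print(sources)  # ['a.txt', 'c.txt']
```

The retrieved chunks would then be passed to the chat model as context, and their sources are what a citation can point back to.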

Installation

# Clone this repository
git clone https://github.com/clstaudt/micro-rag.git
cd micro-rag

# Install dependencies
pip install -r requirements.txt

# Make the script executable
chmod +x micro_rag.py

Usage

Basic usage:

python micro_rag.py /path/to/documents

With custom models:

python micro_rag.py /path/to/documents --chat-model "llama3:latest" --embed-model "nomic-embed-text"

Full options:

python micro_rag.py --help
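
For orientation, a command-line interface with the options and defaults documented in this README could be declared with `argparse` roughly like this. It is a sketch under assumptions: the help strings are paraphrased, and the actual `micro_rag.py` may be organized differently or use another argument parser.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the options documented in this README."""
    parser = argparse.ArgumentParser(
        prog="micro_rag.py",
        description="Chat with your documents using local LLMs via Ollama.",
    )
    parser.add_argument("documents_dir", help="Directory containing the documents to chat with")
    parser.add_argument("--chat-model", default="orca-mini:13b", help="Ollama chat model to use")
    parser.add_argument("--embed-model", default="nomic-embed-text", help="Ollama embedding model to use")
    parser.add_argument(
        "--ollama-host",
        # Assumption: the environment variable is consulted when the flag is absent.
        default=os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
        help="Ollama host URL",
    )
    parser.add_argument("--chunk-size", type=int, default=512, help="Size of document chunks")
    parser.add_argument("--chunk-overlap", type=int, default=50, help="Overlap between document chunks")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["data/wikipedia-ai/", "--chat-model", "olmo2:7b"])
print(args.documents_dir, args.chat_model, args.embed_model, args.chunk_size, args.chunk_overlap)
```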

Example

Minimal RAG Chat Tool
Documents: data/wikipedia-ai/
Chat Model: olmo2:7b
Embedding Model: nomic-embed-text
--------------------------------------------------
Using Ollama at: http://localhost:11434
✔ Chat model olmo2:7b loaded successfully
✔ Embedding model nomic-embed-text loaded successfully
⠹ Loading documents from data/wikipedia-ai/ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/3 • 0:00:00
✓ Loaded 3 document(s)
  Building vector index ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0:00:05
✓ Vector index built successfully

RAG Chat initialized. Type 'exit' or 'quit' to end the session.


You: Who proposed the first artificial neuron?

AI:  Warren McCulloch and Walter Pitts proposed the first artificial neuron in their 1943 paper titled "A Logical Calculus of Ideas Immanent in Nervous Activity."

--------------------------------------------------

Command-line Arguments

documents_dir: Directory containing the documents to chat with (required)
--chat-model: Ollama chat model to use (default: "orca-mini:13b")
--embed-model: Ollama embedding model to use (default: "nomic-embed-text")
--ollama-host: Ollama host URL (default: http://localhost:11434 or OLLAMA_HOST env variable)
--chunk-size: Size of document chunks (default: 512)
--chunk-overlap: Overlap between document chunks (default: 50)
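
To make the interaction between --chunk-size and --chunk-overlap concrete, here is a minimal sliding-window splitter. It is a sketch under assumptions: it counts characters, whereas the tool may count tokens or split on sentence boundaries, and `chunk_text` is a hypothetical helper rather than a function from `micro_rag.py`.

```python
def chunk_text(text, chunk_size=512, chunk_overlap=50):
    """Split text into windows of chunk_size characters.

    Each window starts (chunk_size - chunk_overlap) characters after the
    previous one, so neighbouring chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


lengths = [len(c) for c in chunk_text("x" * 1200)]
print(lengths)  # [512, 512, 276]

small = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(small)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

With the defaults, each chunk begins 462 characters after the previous one. A larger overlap reduces the chance that a relevant sentence is cut in half at a chunk boundary, at the cost of more chunks to embed.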

Requirements

Python 3.8+
Ollama running locally or remotely
Chat and embedding models available in Ollama
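
One way to check these requirements before starting a session is to ask the Ollama server which models it has. The sketch below uses Ollama's `GET /api/tags` endpoint, which lists locally available models. The helper names are hypothetical, and the sample `available` list is made up so that the example runs without a server.

```python
import json
import os
import urllib.request


def resolve_host(cli_value=None):
    """Pick the Ollama URL: explicit value, then OLLAMA_HOST, then the local default."""
    return cli_value or os.environ.get("OLLAMA_HOST") or "http://localhost:11434"


def normalize(name):
    """Ollama treats a model name without a tag as '<name>:latest'."""
    return name if ":" in name else name + ":latest"


def missing_models(required, available):
    """Return the required models that are not in the available list."""
    have = {normalize(name) for name in available}
    return [model for model in required if normalize(model) not in have]


def list_models(host, timeout=5):
    """Query GET /api/tags on a running Ollama server and return model names."""
    url = host.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]


# Sample data so the example runs offline. With a running server, use:
#     available = list_models(resolve_host())
available = ["nomic-embed-text:latest"]
missing = missing_models(["olmo2:7b", "nomic-embed-text"], available)
print("Missing models:", missing)  # Missing models: ['olmo2:7b']
```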

Notes

The first run will download the models if they aren't already available in Ollama
Type 'exit' or 'quit' to end the chat session
Press Ctrl+C to interrupt the chat
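
The session behaviour described in these notes can be sketched as a small read loop. This is an illustration under assumptions, not the code in `micro_rag.py`: `chat_loop` and `answer` are hypothetical names, and treating Ctrl+C as "end the session" is one possible reading of "interrupt the chat". The `read` and `write` parameters exist only so the loop can be driven by scripted input here.

```python
def chat_loop(answer, read=input, write=print):
    """Read questions until the user types 'exit' or 'quit'.

    `answer` is a hypothetical callable that maps a question to a reply.
    """
    while True:
        try:
            question = read("You: ").strip()
        except (KeyboardInterrupt, EOFError):
            # Ctrl+C (or end of input) ends the session cleanly.
            write("Session interrupted.")
            break
        if question.lower() in {"exit", "quit"}:
            break
        if not question:
            continue  # ignore empty lines
        write(f"AI: {answer(question)}")


# Scripted input stands in for a real terminal session.
scripted = iter(["Who proposed the first artificial neuron?", "quit"])
replies = []
chat_loop(lambda q: f"echo: {q}", read=lambda prompt: next(scripted), write=replies.append)
print(replies)
```

In interactive use, calling `chat_loop(answer)` with the default `input` and `print` gives the prompt-and-reply cycle shown in the Example section.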

License

MIT
