A small-scale PDF-based Retrieval-Augmented Generation (RAG) system designed for correctness, clear separation of concerns, and practical efficiency.
- Persistent Vector Storage: Uses Pinecone for scalable vector storage
- Incremental Indexing: Selective reindexing for changed PDFs only
- Grounded Answer Generation: Pluggable LLM layer with answer grounding
- Local Caching: SQLite-backed caching to avoid unnecessary LLM calls
- CLI Interface: Command-line tools for indexing and querying
- FastAPI API: HTTP endpoints for programmatic access
- Analytics: Query logging for internal analytics
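As an illustration of how selective reindexing can work, here is a minimal sketch that detects changed PDFs by comparing content hashes against a manifest from the previous run. The function names and the JSON manifest file are assumptions made for this example, not the project's actual implementation.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def find_changed_pdfs(data_dir: Path, manifest_path: Path) -> list[Path]:
    """Return PDFs that are new or modified since the last run.

    The manifest is a hypothetical JSON file mapping file name -> content hash.
    It is rewritten on every call so the next run compares against this one.
    """
    previous = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    current = {pdf.name: file_sha256(pdf) for pdf in sorted(data_dir.glob("*.pdf"))}
    changed = [data_dir / name for name, digest in current.items() if previous.get(name) != digest]
    manifest_path.write_text(json.dumps(current, indent=2))
    return changed
```

Only the files returned by `find_changed_pdfs` would need to be re-chunked and re-embedded. This sketch does not handle deleted PDFs; a fuller version would also remove their vectors from the index.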
The system follows a clean pipeline: PDF -> chunk -> embed -> Pinecone -> retrieve -> LLM
- Ingestion: Loads PDFs from data/, cleans text, splits into chunks
- Embedding: Uses sentence-transformers/all-MiniLM-L6-v2 for embeddings
- Vector Storage: Pinecone for persistent vector storage
- Retrieval: Semantic search over embedded chunks
- Generation: LLM-powered answer generation with grounding
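The chunking step of the pipeline can be sketched as a fixed-size character window with overlap. This is a simplified illustration under stated assumptions: the function name, the character-based splitting, and the default sizes are all hypothetical, and the project's own splitter may work on sentences or tokens instead.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Whitespace is normalized first as a stand-in for the "cleans text" step.
    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears intact in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    cleaned = " ".join(text.split())
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + chunk_size])
        # Stop once this window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(cleaned):
            break
    return chunks
```

Each chunk would then be embedded and upserted to the vector store together with metadata such as the source file name, so retrieved passages can be traced back to their PDF.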
- Clone the repository:
    git clone https://github.com/yourusername/agentic-rag.git
    cd agentic-rag
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables:
    cp .env.example .env
    # Edit .env with your API keys (Pinecone, Google AI, etc.)
Index documents:
    python -m app.main index
Ask questions:
    python -m app.main ask "What is the main topic of the documents?"
Start the FastAPI server:
    uvicorn app.api:app --reload
The API will be available at http://localhost:8000
The system uses pydantic-settings for configuration. Key settings include:
- Pinecone API key and environment
- Google AI API key for LLM
- Embedding model configuration
- Chunk size and overlap settings
See app/config.py for all available options.
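For orientation, a pydantic-settings class covering the settings listed above might look roughly like the sketch below. Every field name and default value here is a placeholder chosen for illustration; app/config.py remains the authoritative source for the real names and defaults.

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Hypothetical settings class; the real field names live in app/config.py."""

    # Read values from a local .env file and ignore keys this class does not declare.
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    pinecone_api_key: str          # required, e.g. PINECONE_API_KEY in .env
    pinecone_environment: str = ""  # placeholder default
    google_api_key: str            # required, e.g. GOOGLE_API_KEY in .env
    embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"
    chunk_size: int = 500          # placeholder default, not the project's value
    chunk_overlap: int = 50        # placeholder default, not the project's value


# Values can also be passed explicitly, which is convenient in tests.
settings = Settings(pinecone_api_key="example", google_api_key="example")
```

By default pydantic-settings matches environment variable names to field names case-insensitively, so `CHUNK_SIZE=800` in the environment would override the class default.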
Run tests:
    pytest
app/
├── core/ # Core RAG pipeline components
├── infra/ # Infrastructure (embedding, storage, etc.)
├── utils/ # Utilities (logging, caching, etc.)
├── api.py # FastAPI application
├── config.py # Configuration management
├── graph.py # LangGraph orchestration
└── main.py # CLI entrypoint
tests/ # Test suite
data/ # PDF documents directory
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT License - see LICENSE file for details