─────────────────────────────────────────────────────

Domain-Specific AI Research Assistant

with Knowledge Graph — README

─────────────────────────────────────────────────────

🔬 What This Project Does

An end-to-end, fully local, 100% free AI research assistant that:

Ingests academic PDFs
Retrieves relevant content using Hybrid RAG (Dense + Sparse + RRF)
Generates answers using a local LLM (Ollama)
Checks every generated claim against the source text for hallucinations using NLI (bart-large-mnli)
Builds a multi-paper knowledge graph (NetworkX + Pyvis)
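
To illustrate the fusion step of Hybrid RAG, here is a minimal sketch of Reciprocal Rank Fusion (RRF) in plain Python. It is not taken from retriever.py: the function name, the chunk IDs, and the constant k=60 (a commonly used default) are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a project setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))


dense_hits = ["c3", "c1", "c7"]   # hypothetical IDs from the vector search
sparse_hits = ["c1", "c9", "c3"]  # hypothetical IDs from BM25
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Chunks that rank well in both lists rise to the top, which is why RRF needs no score normalization between the dense and sparse retrievers.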

📁 Project Structure

PROJECT/
├── app.py                  # Main Streamlit UI
├── config.py               # All configuration (models, thresholds, paths)
├── requirements.txt        # Python dependencies
├── README.md               # This file
├── src/
│   ├── pdf_processor.py    # Section-aware PDF extraction (PyMuPDF)
│   ├── chunker.py          # Semantic chunking with overlap
│   ├── retriever.py        # Hybrid RAG: ChromaDB + BM25 + RRF
│   ├── llm.py              # Ollama LLM generation
│   ├── nli_verifier.py     # Hallucination detection (bart-large-mnli)
│   ├── knowledge_graph.py  # NetworkX + Pyvis entity graph
│   └── utils.py            # File helpers, formatters
└── data/
    ├── uploads/            # Uploaded PDFs stored here
    └── chroma_db/          # ChromaDB vector store (auto-created)
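
As a rough picture of what "chunking with overlap" means, the sketch below groups sentences into chunks and repeats the last sentence of each chunk at the start of the next. It is a simplified stand-in, not the code in src/chunker.py: the function name, the sentence-splitting regex, and both parameters are assumptions.

```python
import re


def chunk_sentences(text, max_chars=200, overlap_sentences=1):
    """Group sentences into chunks of roughly max_chars characters.

    The last `overlap_sentences` sentences of a finished chunk are repeated
    at the start of the next one, so context is not cut at chunk borders.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the tail of the finished chunk forward as overlap.
            current = current[-overlap_sentences:] if overlap_sentences else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


demo = chunk_sentences("A one. B two. C three. D four.", max_chars=14)
print(demo)
```

Because of the carried-over sentence, a chunk can slightly exceed max_chars; a stricter variant would need to split long sentences as well.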

⚙️ Setup & Installation

Step 1 — Install Python dependencies

cd path\to\PROJECT
pip install -r requirements.txt

Step 2 — Install Ollama

Download from: https://ollama.com

Then pull a model:

ollama pull llama3

Step 3 — Start Ollama server

ollama serve
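
Before launching the app, it can help to confirm that the server is reachable. The sketch below assumes Ollama's default address (http://localhost:11434) and queries its /api/tags endpoint, which lists locally available models. The helper name is hypothetical and is not part of this project.

```python
import json
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server answers with JSON at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            json.load(resp)  # a healthy server returns JSON listing local models
            return True
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```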

Step 4 — Run the app

streamlit run app.py

🧠 Model Stack (All Free & Local)

Component	Model / Tool
PDF parsing	PyMuPDF
Embeddings	sentence-transformers/all-MiniLM-L6-v2
Vector DB	ChromaDB
Sparse search	rank-bm25
LLM generation	Ollama (llama3 / mistral / gemma2)
NLI verification	facebook/bart-large-mnli
Knowledge graph	NetworkX + Pyvis
UI	Streamlit

🔄 Switching LLM Models

Edit config.py:

OLLAMA_MODEL = "mistral"   # or gemma2, phi3, llama3
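
The model named here must already be pulled (for example, ollama pull mistral). To show where the setting ends up, the following sketch builds a request for Ollama's HTTP /api/generate endpoint from OLLAMA_MODEL. Whether src/llm.py uses this endpoint or a client library is not stated in this README, so both function names and the default address are assumptions.

```python
import json
import urllib.request

OLLAMA_MODEL = "mistral"  # mirrors the config.py setting shown above


def build_generate_request(prompt, model=OLLAMA_MODEL):
    """Build the JSON body for a non-streaming POST to /api/generate."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False})


def generate(prompt, base_url="http://localhost:11434", timeout=120):
    """Send the prompt to a running Ollama server and return its answer text."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=build_generate_request(prompt).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the server replies with a single JSON object.
        return json.load(resp)["response"]
```

Calling generate("Summarize this abstract: ...") needs the server from Step 3 to be running; build_generate_request works offline.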

🎨 Answer Color Codes

Color	Meaning
🟢 Green	Grounded — directly supported by source
🟡 Yellow	Inferred — partially supported
🔴 Red	Hallucinated — not found in source papers
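
One way to turn an NLI result into these three colors is to threshold the entailment probability. The sketch below shows that idea only. The function name and both cut-off values are assumptions; the project's actual thresholds live in config.py and are not listed in this README.

```python
def color_for_claim(entailment_score, grounded_min=0.75, inferred_min=0.40):
    """Map an NLI entailment probability (0..1) to a display color.

    grounded_min and inferred_min are illustrative values, not project settings.
    """
    if not 0.0 <= entailment_score <= 1.0:
        raise ValueError("entailment_score must be between 0 and 1")
    if entailment_score >= grounded_min:
        return "green"   # grounded: directly supported by the source
    if entailment_score >= inferred_min:
        return "yellow"  # inferred: partially supported
    return "red"         # hallucinated: not found in the source papers


for score in (0.92, 0.55, 0.08):
    print(score, color_for_claim(score))
```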

📝 Notes

The first run downloads the NLI model (~1.6 GB) automatically
ChromaDB persists between sessions in data/chroma_db/
Upload multiple PDFs — the graph links entities across all of them
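
The cross-paper linking can be pictured as an index from each entity to the papers that mention it. The sketch below uses plain dictionaries rather than NetworkX to stay dependency-free; the function names, file names, and lowercase normalization are assumptions, not the behavior of src/knowledge_graph.py.

```python
from collections import defaultdict


def entity_index(paper_entities):
    """Map each normalized entity to the set of papers that mention it."""
    index = defaultdict(set)
    for paper, entities in paper_entities.items():
        for entity in entities:
            # Normalizing case and whitespace lets "BERT" and "bert " merge.
            index[entity.strip().lower()].add(paper)
    return dict(index)


def shared_entities(paper_entities):
    """Return only the entities that appear in more than one paper."""
    return {e: p for e, p in entity_index(paper_entities).items() if len(p) > 1}


papers = {  # hypothetical extraction results
    "paper_a.pdf": ["BERT", "Transformer "],
    "paper_b.pdf": ["bert", "BM25"],
}
print(shared_entities(papers))
```

In a graph library, each key would become an entity node and each paper in its set an edge to that paper, so shared entities are the nodes that connect otherwise separate papers.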
