🤖 RAG AI Chatbot — Powered by Gemini & MongoDB Atlas

A full-stack Retrieval-Augmented Generation (RAG) system that lets you upload your own knowledge base and chat with it using Google Gemini. Built with LangChain, MongoDB Atlas Vector Search, and Streamlit.

Screenshots

💬 Chatbot Homepage

🗄️ MongoDB Vector Data

📊 Semantic Distance Map (Vector Visualization)

Features

Vector Store — MongoDB Atlas Vector Search stores and retrieves document embeddings
Embeddings — Powered by sentence-transformers/all-mpnet-base-v2 (768 dimensions)
LLM — Google Gemini gemini-2.5-flash for natural language responses
Vector Visualization — PCA-based 2D semantic distance map using Plotly
Framework — Built with LangChain + Streamlit

Project Structure

rag_template/
├── .streamlit/
│   └── secrets.toml        # API keys (never commit this!)
├── assets/
│   ├── chatbot_homepage.png
│   ├── mongo_db_data.png
│   └── semantic_distance_map.png
├── pages/
│   └── vector_graph.py     # Vector visualization page
├── backend.py              # Core RAG logic
├── home.py                 # Streamlit chat UI
└── requirements.txt

Prerequisites

Python 3.8+
A MongoDB Atlas cluster with Vector Search enabled
A Google AI Studio API key (Gemini)

Installation

1. Clone the repository:

git clone <your-repo-url>
cd rag_template

2. Create and activate a virtual environment:

python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate

3. Install dependencies:

pip install -r requirements.txt

Configuration

Create a .streamlit/secrets.toml file in the project root:

MONGO_URI = "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/?retryWrites=true&w=majority"
GEMINI_API_KEY = "your-gemini-api-key"

⚠️ Never commit secrets.toml to version control. Make sure your .gitignore includes the line .streamlit/secrets.toml.
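
A quick sanity check on the two secrets can catch a forgotten placeholder before the app tries to connect. The sketch below is illustrative only and is not taken from backend.py: the helper names `find_secret_problems` and `load_secrets_from_streamlit` are hypothetical, and the placeholder heuristics (angle brackets, a `your-` prefix) are assumptions based on the sample values above.

```python
REQUIRED_KEYS = ("MONGO_URI", "GEMINI_API_KEY")


def find_secret_problems(secrets):
    """Return a list of human-readable problems found in a secrets mapping."""
    problems = []
    for key in REQUIRED_KEYS:
        value = secrets.get(key, "")
        if not value:
            problems.append(f"{key} is missing")
        elif "<" in value or value.startswith("your-"):
            # Heuristic (assumption): the sample values use <...> and "your-...".
            problems.append(f"{key} still contains a placeholder")
    return problems


def load_secrets_from_streamlit():
    """Read the required keys from Streamlit's secrets store."""
    # Imported lazily so the checker above works without Streamlit installed.
    import streamlit as st

    return {key: st.secrets[key] for key in REQUIRED_KEYS if key in st.secrets}
```

Inside the app, something like `find_secret_problems(load_secrets_from_streamlit())` could be called once at startup and any problems shown to the user.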

MongoDB Atlas Setup

Create a database named vector_store_database containing a collection named embeddings_stream.
Create a Vector Search index named vector_index on that collection with this definition:

{
  "fields": [
    {
      "numDimensions": 768,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}
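
The same index can in principle be created from Python instead of the Atlas UI. The sketch below assumes a recent PyMongo release in which `SearchIndexModel` accepts a `type` argument. The helper names are hypothetical and are not part of this project.

```python
def build_vector_index_definition(path="embedding", num_dimensions=768,
                                  similarity="cosine"):
    """Build the same index definition as the JSON shown above."""
    return {
        "fields": [
            {
                "numDimensions": num_dimensions,
                "path": path,
                "similarity": similarity,
                "type": "vector",
            }
        ]
    }


def create_vector_index(collection, name="vector_index"):
    """Create the index on a pymongo Collection (needs an Atlas cluster)."""
    # Imported lazily so the definition helper works without PyMongo installed.
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(
        definition=build_vector_index_definition(),
        name=name,
        type="vectorSearch",
    )
    return collection.create_search_index(model)
```

The `numDimensions` value has to match the embedding model's output size, which is 768 for all-mpnet-base-v2.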

How It Works

User Input Text
      ↓
HuggingFace Embeddings (768-dim vectors)
      ↓
MongoDB Atlas Vector Store
      ↓
User Query → Similarity Search (top 3 docs)
      ↓
Context + Query → Gemini LLM
      ↓
Answer + Sources
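
The retrieval and prompt-assembly steps in the flow above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: in the app the similarity search runs inside MongoDB Atlas through LangChain, whereas the sketch uses a brute-force cosine comparison over an in-memory list. The function names, the document shape (`text` and `embedding` keys), and the prompt wording are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, documents, k=3):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, doc["embedding"]), doc["text"])
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query, hits):
    """Join the retrieved texts into a context block followed by the question."""
    context = "\n\n".join(text for _, text in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

With real data, `query_vector` and each `embedding` would be 768-dimensional vectors from the embedding model, and the resulting prompt would be sent to Gemini.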

Key Functions (`backend.py`)

| Function | Description |
| --- | --- |
| `get_vector_store()` | Connects to MongoDB and loads the embedding model |
| `ingest_text(text)` | Converts text to an embedding vector and stores it in MongoDB |
| `get_rag_response(query)` | Retrieves the top 3 similar docs and generates a Gemini answer |
| `get_vectors_for_visualization(query)` | Returns vectors for PCA plotting |
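
To show what the PCA step does with the vectors returned for visualization, here is a dependency-free sketch that projects vectors onto their two leading principal directions. It is not the project's implementation: the app presumably relies on a library PCA, and this version swaps in power iteration with deflation so it runs with the standard library alone. The function names are hypothetical.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _leading_direction(rows, iterations=200, seed=0):
    """Power iteration for the top eigenvector of X^T X (rows are centered)."""
    dim = len(rows[0])
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        scores = [_dot(row, v) for row in rows]
        # w = X^T (X v), computed without building the covariance matrix.
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(_dot(w, w))
        if norm == 0.0:
            return [0.0] * dim  # no variance left in the data
        v = [x / norm for x in w]
    return v


def pca_2d(vectors):
    """Project vectors onto their first two principal directions."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]

    pc1 = _leading_direction(centered, seed=0)
    xs = [_dot(row, pc1) for row in centered]

    # Deflation: remove the pc1 component, then find the next direction.
    deflated = [
        [row[j] - x * pc1[j] for j in range(dim)] for row, x in zip(centered, xs)
    ]
    pc2 = _leading_direction(deflated, seed=1)
    ys = [_dot(row, pc2) for row in centered]
    return list(zip(xs, ys))
```

The resulting (x, y) pairs are what a scatter plot such as the semantic distance map would draw. The sign of each axis is arbitrary, as with any PCA.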

Run the App

streamlit run home.py

Open http://localhost:8501 in your browser.

1. Paste your knowledge text into the sidebar and click Upload to MongoDB.
2. Ask questions in the chat input.
3. Open the Vector Visualization page to explore semantic similarity.

Tech Stack

| Tool | Purpose |
| --- | --- |
| Streamlit | Frontend UI |
| LangChain | RAG pipeline orchestration |
| MongoDB Atlas | Vector store |
| Google Gemini | Language model |
| HuggingFace | Sentence embeddings |
| Plotly + PCA | Vector visualization |

📄 License

MIT License
