An AI-powered document search and conversational question-answering system that enables users to upload documents and interact with them through a chat-based interface. The system uses semantic search and retrieval-augmented generation (RAG) to ensure accurate, context-aware responses strictly based on the uploaded content.
- Upload and process multiple PDF and TXT documents
- Semantic search using vector embeddings
- Conversational question-answering interface
- Answers restricted strictly to document context
- OCR fallback for scanned or image-based PDFs
- Modern dark UI with smooth hover effects
- Fast and efficient vector search using ChromaDB
- Modular and scalable architecture
- Frontend: Streamlit
- LLM: Meta LLaMA 3.2 Instruct (Hugging Face)
- Embeddings: sentence-transformers/all-MiniLM-L6-v2
- Vector Database: ChromaDB
- Framework: LangChain
- PDF Processing: PyPDF2, Unstructured
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name

- User uploads PDF or TXT documents
- Text is extracted using PyPDF2 (with OCR fallback for scanned PDFs)
- Text is split into smaller chunks
- Vector embeddings are generated using MiniLM
- Embeddings are stored in ChromaDB
- Relevant chunks are retrieved for each user query
- The LLaMA model generates a response strictly from the retrieved context
- Upload one or more documents
- Click Process Documents
- Ask questions using the chat input
- Receive accurate answers based only on document content
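The chunking, embedding, storage, and retrieval steps above can be sketched in plain Python. This is a minimal illustration only, not the app's actual code: `toy_embed` is a feature-hashing stand-in for the `all-MiniLM-L6-v2` model, and `ToyVectorStore` is an in-memory stand-in for ChromaDB, but the cosine-similarity ranking is the same idea the real vector search uses.

```python
import hashlib
import math

def chunk_text(text, size=200, overlap=40):
    """Split text into overlapping chunks (the splitting step above)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def toy_embed(text, dims=256):
    """Stand-in for the MiniLM embedder: hash words into a unit vector."""
    vec = [0.0] * dims
    for word in text.lower().split():
        word = word.strip(".,?!:;")
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity; vectors are already unit-normalized."""
    return sum(x * y for x, y in zip(a, b))

class ToyVectorStore:
    """Stand-in for ChromaDB: keeps (chunk, embedding) pairs in memory."""
    def __init__(self):
        self.items = []

    def add(self, chunks):
        for c in chunks:
            self.items.append((c, toy_embed(c)))

    def query(self, question, k=2):
        """Return the k chunks most similar to the query embedding."""
        q = toy_embed(question)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

store = ToyVectorStore()
store.add(chunk_text("ChromaDB stores the vector embeddings for every document chunk."))
store.add(chunk_text("Streamlit renders the chat interface in the browser."))
top = store.query("which component stores the vector embeddings", k=1)
```

In the deployed app, these roles are filled by the real components: MiniLM produces the embeddings, ChromaDB persists and searches them, and LangChain wires retrieval into the LLM call.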
If the answer is not found in the documents, the system responds with:
"I don't know"
- Push the project to GitHub
- Go to https://streamlit.io/cloud
- Connect your GitHub repository
- Add HF_TOKEN under Secrets
- Deploy the application
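On Streamlit Cloud, values added under Secrets are exposed to the app through `st.secrets`. A small hedged sketch (the helper name is illustrative), with an environment-variable fallback so the same code also works in local runs:

```python
import os

def get_hf_token():
    """Read the Hugging Face token from Streamlit secrets when deployed,
    falling back to the HF_TOKEN environment variable locally."""
    try:
        import streamlit as st  # only usable inside a Streamlit app
        if "HF_TOKEN" in st.secrets:
            return st.secrets["HF_TOKEN"]
    except Exception:
        pass  # streamlit not installed, or no secrets file configured
    return os.environ.get("HF_TOKEN")
```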
- Academic research and study assistance
- Knowledge-base chatbots
- Document and report analysis
- Legal and policy document exploration
- Resume and portfolio document querying