A Retrieval‑Augmented Generation (RAG) application that indexes resumes (PDFs) into a vector database and lets users ask natural‑language questions about their content through an interactive Gradio web interface.
This project demonstrates an end‑to‑end RAG pipeline combining document ingestion, embeddings, vector search (FAISS), and a Large Language Model (LLM) hosted on Hugging Face.
Traditional resume screening systems rely on keyword matching and rigid rules. This project uses semantic search + LLM reasoning to:
- Understand resumes beyond keywords
- Support conversational follow‑up questions
- Scale to multiple resumes using vector search
The system is designed as a production‑style RAG application, not just a notebook experiment.
```
PDF Resume(s)
      ↓
Document Loader (PyPDF)
      ↓
Text Chunking
      ↓
Embeddings (Sentence Transformers)
      ↓
FAISS Vector Store
      ↓
Retriever (Top‑K Chunks)
      ↓
Prompt Construction
      ↓
LLM (Mistral‑7B via Hugging Face)
      ↓
Answer (Streaming via Gradio UI)
```
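The retrieve-then-prompt core of this pipeline can be sketched in plain Python. This is a minimal, dependency-free illustration: the toy three-dimensional vectors stand in for Sentence Transformer embeddings, the brute-force cosine search stands in for FAISS, and the prompt template is illustrative rather than the one the app actually uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, index, k=2):
    """Return the texts of the top-k chunks most similar to the query vector."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

def build_prompt(question, chunks):
    """Assemble retrieved chunks into a grounded prompt for the LLM."""
    context = "\n---\n".join(chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

# Toy "vector store": (chunk_text, embedding) pairs
index = [
    ("5 years of Python experience", [0.9, 0.1, 0.0]),
    ("Led a team of 4 engineers", [0.1, 0.8, 0.2]),
    ("BSc in Computer Science", [0.2, 0.1, 0.9]),
]

chunks = retrieve([1.0, 0.0, 0.1], index, k=2)
prompt = build_prompt("Does the candidate know Python?", chunks)
```

In the real application, FAISS performs this nearest-neighbor search over high-dimensional embeddings, and the resulting prompt is sent to the hosted Mistral‑7B endpoint.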
- PDF Loading & Indexing
- Semantic Resume Search (FAISS)
- Conversational Q&A (Chat Interface)
- Context‑Aware Responses (RAG)
- Streaming LLM Responses
- Notebook for Experimentation + Python App for Production
Below is a screenshot of the running application showing the resume screening analysis output after providing job requirements. The interface allows users to enter job criteria, analyze multiple resumes, and view detailed, explainable decisions generated by the RAG pipeline.
```
RAG-Powered-Resume-Screening-Assistant/
│
├── rag-testing.ipynb    # Notebook: experimentation, testing, and pipeline validation
├── app.py               # Gradio web application (contains full RAG logic)
├── vectorstore/         # Saved FAISS index (generated at runtime)
│   ├── index.faiss
│   └── index.pkl
├── requirements.txt     # Python dependencies
└── README.md            # Project documentation
```
The notebook is used for research, experimentation, and validation of the RAG pipeline before integrating it into the application.
It includes:
- Loading and parsing resume PDFs
- Chunking text with overlap
- Generating embeddings
- Building and querying a FAISS vector store
- Testing prompts and model responses
This notebook serves as the development and learning environment for the project.
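As an example, the chunking-with-overlap step explored in the notebook can be sketched in plain Python. The 200-character chunks and 50-character overlap below are illustrative defaults, not necessarily the notebook's actual settings:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks, with neighbors sharing `overlap` chars."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

# A 500-character sample produces three 200-char chunks with 50-char overlaps
text = "".join(str(i % 10) for i in range(500))
chunks = chunk_text(text)
```

Overlap ensures that a sentence falling on a chunk boundary still appears whole in at least one chunk, which improves retrieval recall.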
All production logic is consolidated into a single Python file for simplicity and clarity.
Responsibilities handled inside app.py:
- PDF loading and parsing
- Text chunking and embedding generation
- FAISS vector store creation and loading
- Semantic retrieval (top‑K relevant chunks)
- Prompt construction using retrieved context
- LLM invocation via Hugging Face Endpoint
- Streaming responses to the UI
This design reflects a self‑contained RAG application, suitable for demos and portfolio projects.
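The streaming step above follows the generator pattern Gradio chat handlers consume: the handler yields progressively longer partial answers, and the UI re-renders on each yield. Here is a hedged sketch with a stub standing in for the Hugging Face token stream (function names and the stub output are illustrative, not taken from app.py):

```python
def fake_llm_stream(prompt):
    """Stub standing in for the Hugging Face endpoint's token stream."""
    for token in ["The", " candidate", " has", " Python", " experience."]:
        yield token

def answer_stream(question, retrieved_chunks):
    """Yield the partial answer so far, as a streaming Gradio handler would."""
    context = "\n".join(retrieved_chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    partial = ""
    for token in fake_llm_stream(prompt):
        partial += token
        yield partial  # the UI redraws the chat bubble with each yield

parts = list(answer_stream("Does the candidate know Python?",
                           ["5 years of Python experience"]))
```

Swapping `fake_llm_stream` for a real streaming endpoint call leaves the handler unchanged, which is what keeps the streaming logic self-contained.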
- Resume screening & analysis
- Candidate shortlisting
- HR knowledge assistants
- Internal knowledge bases
- Reduces hallucinations
- Grounds answers in real documents
- Scales to large document collections
- Industry‑standard architecture for LLM apps
- Python 3.10+
- LangChain
- FAISS
- Sentence Transformers
- Hugging Face Inference API
- Gradio
- PyPDF
This project currently does not support uploading resume PDFs directly through the web interface.
- Resume PDFs must be available locally on the machine running the application.
- The application indexes resumes from the local file system as part of the RAG pipeline.
- This design choice keeps the project simple and focused on demonstrating core RAG concepts rather than file-storage infrastructure.
🔧 Future improvement: Add web-based resume upload and persistent storage for fully remote usage.
- Uses remote LLM (internet required)
- Vectorstore stored locally (single‑machine)
- No user authentication (single session)
This project is for educational and portfolio purposes.
- LangChain
- Hugging Face
- FAISS
- Gradio