nagham05/RAG-Powered-Resume-Screening-Assistant

RAG‑Powered Resume Screening Assistant

A Retrieval‑Augmented Generation (RAG) application that indexes resume PDFs from the local file system into a vector database and lets users ask natural‑language questions about their content through an interactive Gradio web interface.

This project demonstrates an end‑to‑end RAG pipeline combining document ingestion, embeddings, vector search (FAISS), and a Large Language Model (LLM) hosted on Hugging Face.


Project Overview

Traditional resume screening systems rely on keyword matching and rigid rules. This project uses semantic search + LLM reasoning to:

  • Understand resumes beyond keywords
  • Support conversational follow‑up questions
  • Scale to multiple resumes using vector search

The system is designed as a production‑style RAG application, not just a notebook experiment.


Architecture (High‑Level)

PDF Resume(s)
     ↓
Document Loader (PyPDF)
     ↓
Text Chunking
     ↓
Embeddings (Sentence Transformers)
     ↓
FAISS Vector Store
     ↓
Retriever (Top‑K Chunks)
     ↓
Prompt Construction
     ↓
LLM (Mistral‑7B via Hugging Face)
     ↓
Answer (Streaming via Gradio UI)
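
The last stage of the diagram, streaming the answer to the UI, can be sketched as a generator. Gradio chat components accept generator functions and re-render the message each time a new, longer string is yielded. The function name `stream_answer` and the `token_source` argument are hypothetical; in the real app the tokens would come from the LLM client, which is not shown in this README.

```python
from typing import Iterable, Iterator


def stream_answer(token_source: Iterable[str]) -> Iterator[str]:
    """Yield the answer as it grows, one token at a time.

    `token_source` is a stand-in for whatever the LLM client streams back
    (an assumption; the actual client call is not documented here).
    """
    partial = ""
    for token in token_source:
        partial += token
        # Yield the full text so far, so the UI can replace the message in place.
        yield partial


if __name__ == "__main__":
    for text in stream_answer(["The candidate ", "has Python ", "experience."]):
        print(text)
```

Yielding the accumulated text rather than individual tokens keeps the UI callback stateless: each update fully replaces the previous one.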

Features

  • PDF Loading & Indexing (from local files)
  • Semantic Resume Search (FAISS)
  • Conversational Q&A (Chat Interface)
  • Context‑Aware Responses (RAG)
  • Streaming LLM Responses
  • Notebook for Experimentation + Python App for Production

🖥️ User Interface Preview

The screenshot shows the running application displaying the resume screening analysis after job requirements have been provided. The interface allows users to enter job criteria, analyze multiple resumes, and view detailed, explainable decisions generated by the RAG pipeline.



📁 Repository Structure

RAG-Powered-Resume-Screening-Assistant/
│
├── rag-testing.ipynb        # Notebook: experimentation, testing, and pipeline validation
├── app.py                   # Gradio web application (contains full RAG logic)
├── vectorstore/             # Saved FAISS index (generated at runtime)
│   ├── index.faiss
│   └── index.pkl
├── requirements.txt         # Python dependencies
└── README.md                # Project documentation

Core Components Explained

1. Notebook: RAG Prototyping (rag-testing.ipynb)

The notebook is used for research, experimentation, and validation of the RAG pipeline before integrating it into the application.

It includes:

  • Loading and parsing resume PDFs
  • Chunking text with overlap
  • Generating embeddings
  • Building and querying a FAISS vector store
  • Testing prompts and model responses

This notebook serves as the development and learning environment for the project.
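
As an illustration of the "chunking text with overlap" step, here is a minimal character-based sketch. It is not the notebook's actual code: the function name `chunk_text` and the default sizes are assumptions, and the project may well use a LangChain text splitter instead.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters.

    Hypothetical helper for illustration; sizes are arbitrary defaults.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which tends to help retrieval.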


2. Application Logic (app.py)

All production logic is consolidated into a single Python file for simplicity and clarity.

Responsibilities handled inside app.py:

  • Loading resume PDFs from the local file system
  • Text chunking and embedding generation
  • FAISS vector store creation and loading
  • Semantic retrieval (top‑K relevant chunks)
  • Prompt construction using retrieved context
  • LLM invocation via Hugging Face Endpoint
  • Streaming responses to the UI

This design reflects a self‑contained RAG application, suitable for demos and portfolio projects.
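
The retrieval and prompt-construction steps listed above can be sketched without any external libraries. This sketch swaps FAISS for a brute-force cosine-similarity search over toy vectors, and the names `top_k_chunks` and `build_prompt`, as well as the prompt wording, are hypothetical rather than taken from app.py.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose vectors are most similar to the query vector.

    Brute-force stand-in for a FAISS index lookup (assumption for illustration).
    """
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the retrieved context."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the resume excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    chunks = ["Python developer", "Pastry chef", "Data engineer"]
    vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # toy embeddings
    best = top_k_chunks([1.0, 0.0], vectors, chunks, k=2)
    print(build_prompt("Who knows Python?", best))
```

In the real pipeline the vectors would come from a Sentence Transformers model and the search from the saved FAISS index; only the shape of the flow is the same.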

Example Use Cases

  • Resume screening & analysis
  • Candidate shortlisting
  • HR knowledge assistants
  • Internal knowledge bases

Why RAG (Retrieval‑Augmented Generation)?

  • Reduces hallucination
  • Grounds answers in real documents
  • Scales to large document collections
  • Widely used architecture for LLM apps

Tech Stack

  • Python 3.10+
  • LangChain
  • FAISS
  • Sentence Transformers
  • Hugging Face Inference API
  • Gradio
  • PyPDF

⚠️ Important Usage Note (Local Resume Upload)

This project currently does not support uploading resume PDFs directly through the web interface.

  • Resume PDFs must be available locally on the machine running the application.
  • The application indexes resumes from the local file system as part of the RAG pipeline.
  • This design choice keeps the project simple and focused on demonstrating core RAG concepts rather than file-storage infrastructure.
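
Collecting the local resume PDFs before indexing might look like the sketch below. The helper name `find_resume_pdfs` and the example directory `resumes/` are assumptions; the README does not say where app.py expects the files to live.

```python
from pathlib import Path


def find_resume_pdfs(directory: str) -> list[Path]:
    """Return the PDF files directly inside `directory`, sorted by path.

    Hypothetical helper; the directory layout is an assumption.
    """
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"Resume directory not found: {root}")
    # Match the extension case-insensitively so "CV.PDF" is also picked up.
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    try:
        for pdf_path in find_resume_pdfs("resumes"):  # "resumes" is a placeholder
            print(pdf_path)
    except FileNotFoundError as err:
        print(err)
```

Each returned path could then be handed to the PDF loader and on to chunking and embedding.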

🔧 Future improvement: Add web-based resume upload and persistent storage for fully remote usage.


Limitations

  • Uses a remote LLM (internet connection required)
  • The vector store is saved locally (single machine)
  • No user authentication (single session)

License

This project is for educational and portfolio purposes.


Acknowledgments

  • LangChain
  • Hugging Face
  • FAISS
  • Gradio
