
HariPrasanth-K/JobMatch-AI


Resume Category Prediction System

A machine learning-based web application that automatically predicts the job category of a resume. The system uses Natural Language Processing (NLP) techniques and a trained Support Vector Machine (SVM) model to classify resumes into relevant professional domains.

This project is designed to assist recruiters and hiring platforms by automating the resume screening process and improving efficiency in candidate evaluation.


Key Features

  • Upload resumes in PDF, DOCX, or TXT formats
  • Automatic text extraction from uploaded files
  • Resume text preprocessing and cleaning using NLP techniques
  • TF-IDF vectorization for feature extraction
  • Machine learning-based classification using an SVM model
  • Instant prediction of job category
  • Option to view extracted resume text for verification
  • Interactive web interface built with Streamlit

Model Details

  • Algorithm: Support Vector Machine (SVM)
  • Feature Extraction: TF-IDF Vectorization
  • Label Encoding: Used to convert categorical labels into numerical format
  • Training Data: A labeled resume dataset, used for multi-class classification
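
The components listed above could fit together roughly as in the following minimal sketch. It is not the repository's training script: the toy texts, the labels, the choice of LinearSVC as the SVM variant, and the helper name predict_category are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import LinearSVC

# Toy training data (hypothetical; the real project uses a labeled resume dataset).
texts = [
    "python machine learning pandas statistics deep learning",
    "data analysis numpy regression models python",
    "html css javascript react frontend web design",
    "javascript node express web api frontend",
    "recruitment payroll employee relations onboarding",
    "hiring interviews employee engagement payroll",
]
labels = [
    "Data Science", "Data Science",
    "Web Development", "Web Development",
    "HR", "HR",
]

# Label encoding: map category names to integer class ids.
encoder = LabelEncoder()
y = encoder.fit_transform(labels)

# TF-IDF: turn each document into a sparse numeric feature vector.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)

# Linear SVM (assumed variant); handles multi-class via one-vs-rest.
model = LinearSVC()
model.fit(X, y)


def predict_category(text: str) -> str:
    """Vectorize one cleaned resume text and return the predicted category name."""
    features = vectorizer.transform([text])
    class_id = model.predict(features)
    return encoder.inverse_transform(class_id)[0]


print(predict_category("python statistics machine learning"))
```

At prediction time the same fitted vectorizer and encoder must be reused, so in practice all three objects would be saved together with the model (for example with pickle or joblib).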

Project Workflow

  1. User uploads a resume file (PDF, DOCX, or TXT)
  2. The system extracts raw text from the file
  3. Text is cleaned using preprocessing techniques (removal of URLs, symbols, special characters, etc.)
  4. The TF-IDF vectorizer converts the cleaned text into numerical features
  5. The trained SVM model predicts the most relevant job category
  6. The predicted category is displayed on the interface
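
As an illustration of the cleaning step (step 3), the sketch below uses Python's built-in re module. The exact patterns, the lowercasing, and the function name clean_resume_text are assumptions; the project's own cleaning rules may differ.

```python
import re


def clean_resume_text(text: str) -> str:
    """Remove URLs, symbols, and special characters, then normalize whitespace."""
    # Drop URLs (http://, https://, or www. prefixes).
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    # Replace anything that is not a letter, digit, or whitespace.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # Collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    # Lowercasing is an assumption; it keeps the vocabulary case-insensitive.
    return text.strip().lower()


sample = "Visit https://example.com/cv now! Skills: Python, SQL & C++."
print(clean_resume_text(sample))  # visit now skills python sql c
```

Note that stripping every symbol also turns "C++" into "c"; a production cleaner might whitelist such tokens before this step.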

Requirements

  • Python
  • Streamlit (Web Application Framework)
  • Scikit-learn (SVM Classifier, TF-IDF Vectorizer, Label Encoder)
  • Natural Language Processing (Text preprocessing and cleaning)
  • PyPDF2 (PDF text extraction)
  • python-docx (DOCX file parsing)
  • Regular Expressions (Text cleaning; available through Python's built-in re module)
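
The file-parsing libraries listed above could be combined into a single extraction helper along these lines. This is a hedged sketch, not the repository's code: the function name extract_text and its (file name, bytes) signature are hypothetical, and the PDF branch assumes a PyPDF2 release that provides PdfReader (2.x or later).

```python
import io
from pathlib import Path


def extract_text(file_name: str, data: bytes) -> str:
    """Return the raw text of an uploaded resume, chosen by file extension."""
    suffix = Path(file_name).suffix.lower()

    if suffix == ".pdf":
        # Imported lazily so TXT files work without PyPDF2 installed.
        from PyPDF2 import PdfReader

        reader = PdfReader(io.BytesIO(data))
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    if suffix == ".docx":
        # python-docx is imported under the module name "docx".
        from docx import Document

        document = Document(io.BytesIO(data))
        return "\n".join(paragraph.text for paragraph in document.paragraphs)

    if suffix == ".txt":
        # Ignore undecodable bytes rather than failing on odd encodings.
        return data.decode("utf-8", errors="ignore")

    raise ValueError(f"Unsupported file type: {suffix!r}")


print(extract_text("resume.txt", b"Python developer with SQL experience"))
```

In a Streamlit app, the object returned by st.file_uploader exposes a name attribute and a getvalue() method, so a call such as extract_text(uploaded.name, uploaded.getvalue()) would be one way to wire this in.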

About

The Resume Category Prediction System is a machine learning and NLP-based web application that classifies resumes into relevant job categories. It extracts and processes text from PDF, DOCX, and TXT files, applies TF-IDF vectorization, and uses an SVM model for prediction. The tool automates resume screening to make candidate evaluation more efficient.
