This repository contains a classification analysis project aimed at predicting employee attrition, i.e., whether an employee will leave the organization or stay. The goal is to build a classification model that predicts attrition from employee characteristics such as demographics, job satisfaction, and performance metrics.
The dataset used in this project is related to employee data and contains features such as:
- Age
- Job role
- Salary
- Years at the company
- Satisfaction levels
- Distance from home, etc.
The target variable is Attrition, which indicates whether the employee left the organization (Yes) or stayed (No).
- Age: Age of the employee.
- Attrition: Target variable indicating employee attrition.
- BusinessTravel: Frequency of business travel.
- DailyRate: Daily rate of the employee.
- DistanceFromHome: Distance of the employee’s residence from the office.
- Education: Level of education.
- EmployeeCount: Number of employees in the company.
- JobSatisfaction: Job satisfaction level of the employee.
- YearsAtCompany: Number of years the employee has worked at the company.
- Handle class imbalance using class weights.
- Normalize numerical features.
- Split the dataset into training and testing sets.
- Visualize feature relationships with the target variable (a minimal sketch of these exploratory steps follows this list).
- Identify key features influencing employee attrition.
- Detect and address outliers.
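A minimal sketch of these exploratory steps, using columns from the data dictionary above and the CSV file name used in the usage steps below (the specific plot and the IQR threshold are illustrative choices, not the project's exact analysis):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('employee_attrition.csv')

# Visualize a feature's relationship with the target
sns.boxplot(data=df, x='Attrition', y='YearsAtCompany')
plt.title('YearsAtCompany by Attrition')
plt.show()

# Rough indicator of influential features: correlation of numeric columns
# with the encoded target
numeric_df = df.select_dtypes(include='number').copy()
numeric_df['Attrition'] = df['Attrition'].map({'Yes': 1, 'No': 0})
print(numeric_df.corr()['Attrition'].sort_values(ascending=False))

# Flag potential outliers in a numeric column with the 1.5*IQR rule
q1, q3 = df['DistanceFromHome'].quantile([0.25, 0.75])
iqr = q3 - q1
mask = (df['DistanceFromHome'] < q1 - 1.5 * iqr) | (df['DistanceFromHome'] > q3 + 1.5 * iqr)
print(f'{mask.sum()} potential outliers in DistanceFromHome')
```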
- Support Vector Machine (SVM)
- Logistic Regression
- Decision Trees
- Random Forest
- Accuracy
- Precision
- Recall
- F1 score
- ROC-AUC
To run this project locally, follow these steps:
- Clone the repository:

```bash
git clone <repository_url>
```

- Navigate to the project directory:

```bash
cd Employee_Attrition
```

- Install the required libraries:

```bash
pip install pandas numpy seaborn matplotlib statsmodels scikit-learn
```

This project requires:
- Python 3.x
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- Import the dataset:

```python
import pandas as pd

df = pd.read_csv('employee_attrition.csv')
```
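Optionally, take a quick look at the data and the class balance (an illustrative check, not a required step):

```python
print(df.shape)
print(df.head())
print(df['Attrition'].value_counts())
```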
- Perform preprocessing and feature encoding:

```python
# Change target column 'Attrition' to numerical values
df['Attrition'] = df['Attrition'].map({'Yes': 1, 'No': 0})
```
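The remaining categorical columns (e.g. BusinessTravel, job role) also need numeric encoding before modelling. A minimal sketch using one-hot encoding, assuming every remaining object-typed column is categorical:

```python
# One-hot encode every remaining object-typed (categorical) column
categorical_cols = df.select_dtypes(include='object').columns
df = pd.get_dummies(df, columns=categorical_cols, drop_first=True)
```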
- Split the data into training and testing sets:

```python
from sklearn.model_selection import train_test_split

X = df.drop('Attrition', axis=1)
y = df['Attrition']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
```
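Because attrition is an imbalanced target, you may also want to pass `stratify=y` so both splits keep the same class ratio; the `test_size` below is just an illustrative choice:

```python
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```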
- Train and evaluate models (e.g., SVM):

```python
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Feature scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# SVM model with balanced class weights
svm_model = SVC(kernel='linear', probability=True, class_weight='balanced')
svm_model.fit(X_train, y_train)

# Make predictions for evaluation
y_pred = svm_model.predict(X_test)
```
- Generate evaluation metrics:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

print(f'Accuracy:  {accuracy_score(y_test, y_pred):.3f}')
print(f'Precision: {precision_score(y_test, y_pred):.3f}')
print(f'Recall:    {recall_score(y_test, y_pred):.3f}')
print(f'F1 Score:  {f1_score(y_test, y_pred):.3f}')
print(f'ROC-AUC:   {roc_auc_score(y_test, svm_model.predict_proba(X_test)[:, 1]):.3f}')
```
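The other classifiers listed above can be trained and scored the same way; a minimal sketch with default-style hyperparameters (not the project's tuned settings), reusing the scaled `X_train`/`X_test` from the SVM step:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score

# class_weight='balanced' mirrors the SVM setup above
models = {
    'Logistic Regression': LogisticRegression(max_iter=1000, class_weight='balanced'),
    'Decision Tree': DecisionTreeClassifier(class_weight='balanced', random_state=42),
    'Random Forest': RandomForestClassifier(class_weight='balanced', random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    print(f'{name}: F1 = {f1_score(y_test, y_pred):.3f}, '
          f'ROC-AUC = {roc_auc_score(y_test, y_proba):.3f}')
```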