Data Cleaning & Analysis Assistant

This project automates the cleaning and analysis of messy CSV/Excel data. It identifies missing values, detects outliers, normalizes data, and generates insightful reports with visualizations.

Features

Handles missing values (fills numerical columns with the mean or median, and categorical columns with "Unknown")
Detects outliers using the Interquartile Range (IQR) method
Normalizes numerical data using Min-Max Scaling
Generates statistical reports and visualizations (Matplotlib & Seaborn)
Saves the cleaned dataset for further analysis
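
The three cleaning steps listed above can be sketched in a few lines of Pandas. This is a minimal illustration, not the notebook's actual code: the function names (`fill_missing`, `iqr_outlier_mask`, `min_max_scale`), the `strategy` parameter, the 1.5 multiplier for the IQR fences, and the sample columns are all assumptions made for the example.

```python
import pandas as pd


def fill_missing(df: pd.DataFrame, strategy: str = "median") -> pd.DataFrame:
    """Fill numeric gaps with the column mean/median, other gaps with "Unknown"."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            fill = out[col].mean() if strategy == "mean" else out[col].median()
            out[col] = out[col].fillna(fill)
        else:
            out[col] = out[col].fillna("Unknown")
    return out


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def min_max_scale(series: pd.Series) -> pd.Series:
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = series.min(), series.max()
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return series * 0.0
    return (series - lo) / (hi - lo)


# Hypothetical sample data, for illustration only.
raw = pd.DataFrame(
    {
        "name": ["Ann", None, "Cy", "Dee", "Eli"],
        "salary": [50.0, 52.0, None, 54.0, 500.0],
    }
)
cleaned = fill_missing(raw)
cleaned["salary_outlier"] = iqr_outlier_mask(cleaned["salary"])
cleaned["salary_scaled"] = min_max_scale(cleaned["salary"])
print(cleaned)
```

Flagging outliers in a separate column, rather than dropping them, keeps the decision about what to do with them visible to whoever reads the cleaned file. Whether the notebook flags, caps, or removes outliers is not stated here.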

Tech Stack

Python
Pandas (for data manipulation)
Matplotlib & Seaborn (for data visualization)
Jupyter Notebook (for interactive execution)

Installation

Clone the Repository:

git clone https://github.com/kbhavneet/data-cleaning-project.git
cd data-cleaning-project

Install Dependencies:

pip install pandas matplotlib seaborn openpyxl numpy scikit-learn

Run Jupyter Notebook:
```
jupyter notebook
```
- Open data_cleaning_final.ipynb and execute the cells.

Usage

Place your dataset (.csv or .xlsx) inside the project folder. The notebook expects the file name sample_data.csv.
Run the Jupyter Notebook to clean and process the data.
View the Insights:
- Processed data is saved as cleaned_data.csv.
- Visualization plots display trends and distributions.
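
For readers who want to reproduce the load and save steps outside the notebook, the following is a hedged sketch. The helper names `load_dataset` and `save_cleaned` are hypothetical and do not come from the notebook; only the file names sample_data.csv and cleaned_data.csv are taken from this README. Reading .xlsx files relies on the openpyxl package from the install step.

```python
from pathlib import Path

import pandas as pd


def load_dataset(path) -> pd.DataFrame:
    """Load a CSV or Excel file into a DataFrame based on its extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix == ".xlsx":
        # pandas uses openpyxl to read .xlsx workbooks.
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")


def save_cleaned(df: pd.DataFrame, path="cleaned_data.csv") -> Path:
    """Write the DataFrame to CSV without the index column."""
    path = Path(path)
    df.to_csv(path, index=False)
    return path
```

A typical call would be `df = load_dataset("sample_data.csv")`, followed by the cleaning steps and then `save_cleaned(df)`.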

Example Output

Raw Data (Before Cleaning)

Salary Distribution Visualization
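
A distribution plot of this kind can be produced with a short histogram helper. The sketch below is an assumption about how such a plot might be drawn, not the notebook's code: it uses plain Matplotlib rather than Seaborn, and the function name `plot_distribution`, the `bins` default, and the `salary` column name are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_distribution(series: pd.Series, bins: int = 20, ax=None):
    """Draw a histogram of a numeric column and return the Axes."""
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 4))
    label = series.name if series.name is not None else "value"
    # Missing values are dropped so they do not affect the bin edges.
    ax.hist(series.dropna(), bins=bins, edgecolor="black")
    ax.set_xlabel(str(label))
    ax.set_ylabel("count")
    ax.set_title(f"Distribution of {label}")
    return ax
```

In a notebook cell, `plot_distribution(df["salary"])` would render the figure inline, assuming the dataset has a numeric `salary` column.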

Cleaned Data (After Processing)
