Water Potability Prediction Project

Objective:

The project aims to predict water potability based on water quality attributes using machine learning techniques.

Data Loading and Exploration:
- Loaded the dataset from Water Quality and Potability on Kaggle and explored its features.
- Checked for missing values and explored the distribution of the target variable.
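
As a minimal sketch of these exploration checks, the snippet below uses pandas on a small synthetic frame that stands in for the Kaggle CSV. The column names (`ph`, `Sulfate`, `Hardness`, `Potability`) and the 10% missing rate are assumptions for illustration, not values taken from the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Kaggle CSV; column names are assumed.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ph": rng.normal(7.0, 1.5, 200),
    "Sulfate": rng.normal(330.0, 40.0, 200),
    "Hardness": rng.normal(195.0, 30.0, 200),
    "Potability": rng.integers(0, 2, 200),
})

# Blank out 10% of the pH readings to mimic gaps in the real data.
df.loc[df.sample(frac=0.1, random_state=0).index, "ph"] = np.nan

# Count missing values per column.
missing_per_column = df.isna().sum()

# Share of each class in the target variable.
class_balance = df["Potability"].value_counts(normalize=True)

print(missing_per_column)
print(class_balance)
```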

Data Preprocessing:
- Handled missing values through imputation.
- Checked for outliers and decided to ignore them for the initial analysis.
- Split the data into training and testing sets.
- Scaled numerical features.
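
The preprocessing steps above could look roughly like the following scikit-learn sketch. Median imputation, `StandardScaler`, the 80/20 stratified split, and the synthetic arrays are all assumptions, since the README does not name the exact choices. The sketch fits the imputer and scaler on the training split only, which is one common way to keep test-set statistics out of preprocessing. The notebook may order these steps differently.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features with about 10% missing entries (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[rng.random(X.shape) < 0.1] = np.nan
y = rng.integers(0, 2, size=200)

# Hold out 20% of the rows, keeping the class ratio similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Learn imputation and scaling parameters from the training split only.
imputer = SimpleImputer(strategy="median")
scaler = StandardScaler()
X_train_prepared = scaler.fit_transform(imputer.fit_transform(X_train))

# Reuse the fitted parameters on the test split.
X_test_prepared = scaler.transform(imputer.transform(X_test))
```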

Modeling:
- Trained individual models (Random Forest and Gradient Boosting).
- Explored feature importance for model interpretation.
- Created an ensemble model using the VotingClassifier.
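
A hedged sketch of the modeling steps is shown below. It trains the two individual models, ranks features by the Random Forest's impurity-based importances, and combines both models with `VotingClassifier`. The synthetic data from `make_classification`, the hyperparameters, and the choice of soft voting are assumptions. The README does not state which settings the notebook uses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)

# Synthetic data with nine features (illustrative stand-in for the dataset).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=4, random_state=0
)

# Individual models.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
gb = GradientBoostingClassifier(n_estimators=50, random_state=0)
rf.fit(X, y)
gb.fit(X, y)

# Impurity-based importances sum to 1; sort feature indices from most
# to least important.
importances = rf.feature_importances_
ranking = importances.argsort()[::-1]

# Soft voting averages the predicted class probabilities of both models.
# VotingClassifier refits clones of the estimators it is given.
ensemble = VotingClassifier(
    estimators=[("rf", rf), ("gb", gb)], voting="soft"
)
ensemble.fit(X, y)
```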

Evaluation:
- Evaluated the models using accuracy, precision, recall, and F1-score.
- Compared the performance of individual models and the ensemble.
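
The four metrics listed above can be computed with scikit-learn as in this small sketch. The helper name `summarize` and the hand-made label arrays are hypothetical and only illustrate the calls. They are not results from the project.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)


def summarize(y_true, y_pred):
    """Return the four metrics for the positive (potable) class."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
    }


# Hand-made labels: 3 true positives, 3 true negatives,
# 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

scores = summarize(y_true, y_pred)
print(scores)
```

Calling `summarize` once per model on the same test labels gives dictionaries that can be compared side by side, for example by collecting them into a pandas DataFrame.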

Feature importance analysis identified pH level and sulfate concentration as the most influential features.
Compared with the individual models, the ensemble showed improved accuracy and precision for predicting potable water.

The dataset used in this project is sourced from Water Quality and Potability on Kaggle.

Feel free to contribute or reach out for further collaboration!
