Project Overview The Breast Cancer Analysis application is designed to predict whether a tumor is benign or malignant based on input features derived from breast cancer pathology data. The app employs a K-Nearest Neighbors (KNN) classifier, a popular machine learning algorithm, to make predictions based on user input.
Project Components
- Data Loading The app begins by loading the dataset containing breast cancer features. A CSV file (e.g., breast_cancer_data.csv) is used to store the data, which includes attributes like clump thickness, uniformity of cell size, and more.
- Data Preprocessing
- Missing Value Handling: The application employs SimpleImputer to handle missing values, using the mean of the columns for imputation.
- Feature and Target Separation: The features (X) and target variable (y) are separated, with the target variable indicating whether the tumor is benign or malignant.
- Train/Test Split: The data is split into training and testing sets using an 80-20 split ratio.
- Data Scaling: The features are standardized using StandardScaler to ensure that all input features contribute equally to the distance calculations in the KNN algorithm.
- Model Training
- A KNN classifier is instantiated and trained on the scaled training data. The chosen value of K is set to 7, but this can be modified based on cross-validation results.
- User Interaction
- The application features an interactive interface where users can input values for each of the cancer attributes using sliders.
- The attributes include:
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
- Marginal Adhesion
- Single Epithelial Cell Size
- Bare Nuclei
- Bland Chromatin
- Normal Nucleoli
- Mitoses
- Prediction
- When the user clicks the "Predict" button, the app:
- Takes the input data, scales it using the same scaler as the training data, and predicts the class using the KNN model.
- Displays the prediction results indicating whether the tumor is classified as "Benign" or "Malignant."
- When the user clicks the "Predict" button, the app:
- Results Interpretation
- The app explains the meaning of the predicted classes (Benign and Malignant) to the user, emphasizing the significance of medical intervention in case of a malignant prediction.
- Visual Styling
- The app includes CSS styling to enhance the user interface, including custom backgrounds, button styles, and text colors for improved readability.
- Project Structure
- The project typically consists of the following files:
- Breast_Cancer_Analysis_App.py: The main application script containing all the code for loading data, preprocessing, model training, and user interface.
- breast_cancer_data.csv: The dataset used for training and testing the model.
- The project typically consists of the following files:
- Conclusion
- The Breast Cancer Analysis application serves as a useful tool for educational purposes, allowing users to understand the classification of tumors based on critical features while also providing insight into the significance of benign and malignant classifications in the medical field.
- This project can also be extended with additional features, such as visualizations of the data distribution, model performance metrics, or alternative machine learning algorithms for comparison.
App Explore https://breast-cancer-analysis-app.streamlit.app/