The Credit Card Fraud Detection System is a web application built with Python and Streamlit. It applies PCA for dimensionality reduction and two unsupervised anomaly detectors, Isolation Forest and Local Outlier Factor (LOF), to flag potentially fraudulent transactions. It also provides visualizations and evaluation metrics to help review the detected anomalies.
- Data Upload: Upload your transaction data from Excel files.
- Data Visualization: Visualize the distribution of fraud and non-fraud transactions, along with transaction amounts.
- Data Preprocessing: Standardize data using `StandardScaler` and reduce dimensions using `PCA`.
- Fraud Detection:
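  - Isolation Forest: Detect outliers with an unsupervised, tree-based model that isolates anomalous points in fewer random splits than typical ones.
  - Local Outlier Factor (LOF): Detect anomalies with a density-based approach that compares each point's local density to that of its neighbors.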
- Model Evaluation: Evaluate model performance using metrics such as classification reports and ROC AUC scores.
- Detected Frauds: Display detected fraudulent transactions with visualization.
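
The preprocessing feature above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the synthetic data, the `V1` to `V3` columns, the `preprocess` helper, and the choice of two principal components are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an uploaded file. Only Amount and Time are named in
# this README; V1..V3 are hypothetical extra feature columns.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Time": rng.uniform(0, 172_800, size=500),
    "Amount": rng.exponential(scale=80.0, size=500),
    "V1": rng.normal(size=500),
    "V2": rng.normal(size=500),
    "V3": rng.normal(size=500),
})


def preprocess(frame, n_components=2):
    """Standardize the feature columns, then project them with PCA.

    Hypothetical helper: drops the Class label if present so that it never
    leaks into the features.
    """
    features = frame.drop(columns=["Class"], errors="ignore")
    scaled = StandardScaler().fit_transform(features)  # zero mean, unit variance
    pca = PCA(n_components=n_components, random_state=0)
    return pca.fit_transform(scaled), pca


X_reduced, pca = preprocess(df)
print(X_reduced.shape)
print(pca.explained_variance_ratio_)
```

Scaling before PCA matters here because `Time` and `Amount` live on very different scales; without it, the component directions would be dominated by whichever column has the largest raw variance.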
- Backend: Python (Streamlit)
- Machine Learning: `PCA` for dimensionality reduction, `Isolation Forest` and `Local Outlier Factor` for anomaly detection
- Data Visualization: Seaborn and Matplotlib
- File Handling: Pandas
- Clone the repository: `git clone https://github.com/ms-shashank/detecting-fraudulent-transactions.git`
- Install the required packages: `pip install -r requirements.txt`
- Run the application: `streamlit run app.py`
- Upload Transaction Data: The app accepts an Excel file as input. Ensure your data contains `Amount` and `Time` columns.
- Data Preprocessing: The app scales transaction data (`Amount`, `Time`) and applies PCA for dimensionality reduction.
- Fraud Detection:
  - The Isolation Forest and LOF models are trained on the transformed data to detect anomalies.
  - Detected fraudulent transactions are displayed with a visualization of anomalies.
- Model Evaluation: If your data includes the `Class` column, model performance is evaluated and displayed with metrics like the classification report and ROC AUC score.
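
The detection step described above might look something like the sketch below. It is an illustration under stated assumptions rather than the app's implementation: the synthetic 2-D data stands in for the PCA output, and the `CONTAMINATION` value and `n_neighbors=20` are example settings, not values taken from the project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Synthetic stand-in for PCA-transformed transactions: a dense cloud of
# typical points plus a few scattered, unusual ones.
rng = np.random.default_rng(42)
typical = rng.normal(0.0, 1.0, size=(980, 2))
unusual = rng.uniform(-10.0, 10.0, size=(20, 2))
X = np.vstack([typical, unusual])

# Assumed share of anomalies; in practice this would need tuning.
CONTAMINATION = 0.02

# Isolation Forest: fit_predict returns -1 for outliers and 1 for inliers.
iso = IsolationForest(contamination=CONTAMINATION, random_state=0)
iso_labels = iso.fit_predict(X)

# LOF: also returns -1 / 1, based on each point's density relative to
# its nearest neighbors.
lof = LocalOutlierFactor(n_neighbors=20, contamination=CONTAMINATION)
lof_labels = lof.fit_predict(X)

iso_flagged = np.flatnonzero(iso_labels == -1)
lof_flagged = np.flatnonzero(lof_labels == -1)
both = np.intersect1d(iso_flagged, lof_flagged)

print(f"Isolation Forest flagged {iso_flagged.size} rows")
print(f"LOF flagged {lof_flagged.size} rows")
print(f"Flagged by both models: {both.size} rows")
```

Comparing the two sets of flagged rows is one way to prioritize review: rows flagged by both a tree-based and a density-based detector are usually stronger candidates than rows flagged by only one.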
- Upload your data: Choose an Excel file with transaction data.
- View Results:
  - Visualizations for transaction distributions and detected anomalies.
  - Detected fraudulent transactions.
- Evaluate Models: View model performance metrics if labeled data (`Class`) is available.
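
When a `Class` column is available, the evaluation step could be sketched as below. The data is synthetic, and both the label convention (`1` = fraud, `0` = legitimate) and the contamination value are assumptions for the example. Note that scikit-learn's anomaly detectors return `-1` for outliers, so the predictions are remapped before being compared with the labels.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import classification_report, roc_auc_score

# Synthetic labeled data: legitimate points near the origin, fraud-like
# points scattered on a ring far from them.
rng = np.random.default_rng(7)
legit = rng.normal(0.0, 1.0, size=(950, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, size=50)
radii = rng.uniform(6.0, 10.0, size=50)
fraud = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

X = np.vstack([legit, fraud])
y_true = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])

model = IsolationForest(contamination=0.05, random_state=0).fit(X)

# Map the detector's -1 (outlier) / 1 (inlier) output to 1 (fraud) / 0.
y_pred = (model.predict(X) == -1).astype(int)

# score_samples is lower for more abnormal points, so negate it to obtain
# a score that increases with suspected fraud, as roc_auc_score expects.
fraud_score = -model.score_samples(X)

report = classification_report(
    y_true, y_pred, target_names=["legitimate", "fraud"], zero_division=0
)
auc = roc_auc_score(y_true, fraud_score)

print(report)
print(f"ROC AUC: {auc:.3f}")
```

The ROC AUC is computed from the continuous anomaly score rather than the hard `0`/`1` predictions, so it does not depend on the contamination threshold. For LOF fitted with `fit_predict`, the analogous score is the negated `negative_outlier_factor_` attribute.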