🛡️ FraudShield: Enterprise End-to-End ML Platform

Python FastAPI MLflow Streamlit Docker SHAP Evidently

A production-grade, self-healing machine learning platform for real-time credit card fraud detection. The platform covers the entire ML lifecycle, from experiment tracking and model governance to autonomous retraining and explainable AI.

🏗️ System Architecture

graph TD
    subgraph "External Layer"
        User[Investigator / Client]
    end

    subgraph "Serving Layer (FastAPI + Gunicorn)"
        API[Secure API Endpoint]
        Val[Data Validation - Pydantic]
        API --> Val
    end

    subgraph "Intelligence Layer"
        Champion[Champion Model - RF]
        SHAP[Explainability Engine - SHAP]
        Val --> Champion
        Champion --> SHAP
    end

    subgraph "Governance & Storage"
        MLflow[MLflow Model Registry]
        Feast[Feature Store]
        Champion --- MLflow
        Champion --- Feast
    end

    subgraph "Monitoring & Maintenance"
        Monitor[Drift Monitor - Evidently]
        Heal[Self-Healing Orchestrator]
        Monitor --> Heal
        Heal --> MLflow
    end

    subgraph "UI Layer"
        Dash[Streamlit Cockpit]
        PDF[PDF Report Generator]
        Dash --> API
        Dash --> PDF
    end

    User --> API
    User --> Dash
    Monitor --- API

🚀 Key Features

  • 🧠 Explainable AI (SHAP): Every prediction comes with a transparency report showing which features (amount, time, category) influenced the risk score.
  • 🏆 Model Registry (Champion/Challenger): MLflow manages model versions and aliases. The API automatically serves the model tagged "Champion".
  • 🔄 Self-Healing Pipeline: An autonomous orchestrator detects data drift and automatically triggers retraining and model promotion.
  • 📊 Management Cockpit: A Streamlit dashboard where fraud investigators run manual checks, view SHAP visualizations, and monitor system health.
  • 📄 PDF Reports: Generate and download an analysis report for any transaction, with embedded SHAP charts, system flow diagrams, and audit details.
  • 📜 Dynamic Audit Logs: A real-time, session-based history of all analyzed transactions, giving manual investigations a traceable path.
  • 🛡️ Multi-Layer Validation: Incoming data is validated with Pydantic and Great Expectations before it reaches the model.
  • ⚡ Production Optimized: Containerized with Gunicorn and multi-worker Uvicorn for high-throughput, low-latency serving (<200 ms).
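The self-healing behavior in the list above boils down to a control loop: measure drift, and if it exceeds a threshold, retrain and promote. The sketch below illustrates that loop using only the standard library. It is not the project's `orchestrator.py` or `monitoring.py`, and it swaps Evidently for a hand-rolled Population Stability Index (PSI). The function names, the `retrain` and `promote` callbacks, and the 0.2 threshold (a common rule of thumb for PSI) are all assumptions made for illustration.

```python
import math
from typing import Callable, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between two samples of one numeric feature (stand-in for Evidently)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def histogram(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_pct = histogram(reference)
    cur_pct = histogram(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_pct, cur_pct))


def self_heal_step(
    reference: Sequence[float],
    current: Sequence[float],
    retrain: Callable[[], object],      # hypothetical: returns a new model
    promote: Callable[[object], None],  # hypothetical: makes it the Champion
    threshold: float = 0.2,             # assumed rule-of-thumb PSI cutoff
) -> bool:
    """Run one monitoring cycle; return True if a retrain was triggered."""
    if population_stability_index(reference, current) <= threshold:
        return False
    promote(retrain())
    return True
```

A real orchestrator would call something like `self_heal_step` on a schedule, passing the training data as `reference` and recent production data as `current`.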

🛠️ Technology Stack

| Component       | Technology                            |
|-----------------|---------------------------------------|
| Model Serving   | FastAPI, Gunicorn, Uvicorn            |
| ML Lifecycle    | MLflow (Tracking & Registry)          |
| Explainability  | SHAP (Shapley Additive Explanations)  |
| Monitoring      | Evidently AI (Drift & Data Quality)   |
| Management UI   | Streamlit                             |
| Data Validation | Pydantic, Great Expectations          |
| Feature Store   | Feast                                 |
| Orchestration   | Python-based Autonomous Orchestrator  |
| Infrastructure  | Docker, Docker Compose                |
| ML Core         | Scikit-learn (Random Forest)          |

📁 Project Structure

END_TO_END_ML_PLATFORM/
├── src/
│   ├── serve_validated.py    # Secured, Explainable API
│   ├── train_advanced.py     # MLflow-integrated training
│   ├── orchestrator.py       # Self-healing autonomous loop
│   ├── dashboard.py          # Streamlit Management Cockpit
│   ├── promote_model.py      # Automated Model Promotion
│   ├── data_validation.py    # Input validation rules
│   └── monitoring.py         # Drift detection logic
├── feature_repo/             # Feast Feature Store definitions
├── models/                   # Local model artifacts
├── data/                     # Training and Production datasets
├── Dockerfile                # Multi-worker production build
├── docker-compose.yml        # Full-stack orchestration
└── mlflow.db                 # Model Registry database
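To make the Champion/Challenger idea behind `promote_model.py` concrete, here is a minimal sketch of an alias-based promotion rule. It does not reproduce the project's script and does not call MLflow. `InMemoryRegistry`, `ModelVersion`, `promote_if_better`, and the choice of F1 as the comparison metric are hypothetical stand-ins for a registry that maps a "champion" alias to a model version.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ModelVersion:
    version: int
    f1_score: float  # hypothetical evaluation metric


@dataclass
class InMemoryRegistry:
    """Toy stand-in for a model registry with version aliases."""
    versions: Dict[int, ModelVersion] = field(default_factory=dict)
    aliases: Dict[str, int] = field(default_factory=dict)

    def register(self, f1_score: float) -> ModelVersion:
        model = ModelVersion(len(self.versions) + 1, f1_score)
        self.versions[model.version] = model
        return model

    def get_by_alias(self, alias: str) -> Optional[ModelVersion]:
        version = self.aliases.get(alias)
        return self.versions.get(version) if version is not None else None


def promote_if_better(
    registry: InMemoryRegistry,
    challenger: ModelVersion,
    min_gain: float = 0.0,  # assumed margin the challenger must win by
) -> bool:
    """Point the 'champion' alias at the challenger if it scores higher."""
    champion = registry.get_by_alias("champion")
    if champion is None or challenger.f1_score > champion.f1_score + min_gain:
        registry.aliases["champion"] = challenger.version
        return True
    return False
```

Because the serving side resolves the alias rather than a fixed version number, moving the alias is enough to change which model is served.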

⚡ Quick Start (Production Mode)

The entire platform can be launched as a unified stack using Docker Compose:

# 1. Clone the repository
git clone https://github.com/LEVELING2108/End-to-End-ML-Platform---Fraud-Detection.git
cd End-to-End-ML-Platform---Fraud-Detection

# 2. Launch the full stack
docker-compose up --build

Access Points:

  • Fraud API: http://localhost:8000/docs (requires the X-API-KEY header)
  • Management Dashboard: http://localhost:8501
  • MLflow Tracker: http://localhost:5000

🔒 API Security

The prediction endpoints are secured. To call the API, include the following header in every request:

X-API-KEY: fraud-detection-secret-key
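As an illustration, a client could attach the header like this. It is a sketch using only the standard library. The `/predict` path and the payload fields are assumptions, not taken from this README; the actual routes and schema are listed at http://localhost:8000/docs.

```python
import json
import urllib.request
from typing import Any, Dict


def build_prediction_request(
    base_url: str, api_key: str, transaction: Dict[str, Any]
) -> urllib.request.Request:
    """Build a POST request that carries the X-API-KEY header."""
    return urllib.request.Request(
        url=f"{base_url}/predict",  # hypothetical path; check /docs
        data=json.dumps(transaction).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `send(build_prediction_request("http://localhost:8000", "fraud-detection-secret-key", {"amount": 42.0}))` would issue the call; the `amount` field here is only a placeholder.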

📈 Performance Benchmarks

  • Throughput: 50+ transactions per second (horizontally scalable)
  • Latency: ~120 ms average per prediction (v4.0.0 with Gunicorn workers)
  • Stability: 100% success rate under a load of 50 concurrent users
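The README does not say how these figures were measured. One way to collect comparable numbers is a small concurrent load test, sketched below with the standard library. `run_load_test` and its parameters are hypothetical, and `call` is any zero-argument function that performs one prediction request and raises on failure.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def run_load_test(
    call: Callable[[], object],
    total_requests: int = 200,
    concurrency: int = 50,
) -> Dict[str, float]:
    """Fire `total_requests` calls from `concurrency` threads and summarize."""

    def timed_call(_: int) -> Tuple[bool, float]:
        start = time.perf_counter()
        try:
            call()
            ok = True
        except Exception:
            ok = False  # count any exception as a failed request
        return ok, time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))
    wall_seconds = time.perf_counter() - wall_start

    latencies = [elapsed for _, elapsed in results]
    successes = sum(1 for ok, _ in results if ok)
    return {
        "throughput_rps": total_requests / wall_seconds,
        "mean_latency_ms": statistics.mean(latencies) * 1000,
        "success_rate": successes / total_requests,
    }
```

Threads are adequate here because each call spends most of its time waiting on the network; a dedicated load-testing tool would give more reliable numbers for a formal benchmark.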

📚 Acknowledgments & References

This project was inspired by and based on the freeCodeCamp End-to-End ML Platform tutorial. It has since been expanded into a commercial-grade platform with SHAP explainability, automated model promotion, and self-healing orchestration.
