A production-grade, self-healing machine learning platform for Real-time Credit Card Fraud Detection. This platform integrates the entire ML lifecycle, from experiment tracking and model governance to autonomous retraining and explainable AI.
graph TD
subgraph "External Layer"
User[Investigator / Client]
end
subgraph "Serving Layer (FastAPI + Gunicorn)"
API[Secure API Endpoint]
Val[Data Validation - Pydantic]
API --> Val
end
subgraph "Intelligence Layer"
Champion[Champion Model - RF]
SHAP[Explainability Engine - SHAP]
Val --> Champion
Champion --> SHAP
end
subgraph "Governance & Storage"
MLflow[MLflow Model Registry]
Feast[Feature Store]
Champion --- MLflow
Champion --- Feast
end
subgraph "Monitoring & Maintenance"
Monitor[Drift Monitor - Evidently]
Heal[Self-Healing Orchestrator]
Monitor --> Heal
Heal --> MLflow
end
subgraph "UI Layer"
Dash[Streamlit Cockpit]
PDF[PDF Report Generator]
Dash --> API
Dash --> PDF
end
User --> API
User --> Dash
Monitor --- API
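The request path in the diagram (validation, then the champion model, then the explanation engine) can be sketched in a few lines. This is a minimal illustration, not the platform's code: the real service validates with Pydantic, scores with a Random Forest, and explains with the SHAP library. Here a hypothetical linear scorer stands in for the model, because for a linear model with independent features the Shapley value of a feature is simply its weight times its deviation from the baseline. The names `Transaction`, `WEIGHTS`, `BASELINE`, `score`, `explain`, and `handle`, and the `amount`/`hour` fields, are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical linear stand-in for the champion model (the platform itself
# serves a Random Forest and explains it with the SHAP library).
WEIGHTS = {"amount": 0.001, "hour": -0.01}
BASELINE = {"amount": 100.0, "hour": 12.0}  # assumed average feature values
BIAS = 0.05


@dataclass(frozen=True)
class Transaction:
    amount: float
    hour: int

    def __post_init__(self):
        # Validation step: reject malformed input before it reaches the model.
        if self.amount < 0:
            raise ValueError("amount must be non-negative")
        if not 0 <= self.hour <= 23:
            raise ValueError("hour must be in 0..23")


def score(features):
    """Raw risk score of the linear stand-in model."""
    return BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def explain(features):
    """Per-feature contribution relative to the baseline input.

    For a linear model with independent features this equals the Shapley
    value: weight * (value - baseline value).
    """
    return {
        name: WEIGHTS[name] * (features[name] - BASELINE[name])
        for name in WEIGHTS
    }


def handle(tx):
    """Validate -> score -> explain, mirroring the diagram's request path."""
    features = {"amount": tx.amount, "hour": float(tx.hour)}
    return {"risk_score": score(features), "contributions": explain(features)}


result = handle(Transaction(amount=350.0, hour=2))
```

The contributions always sum to the difference between the transaction's score and the baseline score, which is the additivity property that makes SHAP-style reports easy to audit.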
- Explainable AI (SHAP): Every prediction comes with a detailed transparency report, showing exactly which features (amount, time, category) influenced the risk score.
- Model Registry (Champion/Challenger): Professional governance using MLflow to manage model versions and aliases. The API automatically serves the "Champion" model.
- Self-Healing Pipeline: An autonomous orchestrator that detects data drift and automatically triggers retraining and model promotion.
- Management Cockpit: A beautiful Streamlit dashboard for fraud investigators to run manual checks, view SHAP visualizations, and monitor system health.
- Professional PDF Reports: Generate and download comprehensive analysis reports for any transaction, featuring embedded SHAP charts, system flow diagrams, and audit details.
- Dynamic Audit Logs: Real-time session-based history of all analyzed transactions, providing a traceable path for manual investigations.
- Multi-Layer Validation: Incoming data is strictly validated via Pydantic and Great Expectations before reaching the model.
- Production Optimized: Containerized and served by Gunicorn managing multiple Uvicorn workers for high-throughput, low-latency serving (<200 ms).
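The self-healing and champion/challenger ideas above can be sketched as one monitoring cycle: measure drift, retrain if it is significant, and promote the new model only if it holds up. This is a sketch under stated assumptions, not the project's orchestrator. The platform detects drift with Evidently; the example swaps in a plain Population Stability Index (PSI) so that it runs with the standard library alone. The function names `psi` and `heal`, the injected `retrain` and `evaluate` callables, the evaluation gate before promotion, and the 0.2 threshold (a commonly cited rule of thumb for PSI) are all assumptions.

```python
import math


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the range is zero

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip to outer bins
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def heal(reference, current, champion, retrain, evaluate, threshold=0.2):
    """Run one monitoring cycle and return the model that should serve traffic.

    `retrain(data)` and `evaluate(model, data)` are hypothetical callables
    supplied by the caller; a higher evaluation score means a better model.
    """
    if psi(reference, current) < threshold:
        return champion  # no significant drift: keep serving the champion
    challenger = retrain(current)
    # Assumed gate: promote only if the challenger is at least as good.
    if evaluate(challenger, current) >= evaluate(champion, current):
        return challenger
    return champion


reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]
```

In the real platform, "returning the challenger" would correspond to moving the Champion alias in the MLflow registry so that the API picks up the new version.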
| Component | Technology |
|---|---|
| Model Serving | FastAPI, Gunicorn, Uvicorn |
| ML Lifecycle | MLflow (Tracking & Registry) |
| Explainability | SHAP (Shapley Additive Explanations) |
| Monitoring | Evidently AI (Drift & Data Quality) |
| Management UI | Streamlit |
| Data Validation | Pydantic, Great Expectations |
| Feature Store | Feast |
| Orchestration | Python-based Autonomous Orchestrator |
| Infrastructure | Docker, Docker Compose |
| ML Core | Scikit-learn (Random Forest) |
END_TO_END_ML_PLATFORM/
├── src/
│   ├── serve_validated.py   # Secured, Explainable API
│   ├── train_advanced.py    # MLflow-integrated training
│   ├── orchestrator.py      # Self-healing autonomous loop
│   ├── dashboard.py         # Streamlit Management Cockpit
│   ├── promote_model.py     # Automated Model Promotion
│   ├── data_validation.py   # Input validation rules
│   └── monitoring.py        # Drift detection logic
├── feature_repo/            # Feast Feature Store definitions
├── models/                  # Local model artifacts
├── data/                    # Training and Production datasets
├── Dockerfile               # Multi-worker production build
├── docker-compose.yml       # Full-stack orchestration
└── mlflow.db                # Model Registry database
The entire platform can be launched as a unified stack using Docker Compose:
# 1. Clone the repository
git clone https://github.com/LEVELING2108/End-to-End-ML-Platform---Fraud-Detection.git
cd End-to-End-ML-Platform---Fraud-Detection
# 2. Launch the full stack
docker-compose up --build

- Fraud API: http://localhost:8000/docs (requires X-API-KEY)
- Management Dashboard: http://localhost:8501
- MLflow Tracker: http://localhost:5000
The prediction endpoints are secured. To interact with the API, include the following header:
X-API-KEY: fraud-detection-secret-key
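As a rough illustration of attaching that header from a client, the sketch below builds a JSON POST request with the standard library. The `/predict` path and the payload fields (`amount`, `hour`) are hypothetical, since the actual routes and schema are listed on the `/docs` page of a running instance; `build_prediction_request` is likewise a name invented for the example.

```python
import json
import urllib.request


def build_prediction_request(base_url, api_key, transaction):
    """Build (but do not send) a JSON POST that carries the X-API-KEY header."""
    return urllib.request.Request(
        url=f"{base_url}/predict",  # hypothetical path; check /docs for the real one
        data=json.dumps(transaction).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_prediction_request(
    "http://localhost:8000",
    "fraud-detection-secret-key",
    {"amount": 350.0, "hour": 2},  # hypothetical payload fields
)

# To send it against a running stack:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the header logic easy to check without a live server.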
- Throughput: ~50+ transactions per second (horizontally scalable)
- Latency: Average ~120 ms per prediction (v4.0.0 with Gunicorn workers)
- Stability: 100% success rate under a load of 50 concurrent users.
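One way to relate the throughput and latency figures is Little's law, which says that the average number of requests in flight equals the arrival rate multiplied by the average time each request spends in the system. The sketch below applies it to the numbers quoted above. It assumes both figures describe the same steady-state workload, which the benchmarks do not state explicitly, and `in_flight_requests` is a hypothetical helper name.

```python
def in_flight_requests(throughput_per_s, latency_s):
    """Little's law: average concurrency = arrival rate * time in system."""
    return throughput_per_s * latency_s


# ~50 transactions per second at ~120 ms average latency.
busy = in_flight_requests(50, 0.120)
print(f"about {busy:.0f} requests in flight on average")
```

Under that assumption the workers handle roughly six requests at any moment, which gives a starting point for sizing the number of Gunicorn workers.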
This project was originally inspired by and based on the FreeCodeCamp End-to-End ML Platform tutorial. It has since been expanded into a commercial-grade platform with the addition of SHAP explainability, automated model promotion, and self-healing orchestration.