ML CI/CD Pipeline using GitHub Actions

This project implements a fully automated Continuous Integration and Continuous Deployment (CI/CD) pipeline tailored for machine learning workflows. It uses GitHub Actions to automate each stage of the ML lifecycle: code validation, data integrity checks, model training, evaluation, versioning, and release generation.

Key Features

Automated CI/CD Workflow
- Runs on every push to main and on every pull request targeting main.
- Includes linting (flake8, black, ruff), testing (pytest), and model training.
End-to-End ML Automation
- Validates dataset structure and quality.
- Trains and evaluates ML models automatically.
Performance-Gated Model Promotion
- A new model is released only if its accuracy exceeds that of the previous version.
Artifact Management
- Automatically uploads the trained model and performance report as build artifacts.
Automated GitHub Releases
- Generates versioned releases with attached model files and reports.
Performance Dashboard
- Tracks historical accuracy and version data in a simple Markdown dashboard.
Containerized Environment
- Docker support ensures reproducibility across systems and CI runners.
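
The performance gate described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `src/evaluate.py`: the function names and the JSON baseline file (a single `accuracy` field) are assumptions made for the example.

```python
import json
from pathlib import Path


def should_promote(new_accuracy: float, baseline_path: Path) -> bool:
    """Return True if the new model beats the stored baseline accuracy."""
    if not baseline_path.exists():
        # Assumption: with no previous model on record, the first one is promoted.
        return True
    baseline = json.loads(baseline_path.read_text(encoding="utf-8"))
    # Strict comparison: a tie with the previous model does not trigger a release.
    return new_accuracy > baseline["accuracy"]


def record_baseline(new_accuracy: float, baseline_path: Path) -> None:
    """Persist the promoted model's accuracy as the new baseline."""
    baseline_path.parent.mkdir(parents=True, exist_ok=True)
    baseline_path.write_text(json.dumps({"accuracy": new_accuracy}), encoding="utf-8")
```

A caller would check `should_promote(...)` after evaluation and call `record_baseline(...)` only when a release is actually created, so a rejected model never lowers the bar for the next run.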

Pipeline Overview

1. Trigger

The workflow runs automatically on:

- Every push to the main branch
- Every pull request targeting main

2. Stages

| Stage | Description |
| --- | --- |
| Linting | Runs `flake8`, `black`, and `ruff` to enforce clean code. |
| Testing | Executes `pytest` unit tests to verify logic and data validation. |
| Training | Trains a model using dummy or uploaded data. |
| Evaluation | Compares accuracy against previous model performance. |
| Artifact Upload | Uploads the trained model and performance report. |
| Release Creation | Creates a GitHub release with model artifacts if accuracy improved. |
| Dashboard Update | Appends the latest accuracy and version info to `dashboard/performance.md`. |
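
As an illustration of the Dashboard Update stage, the sketch below appends one row per run to a Markdown table. It is not the actual `src/dashboard.py`: the function name, the column layout (version, accuracy, date), and the four-decimal formatting are assumptions.

```python
from datetime import date
from pathlib import Path

# Assumed table layout; the real dashboard columns may differ.
HEADER = "| Version | Accuracy | Date |\n| --- | --- | --- |\n"


def append_dashboard_row(dashboard: Path, version: str, accuracy: float, run_date: date) -> None:
    """Append one result row, creating the file with a table header if needed."""
    dashboard.parent.mkdir(parents=True, exist_ok=True)
    if not dashboard.exists():
        dashboard.write_text(HEADER, encoding="utf-8")
    row = f"| {version} | {accuracy:.4f} | {run_date.isoformat()} |\n"
    # Append mode keeps earlier rows, so the file accumulates a history.
    with dashboard.open("a", encoding="utf-8") as fh:
        fh.write(row)
```

For example, `append_dashboard_row(Path("dashboard/performance.md"), "v5", 0.95, date.today())` would add a single line to the history without touching earlier entries.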

Repository Structure

ml-cicd/
├── .github/
│   └── workflows/
│       └── ci.yml                # Main CI/CD workflow
├── src/
│   ├── data_validation.py        # Checks data integrity and schema
│   ├── train.py                  # Handles model training and saving
│   ├── evaluate.py               # Evaluates model performance and compares with previous versions
│   └── dashboard.py              # Updates Markdown performance dashboard
├── tests/
│   └── test_pipeline.py          # Unit tests for validation and core logic
├── models/                       # Stored trained models (ignored by Git)
├── reports/                      # Stores performance reports
├── dashboard/                    # Performance tracking file
├── Dockerfile                    # Reproducible container for pipeline
├── .dockerignore                 # Excludes unnecessary files from Docker image
├── requirements.txt              # Python dependencies
├── .flake8                       # Lint configuration
├── .pre-commit-config.yaml       # Pre-commit hooks for lint/test
├── README.md                     # Project documentation
└── .gitignore                    # Ignored files
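
To give a feel for what a schema and integrity check like `src/data_validation.py` might do, here is a standard-library sketch. The column names, the function name, and the choice of CSV input are all hypothetical; this README does not specify the dataset format.

```python
import csv
from pathlib import Path

# Hypothetical schema: the real column names are not given in this README.
REQUIRED_COLUMNS = ("feature_1", "feature_2", "label")


def validate_csv(path: Path, required=REQUIRED_COLUMNS) -> list[str]:
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    with path.open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        header = reader.fieldnames or []
        missing = [c for c in required if c not in header]
        if missing:
            # Without the expected columns, row-level checks are meaningless.
            return [f"missing columns: {', '.join(missing)}"]
        rows = 0
        for row_no, row in enumerate(reader, start=1):
            rows += 1
            # Short rows yield None for absent fields, so normalise to "".
            empty = [c for c in required if not (row[c] or "").strip()]
            if empty:
                problems.append(f"data row {row_no}: empty value in {', '.join(empty)}")
        if rows == 0:
            problems.append("file contains no data rows")
    return problems
```

Returning a list of messages instead of raising on the first error lets a CI log show every problem in one run; a wrapper can then exit non-zero when the list is not empty.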

Getting Started

1. Clone the Repository

git clone https://github.com/<your-username>/ml-cicd.git
cd ml-cicd

2. Create and Activate Virtual Environment

python -m venv .venv
.\.venv\Scripts\Activate.ps1   # Windows PowerShell
# macOS/Linux (bash/zsh): source .venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Run Linting and Tests Locally

flake8 src
pytest

5. Run the Pipeline via Docker

docker build -t ml-cicd-pipeline .
docker run --rm ml-cicd-pipeline

GitHub Actions Workflow

The workflow file is located at:

.github/workflows/ci.yml

It executes the entire ML pipeline in the following order:

1. Code Quality Checks
2. Unit Tests
3. Data Validation
4. Model Training
5. Model Evaluation
6. Model Version Comparison
7. Artifact Upload & Release
8. Dashboard Update
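
GitHub Actions runs the steps of a job one after another and, by default, skips the remaining steps once one fails. The same fail-fast ordering can be mimicked locally with a small runner like the sketch below. The `run_stages` helper and the idea of modelling each stage as a callable returning `True` or `False` are assumptions for illustration, not part of this repository.

```python
from typing import Callable, Sequence

# A stage is a (name, callable) pair; the callable returns True on success.
Stage = tuple[str, Callable[[], bool]]


def run_stages(stages: Sequence[Stage]) -> list[str]:
    """Run stages in order and stop at the first one that reports failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            print(f"Stage failed: {name}; skipping remaining stages")
            break
        completed.append(name)
    return completed
```

In practice each callable could wrap a command such as `flake8 src` or `pytest` and return whether its exit code was zero, so a lint failure would prevent training from running at all.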

You can view detailed logs under your repo’s Actions tab on GitHub.

Pre-Commit Hooks

Before each commit, pre-commit automatically checks:

- Formatting (black)
- Linting (flake8, ruff)
- Code style compliance

To install hooks:

pre-commit install

Run manually anytime:

pre-commit run --all-files

Example Output

When the pipeline runs successfully, you’ll see logs like:

✅ Model trained successfully.
📁 Saved model at: models/model_v1.joblib
Model evaluation metrics: {'accuracy': 0.95}
🎉 Release created: v5

Future Enhancements

- Add cloud artifact upload (AWS S3 / GCS)
- Integrate model registry (MLflow / DVC)
- Include experiment tracking and visualization
- Expand dataset handling and retraining triggers

Project Completion Summary

This project successfully delivers a fully functional ML CI/CD pipeline built with GitHub Actions, Docker, and Python. It automates the entire ML workflow, from data validation and model training to version-controlled deployment and artifact management.

Goals Achieved

- End-to-end automation for ML workflows
- Code quality enforcement through linting and testing
- Automated model evaluation, versioning, and releases
- Reproducible Dockerized environment
- Markdown dashboard tracking accuracy and version history

Outcome

This repository now serves as a complete MLOps foundation for automating and managing machine learning projects efficiently and transparently.

License

This project is licensed under the MIT License. Feel free to modify and use it for your own MLOps workflows.
