This repository houses the computer vision and machine learning backend for the Badminton Match AI Analyser. It is a containerised Flask application designed to process high-speed sports footage, extracting player movements, shuttlecock trajectories, and classifying shots using a multi-stage deep learning pipeline.
This backend is designed for concurrent execution to minimise processing time. When a video is received, the pipeline splits into parallel threads before converging for sequential analysis.
-
π₯ Player Tracking (Parallel):
- Model: YOLOv11 (Detection) + StrongSORT (Tracking).
- Function: Detects players in every frame and assigns unique IDs (ReID) to track them consistently throughout the rally, handling occlusions and crossovers.
- Output:
tracks.csvcontaining bounding boxes and IDs.
-
πΈ Shuttlecock Tracking (Parallel):
- Model: TrackNetV3.
- Function: A specialised deep learning model designed to track small, fast-moving objects (the shuttlecock) against complex backgrounds.
- Output:
ball.csvcontaining X, Y coordinates and visibility confidence.
-
π₯ Contact Point Detection (Sequential):
- Model: Custom 1D CNN.
- Function: Analyses the shuttlecock's trajectory (velocity, acceleration, and directional changes) to identify the exact frames where a racket strikes the shuttle.
- Output: A list of "hit frames" (timestamps).
-
π¬ Action Recognition (Sequential):
- Model: SlowFast (ResNet-101 backbone).
- Function: For every detected contact point, this model examines a 2-second temporal window. It uses a "Slow" pathway for spatial detail and a "Fast" pathway for motion context to classify the shot (e.g., Smash, Drop, Clear, Lift).
- Output:
events.jsoncontaining shot labels, timestamps, and confidence scores.
-
π¨ Overlay & Visualisation:
- Tools: OpenCV & FFmpeg.
- Function: Merges all data sources to render a final MP4 video with bounding boxes, shuttle trails, and shot labels overlaid for user analysis.
- Examples: Several examples of the final output can be found in the 'example_output' folder.
The codebase has been refactored for modularity and scalability:
mds06-backend/
βββ app/
β βββ api/ # Flask Route definitions (/health, /process)
β βββ core/ # Pipeline orchestrator & Singleton Model Loader
β βββ services/ # Logic for each pipeline step (tracking, shuttle, action)
β βββ utils/ # Helpers for Video, GCS, and Geometry
β βββ config.py # Centralised configuration & Env vars
β βββ schemas.py # Data classes for strict typing
βββ models/ # Directory for weights (populated at build time)
βββ TrackNetV3/ # Submodule for TrackNet logic
βββ Dockerfile # Multi-stage build definition
βββ main.py # Application entry point (Gunicorn target)
βββ requirements.txt # Python dependencies
We utilise a suite of State-of-the-Art (SOTA) models, trained and fine-tuned using our own data.
| Task | Model | Params | Description | Training |
|---|---|---|---|---|
| Player Detection | YOLOv11s | 9.4 million | Lightweight, real-time object detection tuned for humans in sports courts. | Fine-tuned on 685 high frame rate and standard frame rate images. |
| Player Tracking | osnet_x1_0 | 2.2 million | Uses OSNet for Re-Identification (ReID) using StrongSort algorithm to keep track of players even when they cross paths. | Fine-tuned on 26 high frame rate and standard frame rate clips (with an average of 700 frames each). |
| Shuttle Tracking | TrackNetV3 | NA | A segmentation-based network capable of tracking the shuttlecock even with motion blur. | Used out-of-the-box; peak performance. |
| Action Classification | SlowFast r101 | 62.83 million | A two-stream CNN architecture that captures the high-speed motion of badminton strokes better than standard 3D CNNs. | Fine-tuned on a total of 4621 clips across 13 classes (high and standard frame rate). |
| Contact Detection | 1D CNN | 106,626 | A custom lightweight heuristic model trained on trajectory data to identify contact frames and filter false positives. | Trained from scratch on 32 rallies, 585 shots. |
Note on Storage: To keep the repository light, large model weights (
.pt,.pth) are not stored in Git. They are downloaded automatically from GitHub Releases during the Docker build process.
This project is divided into three main repositories to handle the frontend, backend logic, and model development separately:
- Frontend: https://github.com/bryanlje/mds06-frontend
- Backend API (This Repo): https://github.com/bryanlje/mds06-backend
- Model Training & Data Preparation: https://github.com/bryanlje/mds06-ml
- Docker Desktop installed.
- GPU (Optional but recommended). If running on CPU, inference will be significantly slower.
git clone [https://github.com/bryanlje/mds06-backend.git](https://github.com/bryanlje/mds06-backend.git)
cd mds06-backend
This step includes downloading the model weights (~500MB+), so it may take a few minutes.
docker build -t badminton-analyser .
docker run -p 8080:8080 \
-e PORT=8080 \
-e WORKERS=1 \
badminton-analyser
You can send a test request using curl:
curl -X POST http://localhost:8080/process \
-F "video=@/path/to/your/test_match.mp4" \
-F "batch_size=16"
This container is optimised for Cloud Run with GPU support (if available) or high-memory CPU instances.
- Authenticate & Config:
gcloud auth login
gcloud config set project [YOUR_PROJECT_ID]
- Submit Build to Container Registry:
gcloud builds submit --tag gcr.io/[YOUR_PROJECT_ID]/badminton-analyser
- Deploy to Cloud Run:
gcloud run deploy badminton-backend \
--image gcr.io/[YOUR_PROJECT_ID]/badminton-analyser \
--platform managed \
--region us-central1 \
--memory 8Gi \
--cpu 4 \
--timeout 900 # Analysis takes time!
Ensures the container is running and models are loaded into memory.
- Endpoint:
GET /api/health - Response:
{
"status": "healthy",
"device": "cuda",
"models_loaded": { "yolo": true, "slowfast": true, ... }
}
Triggers the full analysis pipeline.
-
Endpoint:
POST /api/process -
Content-Type:
multipart/form-data -
Parameters:
-
video: The video file (binary). -
gcs_uri(Optional): Alternatively, provide ags://link to a video file. -
output_bucket(Optional): A GCS bucket name to automatically upload results to. -
job_id: Unique identifier for the job (for tracking). -
user_id: Unique identifier for the user. -
Response:
{
"job": { "job_id": "123", "user_id": "user_1" },
"events": [
{ "track_id": 1, "label": "Smash", "t0": 145, "t1": 155, "p": 0.98 }
],
"outputs": {
"tracks_csv": "gs://bucket/outputs/.../tracks.csv",
"overlay_video": "gs://bucket/outputs/.../overlay.mp4"
},
"timing": { "total_time": 145.5, "time_saved": 55.0 }
}
This backend is designed to work statelessly with a React frontend:
- Upload: User drags a video to the frontend.
- Request: Frontend gets a Signed URL (if using GCS directly) or streams the file to this backend.
- Processing: The backend runs the pipeline. This can take 1-3 minutes depending on video length.
- Display: The backend returns the
eventsJSON array immediately. The frontend uses this to generate the interactive statistics dashboard and timeline while the overlay video loads asynchronously.
This project was developed by Team MDS06:
- Bryan Leong Jing Ern
- Phua Yee Yen
- Ting Shu Hui
- Lee Jian Jun Thomas
Email - 2025mds06@gmail.com
Project Link: https://github.com/bryanlje/mds06-frontend