This repository contains a reimplementation of the paper:
Background Matting V2: Real-Time High-Resolution Background Matting
Peter Lin, Cem Keskin, Shih-En Wei, Yaser Sheikh
arXiv preprint arXiv:2012.07810
The reimplementation introduces the following modifications:
- Datasets: Uses the P3M dataset for portrait images and the BG20K dataset for background images during base training, and the VideoMatte240K, PhotoMatte85, and Backgrounds datasets during refinement training.
- Backbone Network: Employs MobileNetV3 instead of MobileNetV2 for improved performance.
Contents:
- Overview
- Features
- Requirements
- Installation
- Dataset Preparation
- Usage
- Batch Script Example
- Project Structure
- Contributing
- License
- Acknowledgments
- Contact
- References
This project focuses on reimplementing the Background Matting V2 model with enhancements:
- Datasets: Incorporates the P3M dataset for high-quality portrait images and the BG20K dataset for diverse background images.
- Backbone Network: Upgrades the backbone network to MobileNetV3 for better efficiency and accuracy.
The implementation is adapted to run on the University of Michigan's Great Lakes High-Performance Computing Cluster.
You can test the trained model online on Hugging Face.
- High-resolution background matting.
- Real-time inference capabilities.
- Utilizes MobileNetV3 as the backbone network.
- Uses P3M and BG20K datasets for base training.
- Uses the VideoMatte240K, PhotoMatte85, and Backgrounds datasets for refinement training.
- Adapted for use on HPC clusters, specifically the UMich Great Lakes cluster.
- Supports multiple inference backends:
- PyTorch (Research)
- TorchScript (Production)
- ONNX (Experimental)
- Pretrained Models: Pretrained models are released and available on Google Drive for further research.
- Hugging Face: You can test the trained model online on Hugging Face.
In this project, I tested the trained model on footage from Background Matting and Background Matting V2 and compared the results with those of the original model. Here are some key results:
(Comparison images: original model output vs. reimplementation output, shown side by side.)
All test results and training logs are available on Google Drive.
- Python 3.8 or higher
- PyTorch 1.7 or higher
- CUDA 11.0 or higher (if using GPU acceleration)
- Additional Python packages:
torchvision
onnxruntime
numpy
opencv-python
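For reference, a minimal requirements.txt consistent with the packages listed above might look like the following; the version pins are illustrative, not taken from the repository:

```text
# Illustrative requirements.txt -- adjust versions to match your CUDA/PyTorch setup
torch>=1.7
torchvision
onnxruntime
numpy
opencv-python
```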
- Clone the Repository

  git clone https://github.com/joycexjl/BackgroundMatting.git
  cd BackgroundMatting

- Create a Virtual Environment

  python -m venv venv
  source venv/bin/activate  # On Windows, use venv\Scripts\activate

- Install Dependencies

  pip install --upgrade pip
  pip install -r requirements.txt
- Connect to the Cluster

  ssh [email protected]
- Load Modules

  module purge
  module load python/3.8.2
  module load cuda/11.0

- Create a Virtual Environment

  cd /home/your_umich_username/
  python -m venv matting_env
  source matting_env/bin/activate

- Install Dependencies

  pip install --upgrade pip
  pip install torch torchvision onnxruntime numpy opencv-python
- Download the Datasets
  - P3M Dataset: Download Link
  - BG20K Dataset: Download Link

  Place the datasets on your local machine and update data_path.py to point to the corresponding paths (an illustrative sketch of such a mapping follows this list).

- Transfer the Datasets to the Cluster

  scp -r /path/to/dataset [email protected]:/scratch/your_umich_username/dataset/

  Note: Use the /scratch directory for large datasets on the cluster.

- Verify Data Integrity

  Ensure that the data is correctly transferred and accessible on the cluster.
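As an illustration of the kind of mapping data_path.py is expected to provide, a sketch is shown below. The variable name, dataset keys, and directory layout are assumptions for illustration only; check the actual file in the repository for the exact structure.

```python
# data_path.py -- illustrative sketch only; the real file may use different keys.
# Maps each dataset name to the directories consumed by the training scripts.
DATA_PATH = {
    'p3m10k': {
        'train': {
            'fgr': '/scratch/your_umich_username/dataset/P3M-10k/train/image',
            'pha': '/scratch/your_umich_username/dataset/P3M-10k/train/alpha',
        },
        'valid': {
            'fgr': '/scratch/your_umich_username/dataset/P3M-10k/valid/image',
            'pha': '/scratch/your_umich_username/dataset/P3M-10k/valid/alpha',
        },
    },
    'bg20k': {
        'train': '/scratch/your_umich_username/dataset/BG-20k/train',
        'valid': '/scratch/your_umich_username/dataset/BG-20k/testval',
    },
}
```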
You can test the model locally using the provided scripts.
python inference_images.py \
--model-type torchscript \
--model-backbone mobilenetv3 \
--model-backbone-scale 0.25 \
--model-refine-sample-pixels 80000 \
--model-path ./models/model.pth \
--src ./images/src_image.png \
--bgr ./images/bgr_image.png \
--output ./results/output.png
For basenet training:
python train_base.py \
--dataset-name p3m10k \
--background-dataset bg20k \
--model-backbone mobilenetv3 \
--model-name mattingbase-mobilenetv3-p3m10k \
--epoch-end 50
For refinenet training:
python train_refine.py \
--dataset-name videomatte240k \
--model-backbone mobilenetv3 \
--model-name mattingrefine-mobilenetv3-videomatte240k \
--model-last-checkpoint "checkpoints/checkpoint-xx.pth" \
--background-dataset backgrounds \
--batch-size 4
To run on the cluster, first transfer the project code:

scp -r /path/to/BackgroundMatting [email protected]:/home/your_umich_username/
Create a batch script (e.g., run_matting.sh) as described in the Batch Script Example section.
Submit the job:
sbatch run_matting.sh
Below is an example of a Slurm batch script for running the model on the Great Lakes cluster.
#!/bin/bash
#SBATCH --job-name=matting_job # Job name
#SBATCH --account=your_slurm_account # Slurm account
#SBATCH --partition=standard # Partition (queue)
#SBATCH --nodes=1 # Number of nodes
#SBATCH --ntasks=1 # Number of tasks
#SBATCH --cpus-per-task=8 # Number of CPU cores per task
#SBATCH --mem=32G # Total memory per node
#SBATCH --gres=gpu:1 # Number of GPUs per node
#SBATCH --time=24:00:00 # Time limit hrs:min:sec
#SBATCH --output=matting_%j.out # Standard output and error log
# Print some info
echo "Running on host $(hostname)"
echo "Job started at $(date)"
echo "Directory is $(pwd)"
# Load modules
module purge
module load python/3.8.2
module load cuda/11.0
# Activate virtual environment
source /home/your_umich_username/matting_env/bin/activate
# Navigate to project directory
cd /home/your_umich_username/BackgroundMatting/
# Run inference
python inference_images.py \
--model-type torchscript \
--model-backbone mobilenetv3 \
--model-backbone-scale 0.25 \
--model-refine-sample-pixels 80000 \
--model-path /home/your_umich_username/BackgroundMatting/models/model.pth \
--src /scratch/your_umich_username/dataset/P3M/test/src_image.png \
--bgr /scratch/your_umich_username/dataset/BG20K/test/bgr_image.png \
--output /scratch/your_umich_username/results/output.png
echo "Job ended at $(date)"
Instructions:
- Replace your_slurm_account with your actual Slurm account name.
- Ensure that the paths to the source image, background image, and model are correct.
- Submit the script using sbatch run_matting.sh.
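Before submitting, it can also help to confirm that the cluster environment actually sees a GPU. A quick, illustrative check (run in an interactive GPU session or near the top of the batch script, after activating matting_env):

```python
# gpu_check.py -- illustrative sanity check for the cluster environment
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Report the name of the GPU allocated to the job.
    print("GPU:", torch.cuda.get_device_name(0))
```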
BackgroundMatting/
├── dataset/ # Contains datasets (P3M, BG20K)
├── doc/ # Documentation and notes
├── eval/ # Evaluation scripts and metrics
├── images/ # Sample images for testing
├── model/ # Model architecture scripts
│ ├── __init__.py
│ ├── MattingBase.py # Base matting model
│ ├── MattingRefine.py # Refined matting model
│ └── ... # Additional model files
├── .gitignore
├── LICENSE
├── README.md # Project documentation
├── data_path.py # Script to manage dataset paths
├── export_onnx.py # Script to export model to ONNX
├── export_torchscript.py # Script to export model to TorchScript
├── inference_images.py # Script for image inference
├── inference_speed_test.py # Script to test inference speed
├── inference_utils.py # Utility functions for inference
├── inference_video.py # Script for video inference
├── inference_webcam.py # Script for webcam inference
├── requirements.txt # Python dependencies
├── train_base.py # Training script for base model
├── train_refine.py # Training script for refined model
└── ... # Additional scripts and files
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the terms of the MIT license.
- Original Authors: Peter Lin, Cem Keskin, Shih-En Wei, Yaser Sheikh
- Original Repository: BackgroundMattingV2
- Datasets:
- P3M Dataset: https://paperswithcode.com/dataset/p3m-10k
- BG20K Dataset: https://paperswithcode.com/dataset/bg-20k
- University of Michigan ARC: For providing the Great Lakes cluster resources.
- External Support: I want to thank my supportive human @IMLLX for valuable discussions and help throughout the project.
For any questions or issues, please contact:
- Name: Joyce Liu
- Email: [email protected]
- Background Matting V2 Paper
- Original GitHub Repository
- UMich Great Lakes User Guide
- PyTorch Documentation
- P3M Dataset
- BG20K Dataset
- MobileNetV3 Paper
- Data Privacy: Ensure that any data used complies with data usage agreements and privacy laws.
- Resource Management: Be mindful of the resources requested when submitting jobs to the cluster to optimize scheduling and efficiency.
- Environment Modules: Use the module system on Great Lakes to manage software dependencies effectively.
- Clone the Repository

  git clone https://github.com/joycexjl/BackgroundMatting.git
  cd BackgroundMatting

- Set Up Environment

  python -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt

- Prepare Datasets
  - Download the P3M and BG20K datasets.
  - Place them in the dataset/ directory.

- Train the Model

  python train_base.py \
    --dataset-name p3m10k \
    --background-dataset bg20k \
    --model-backbone mobilenetv3 \
    --model-name mattingbase-mobilenetv3-p3m10k \
    --epoch-end 50

- Run Inference Locally

  python inference_images.py \
    --model-type torchscript \
    --model-backbone mobilenetv3 \
    --model-backbone-scale 0.25 \
    --model-refine-sample-pixels 80000 \
    --model-path ./models/model.pth \
    --src ./images/src_image.png \
    --bgr ./images/bgr_image.png \
    --output ./results/output.png

- Prepare for Cluster Execution
  - Transfer data and code to the cluster.
  - Create and submit a batch script.

- Monitor Job

  squeue -u your_umich_username
  tail -f matting_JOBID.out

- Retrieve Results

  scp [email protected]:/scratch/your_umich_username/results/output.png /local/path/to/save/
Thank you for using this reimplementation. We hope it aids in your research and projects!
Q1: Why use MobileNetV3 instead of MobileNetV2?
A1: MobileNetV3 offers improved performance and efficiency over MobileNetV2 due to architectural advancements. It achieves better accuracy with lower computational cost, making it suitable for real-time applications.
Q2: How do I choose the backbone network?
A2: In the training and inference scripts, specify the --model-backbone parameter:

--model-backbone mobilenetv3

You can replace mobilenetv3 with other supported backbones if desired.
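For intuition, the backbone swap essentially means using a MobileNetV3 feature extractor in place of MobileNetV2 inside the encoder. The snippet below is a minimal sketch using torchvision's mobilenet_v3_large; the actual encoder in this repository (under model/) may extract intermediate feature maps for skip connections and differ in detail.

```python
import torch
from torchvision.models import mobilenet_v3_large

class MobileNetV3Encoder(torch.nn.Module):
    """Illustrative encoder sketch: reuse MobileNetV3's convolutional features."""
    def __init__(self, pretrained=False):
        super().__init__()
        # Keep only the feature-extraction layers; the classifier head is unused.
        self.features = mobilenet_v3_large(pretrained=pretrained).features

    def forward(self, x):
        return self.features(x)

# Quick shape check with a dummy input.
encoder = MobileNetV3Encoder()
out = encoder(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 960, 7, 7]) for a 224x224 input
```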
Q3: Can I run inference on videos?
A3: Yes, use the inference_video.py script to perform matting on videos:
python inference_video.py \
--model-type torchscript \
--model-backbone mobilenetv3 \
--model-path ./models/model.pth \
--src-video ./videos/src_video.mp4 \
--bgr-video ./videos/bgr_video.mp4 \
--output ./results/output_video.mp4
Q4: How do I export the model to TorchScript or ONNX?
A4: Use the provided scripts:

- Export to TorchScript:

  python export_torchscript.py \
    --model-backbone mobilenetv3 \
    --model-path ./models/model.pth \
    --output-path ./models/model_scripted.pt

- Export to ONNX:

  python export_onnx.py \
    --model-backbone mobilenetv3 \
    --model-path ./models/model.pth \
    --output-path ./models/model.onnx
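After exporting, the ONNX model can be smoke-tested with ONNX Runtime. The sketch below is illustrative: the input names, shapes, and number of outputs depend on the exported graph, so inspect session.get_inputs() rather than relying on the names assumed here.

```python
import numpy as np
import onnxruntime as ort

# Load the exported model (path matches the export example above).
session = ort.InferenceSession("./models/model.onnx")

# Inspect the graph to learn the actual input names and shapes.
for inp in session.get_inputs():
    print(inp.name, inp.shape)

# Illustrative dummy run, assuming two RGB inputs named 'src' and 'bgr'.
src = np.random.rand(1, 3, 1080, 1920).astype(np.float32)
bgr = np.random.rand(1, 3, 1080, 1920).astype(np.float32)
outputs = session.run(None, {"src": src, "bgr": bgr})
print([o.shape for o in outputs])
```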