diff --git a/README.md b/README.md
index 03e2c71..9d18cf6 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,7 @@
Official repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is specifically designed for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory. RVM can perform matting in real-time on any videos without additional inputs. It achieves **4K 76FPS** and **HD 104FPS** on an Nvidia GTX 1080 Ti GPU. The project was developed at [ByteDance Inc.](https://www.bytedance.com/)
+
## News
@@ -34,7 +35,8 @@ All footage in the video are available in [Google Drive](https://drive.google.co
## Demo
* [Webcam Demo](https://peterl1n.github.io/RobustVideoMatting/#/demo): Run the model live in your browser. Visualize recurrent states.
* [Colab Demo](https://colab.research.google.com/drive/10z-pNKRnVNsp0Lq9tH1J_XPZ7CBC_uHm?usp=sharing): Test our model on your own videos with free GPU.
+* [Replicate Demo](https://replicate.com/arielreplicate/robust_video_matting): Test our model through the Replicate web UI or Python API.
 
## Download
diff --git a/cog.yaml b/cog.yaml
new file mode 100644
index 0000000..e7a4f3f
--- /dev/null
+++ b/cog.yaml
@@ -0,0 +1,14 @@
+build:
+ gpu: true
+  python_version: "3.8"
+ system_packages:
+ - libgl1-mesa-glx
+ - libglib2.0-0
+ python_packages:
+ - torch==1.9.0
+ - torchvision==0.10.0
+ - av==8.0.3
+ - tqdm==4.61.1
+ - pims==0.5
+
+predict: "predict.py:Predictor"
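
With `cog.yaml` in place, the predictor can be smoke-tested locally before pushing to Replicate, e.g. `cog predict -i input_video=@input.mp4 -i output_type=green-screen` (assuming the Cog CLI is installed, `input.mp4` stands in for your own file, and the `rvm_resnet50.pth` weights that `predict.py` loads are present in the repo root).
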
diff --git a/predict.py b/predict.py
new file mode 100644
index 0000000..d7a9707
--- /dev/null
+++ b/predict.py
@@ -0,0 +1,31 @@
+import torch
+from model import MattingNetwork
+from inference import convert_video
+
+from cog import BasePredictor, Path, Input
+
+
+class Predictor(BasePredictor):
+    def setup(self):
+        # Load the ResNet-50 variant of RVM and move it to the GPU.
+        self.model = MattingNetwork('resnet50').eval().cuda()
+        self.model.load_state_dict(torch.load('rvm_resnet50.pth'))
+
+ def predict(
+ self,
+ input_video: Path = Input(description="Video to segment."),
+        output_type: str = Input(default="green-screen", choices=["green-screen", "alpha-mask", "foreground-mask"]),
+    ) -> Path:
+        # A single pass renders all three outputs; the requested file is returned below.
+        convert_video(
+            self.model,                               # The model; can be on any device (cpu or cuda).
+            input_source=str(input_video),            # A video file or an image sequence directory.
+            output_type='video',                      # Choose "video" or "png_sequence".
+            output_composition='green-screen.mp4',    # File path if video; directory path if png sequence.
+            output_alpha='alpha-mask.mp4',            # [Optional] Output the raw alpha prediction.
+            output_foreground='foreground-mask.mp4',  # [Optional] Output the raw foreground prediction.
+            output_video_mbps=4,                      # Output video bitrate in Mbps. Not needed for png sequence.
+            downsample_ratio=None,                    # Downsampling hyperparameter; None selects it automatically.
+            seq_chunk=12,                             # Process n frames at once for better parallelism.
+        )
+        return Path(f'{output_type}.mp4')