This repository presents the Splatter Image framework, an ultra-fast approach for single-view 3D object reconstruction that operates at 38 FPS. It builds on Gaussian Splatting, a method that has proven successful in multi-view reconstruction thanks to real-time rendering, fast training, and good scalability. Our research extends this method to monocular reconstruction by incorporating additional depth information into the model during training.
Concretely, the Splatter Image framework modifies the UNet architecture to take depth channels alongside RGB, with the goal of improving 3D reconstruction quality as measured by PSNR, SSIM, and LPIPS across multiple datasets.
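To make the depth integration concrete, here is a minimal PyTorch sketch, not the repository's actual network code: a UNet-style first convolution widened from 3 RGB channels to 4 RGB+D channels, with the depth map concatenated as an extra input channel. `TinyEncoder` is a hypothetical stand-in for the real UNet encoder.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the repository's actual network code): a UNet-style
# first convolution widened from 3 RGB channels to 4 RGB+D channels.
# TinyEncoder is a hypothetical stand-in for the real UNet encoder.
class TinyEncoder(nn.Module):
    def __init__(self, in_channels: int = 4):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 64, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.conv1(x))

rgb = torch.rand(1, 3, 128, 128)       # source view
depth = torch.rand(1, 1, 128, 128)     # normalized per-pixel depth map
rgbd = torch.cat([rgb, depth], dim=1)  # stack depth as a fourth channel

features = TinyEncoder(in_channels=4)(rgbd)
print(features.shape)  # torch.Size([1, 64, 128, 128])
```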
*(Figure: Ground Truth Model | RGB Baseline Reconstruction | RGB+D DepthAnything Reconstruction | RGB+D Splatter-Image Reconstruction)*
- 🔍 Monocular 3D object reconstruction using a fast feed-forward neural network.
- 🛠️ Integration of depth channels to improve the quality of reconstructions.
- 🧪 Evaluation of the approach on multiple datasets, including SRN Cars and CO3D Cars.
- 📊 Quantitative improvements measured using PSNR, SSIM, and LPIPS.
The project evaluates the performance of the Splatter Image framework on the following datasets:
- SRN Cars
  - Subsets used: 100%, 50%, 20%
- CO3D Cars with Background
For each dataset, we tested baseline models using only RGB inputs, followed by models that integrate depth information.
We conducted experiments with two depth configurations:
- RGB+D using Splatter Image Depth Output: Depth maps were generated by the Splatter Image model itself.
- RGB+D using Depth Anything Model Output: Depth maps were generated using external depth estimation models, providing more robust depth predictions.
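For context, producing an external depth map with a Depth Anything checkpoint can look like the sketch below, which uses the Hugging Face `transformers` depth-estimation pipeline; the specific checkpoint name is an assumed choice, not necessarily the one used in our experiments.

```python
from PIL import Image
from transformers import pipeline

# Minimal sketch of external depth estimation; the checkpoint name is an
# assumed choice, not necessarily the one used in our experiments.
depth_estimator = pipeline(
    "depth-estimation",
    model="LiheYoung/depth-anything-small-hf",
)

image = Image.open("car.png").convert("RGB")
result = depth_estimator(image)

# The pipeline returns a PIL depth image under "depth" and a raw
# torch tensor under "predicted_depth".
result["depth"].save("car_depth.png")
```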
For each dataset, results were evaluated based on the following metrics:
- PSNR (Peak Signal-to-Noise Ratio)
- SSIM (Structural Similarity Index)
- LPIPS (Learned Perceptual Image Patch Similarity)
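For reference, per-image computation of these metrics typically looks like the following sketch, here using `scikit-image` and the `lpips` package with a VGG backbone; this mirrors common practice rather than the repository's exact evaluation code.

```python
import lpips
import numpy as np
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# The LPIPS network is loaded once and reused across image pairs.
lpips_fn = lpips.LPIPS(net="vgg")

def to_lpips_tensor(img: np.ndarray) -> torch.Tensor:
    """HxWx3 float array in [0, 1] -> 1x3xHxW tensor in [-1, 1]."""
    return torch.from_numpy(img).permute(2, 0, 1)[None].float() * 2.0 - 1.0

def evaluate_pair(gt: np.ndarray, pred: np.ndarray) -> dict:
    """Compute PSNR, SSIM, and LPIPS for one ground-truth/prediction pair."""
    return {
        "psnr": peak_signal_noise_ratio(gt, pred, data_range=1.0),
        "ssim": structural_similarity(gt, pred, channel_axis=2, data_range=1.0),
        "lpips": lpips_fn(to_lpips_tensor(gt), to_lpips_tensor(pred)).item(),
    }
```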
Integrating depth information affected reconstruction quality differently depending on its source: depth maps from the Depth Anything model improved SSIM and LPIPS over the RGB-only baseline on SRN Cars, while using the Splatter Image model's own depth output did not. The results are summarized below:
| Dataset | Configuration | PSNR | SSIM | LPIPS |
|---|---|---|---|---|
| SRN Cars (100%) | Baseline (RGB only) | 19.5569 | 0.8334 | 0.2559 |
| SRN Cars (100%) | RGB+D using Splatter Image Depth Output | 18.9316 | 0.8244 | 0.2639 |
| SRN Cars (100%) | RGB+D using Depth Anything Model Output | 19.4645 | 0.8361 | 0.2530 |
| SRN Cars (50%) | Baseline (RGB only) | 19.5290 | 0.8326 | 0.2539 |
| SRN Cars (50%) | RGB+D using Splatter Image Depth Output | 18.9742 | 0.8225 | 0.2651 |
| SRN Cars (50%) | RGB+D using Depth Anything Model Output | 19.4829 | 0.8374 | 0.2494 |
| SRN Cars (20%) | Baseline (RGB only) | 19.3081 | 0.8298 | 0.2554 |
| SRN Cars (20%) | RGB+D using Splatter Image Depth Output | 18.7255 | 0.8193 | 0.2663 |
| SRN Cars (20%) | RGB+D using Depth Anything Model Output | 19.3170 | 0.8329 | 0.2567 |
| CO3D Cars | Baseline (RGB only) | 14.0015 | 0.3806 | 0.6762 |
| CO3D Cars | RGB+D using CO3D Depth Output | 13.9242 | 0.3730 | 0.6883 |
*(Figure: Ground Truth | RGB Baseline | RGB+D DepthAnything | RGB+D Splatter-Image)*
- Operating System: Windows 11
- Python Version: Python 3.8
- CUDA Toolkit: CUDA 11.7
- Anaconda/Miniconda: For managing the Python environment
- Git: Install Git from the official website.
- Anaconda: Install Anaconda or Miniconda from the official website.
- Visual Studio 2019 Community: Download and install from the official website. During installation, select "Desktop Development with C++".
- CUDA Toolkit v11.7: Download and install from the NVIDIA website.
- COLMAP: Install COLMAP as per the official instructions.
- ImageMagick: Install from the official website.
- FFmpeg: Install from the official website.
Clone the repository:

```bash
git clone https://github.com/dinakog/CV_Project-Splatter_Image
cd CV_Project-Splatter_Image
```

Create and activate the conda environment, then install the dependencies:

```bash
conda create --name splatter-image python=3.8
conda activate splatter-image
conda install --file conda_requirements.txt
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
```
Open the Command Prompt and run:

```bat
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
set DISTUTILS_USE_SDK=1
```
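For context: the `vcvars64.bat` call loads the MSVC x64 compiler environment into the current shell, and `DISTUTILS_USE_SDK=1` tells Python's build tooling to use that environment rather than trying to locate a compiler itself; both are typically needed on Windows when pip compiles the CUDA extensions that Gaussian Splatting relies on.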
Create a file named `cuda_check.py` with the following content:

```python
import torch

# Verify that PyTorch can see the CUDA runtime and the GPU.
print(torch.cuda.is_available())      # True if CUDA is usable
print(torch.cuda.device_count())      # number of visible GPUs
print(torch.cuda.get_device_name(0))  # name of the first GPU
```

Then run it:

```bash
python cuda_check.py
```
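If the environment is set up correctly, the script prints `True` followed by your GPU count and name; if it prints `False`, revisit the CUDA Toolkit and PyTorch installation steps above.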
The training process involves two steps:
Step 1: Initial Training

Run the initial training script:

```bash
python train_network.py +dataset=cars/cars_co3d
```
Step 2: Update Configuration and Continue Training

- Update the configuration file: open `configs/experiment_configs/lpips_100k.yaml` and set the `load_network_path` parameter to the path of the model created in Step 1 (see the excerpt after these steps).
- Continue training:

```bash
python train_network.py +dataset=cars/cars_co3d +experiment=lpips_100k.yaml
```
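For reference, the edit in Step 2 amounts to something like the excerpt below; the `load_network_path` key name comes from the instructions above, while the file's surrounding structure is omitted and the path is a placeholder you must replace with your own Step 1 output.

```yaml
# configs/experiment_configs/lpips_100k.yaml (excerpt; other keys omitted)
load_network_path: <path_to_step1_model>   # replace with your Step 1 output path
```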
To evaluate the trained model, use the following command:

```bash
python eval.py cars/cars_co3d --experiment_path <path_to_experiment>
```

- Replace `<path_to_experiment>` with the actual path to your trained model.

If you wish to generate visualizations of the model's output, add the `--save_vis` flag:

```bash
python eval.py cars/cars_co3d --experiment_path <path_to_experiment> --save_vis <number_of_visualizations>
```

- Replace `<number_of_visualizations>` with the desired number of visualizations to generate.
- Scale the experiments: Due to limited resources, our experiments were constrained. We hypothesize that with more computational power and larger datasets the observed improvement trends would hold; scaling the dataset and training for more iterations is the natural next step.
- Improve the model architecture: Investigate more advanced neural architectures that can dynamically adapt to varying depth inputs.
- Multi-view inputs: Extend the Splatter Image framework to handle multi-view inputs and real-time dynamic object reconstruction.
Szymanowicz, S., Rupprecht, C., & Vedaldi, A. (2024). Splatter Image: Ultra-Fast Single-View 3D Reconstruction. arXiv:2312.13150.
Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Transactions on Graphics, 42(4).
Yang, L., Kang, B., Huang, Z., Xu, X., Feng, J., & Zhao, H. (2024). Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. arXiv:2401.10891.