Changkun Liu, Shuai Chen, Yash Bhalgat, Siyan Hu, Ming Cheng, Zirui Wang, Victor Prisacariu and Tristan Braud
International Conference on Learning Representations (ICLR) 2025
We tested our code with CUDA 12.1, PyTorch 2.5.1, and Python 3.11+.
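To verify that your local setup matches, here is a minimal sanity check:

```python
# quick sanity check for the tested versions (Python 3.11+, PyTorch 2.5.1, CUDA 12.1)
import sys
import torch

print(sys.version.split()[0])     # expect 3.11 or newer
print(torch.__version__)          # expect 2.5.1
print(torch.version.cuda)         # expect 12.1
print(torch.cuda.is_available())  # expect True on a CUDA-capable machine
```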
Clone this repo:
git clone https://github.com/XRIM-Lab/GS-CPR.git
cd GS-CPR
cd ACT_Scaffold_GS
conda create -n scaffold_act python=3.11
conda activate scaffold_act
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
# install tiny-cuda-nn
pip install ninja
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
# install depth rendering for 3DGS
git clone git@github.com:leo-frank/diff-gaussian-rasterization-depth.git
cd diff-gaussian-rasterization-depth
python setup.py install
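As a quick smoke test that the depth rasterizer built correctly, try importing it. The module and class names below assume the fork keeps the upstream diff-gaussian-rasterization naming; adjust them if they differ:

```python
# smoke test: module/class names assumed to match the upstream
# diff-gaussian-rasterization package; adjust if the fork renames them
from diff_gaussian_rasterization import (
    GaussianRasterizationSettings,
    GaussianRasterizer,
)

print("depth-enabled diff-gaussian-rasterization imported successfully")
```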
You can download the pretrained 3DGS models from the provided link and unzip them into the folder GS-CPR/ACT_Scaffold_GS/data/. You can download the pretrained ACT MLP models from the provided link and put them in the folder GS-CPR/ACT_Scaffold_GS/logs/.
ACT_Scaffold_GS
├── data
│   ├── cambridge
│   ├── 7scenes
│   ├── 12scenes
│   └── Cambridge_semantic
└── logs
    └── paper_models
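Before rendering, you can optionally confirm the layout with a small check (a convenience sketch; the paths mirror the tree above):

```python
# optional check that the data/model folders from the tree above are in place;
# run this from the GS-CPR repo root
from pathlib import Path

root = Path("ACT_Scaffold_GS")
expected = [
    root / "data" / "cambridge",
    root / "data" / "7scenes",
    root / "data" / "12scenes",
    root / "data" / "Cambridge_semantic",
    root / "logs" / "paper_models",
]
for p in expected:
    print(f"{p}: {'OK' if p.is_dir() else 'MISSING'}")
```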
Then run the following commands to render the synthetic images based on the coarse poses.
# generate rendered images based on coarse poses for 7Scenes
bash script_render_pred_7s.sh
# generate rendered images based on coarse poses for 12Scenes
bash script_render_pred_12s.sh
For the Cambridge Landmarks dataset, we also need calibrated camera intrinsics files for each image before rendering; these are prepared in the Datasets section.
# generate rendered images based on coarse poses for Cambridge Landmarks
bash script_render_pred_cam.sh
NOTE: For the 7Scenes COLMAP files, we improve the accuracy of the sparse point cloud (courtesy of Torsten Sattler) using dense depth maps rendered for DSAC* (courtesy of Eric Brachmann) in the HLoc toolbox. We then align all poses in sparse/0/images.txt to the SfM poses from the ICCV 2021 paper. For the 12Scenes COLMAP files, we use the SfM models provided by the ICCV 2021 paper. For Cambridge Landmarks, we use the SfM models from the HLoc toolbox, courtesy of Torsten Sattler. All these COLMAP files and pretrained 3DGS models are included in the download link above.
If you want to train new Scaffold-GS models, you need a COLMAP-format sparse/ folder (refer to the examples and data structure in the pretrained models; the standard COLMAP layout is also sketched after the snippet below). Additionally, when training new models, remember to comment out the line if len(cam_infos) <= 1: in GS-CPR/ACT_Scaffold_GS/scene/dataset_readers.py to ensure proper loading of the camera info.
def readColmapCameras(cam_extrinsics, cam_intrinsics, images_folder):
    cam_infos = []
    for idx, key in enumerate(cam_extrinsics):
        # if len(cam_infos) <= 1:  # comment out when training new 3DGS models
        sys.stdout.write('\r')
        # the exact output you're looking for:
        sys.stdout.write("Reading camera {}/{}".format(idx + 1, len(cam_extrinsics)))
        sys.stdout.flush()
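For reference, the standard COLMAP sparse layout referenced above typically looks like this (files may be .bin or .txt depending on how the model was exported):

scene
├── images
└── sparse
    └── 0
        ├── cameras.bin (or cameras.txt)
        ├── images.bin (or images.txt)
        └── points3D.bin (or points3D.txt)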
Run:
# train Scaffold-GS models with ACT modules
bash train_scaffold_act.sh
# train Scaffold-GS models without ACT modules
bash train_scaffold.sh
Create the environment following the MASt3R setup instructions, then activate it:
cd GS-CPR
conda activate mast3r
This paper uses three public datasets: 7Scenes, 12Scenes, and Cambridge Landmarks.
Following ACE, we use the scripts in the datasets folder to automatically download and extract the data in a consistent format.
Important: make sure you have checked the license terms of each dataset before using it.
You can use the datasets/setup_{7,12}scenes.py scripts to download the data.
As mentioned in our paper, we experimented with Pseudo Ground Truth (PGT) camera poses obtained by running SfM on the scenes (see the ICCV 2021 paper and the associated code for details).
To download and prepare the datasets using the PGT poses:
cd datasets
# Downloads the data to datasets/pgt_7scenes_{chess, fire, ...}
./setup_7scenes.py --poses pgt
# Downloads the data to datasets/pgt_12scenes_{apt1_kitchen, ...}
./setup_12scenes.py --poses pgt
You can also follow ACE to download the DSLAM poses and try them.
Simply run:
cd datasets
# Downloads the data to datasets/Cambridge_{GreatCourt, KingsCollege, ...}
./setup_cambridge.py
# For 7Scenes
python gs_cpr_7s.py --pose_estimator ace --scene chess  # for a specific scene
python gs_cpr_7s.py --pose_estimator ace --test_all  # for the whole dataset
# For 12Scenes
python gs_cpr_12s.py --pose_estimator ace --scene apt1_kitchen
python gs_cpr_12s.py --pose_estimator ace --test_all  # for the whole dataset
# For Cambridge Landmarks
python gs_cpr_cam.py --pose_estimator ace --scene ShopFacade
python gs_cpr_cam.py --pose_estimator ace --test_all  # for the whole dataset
# For 7Scenes
python gs_cpr_7s_rel.py --pose_estimator dfnet --scene chess  # for a specific scene
python gs_cpr_7s_rel.py --pose_estimator dfnet --test_all  # for the whole dataset
# For Cambridge Landmarks
python gs_cpr_cam_rel.py --pose_estimator dfnet --scene ShopFacade
python gs_cpr_cam_rel.py --pose_estimator dfnet --test_all  # for the whole dataset
You can check the refined poses for each query in the txt files, and the statistics in the log files, under GS-CPR/outputs.
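If you want to post-process these results, a sketch like the following can load a pose file. The per-line format assumed here (image name, then quaternion and translation) is a guess based on common conventions; verify it against the actual files in GS-CPR/outputs first:

```python
# hypothetical parser for a refined-pose txt file; the
# "name qw qx qy qz tx ty tz" line format is an assumption --
# check the actual files in GS-CPR/outputs before relying on it
from pathlib import Path

def load_poses(txt_path):
    poses = {}
    for line in Path(txt_path).read_text().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip headers or malformed lines
        name = fields[0]
        quat = tuple(map(float, fields[1:5]))   # (qw, qx, qy, qz)
        trans = tuple(map(float, fields[5:8]))  # (tx, ty, tz)
        poses[name] = (quat, trans)
    return poses
```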
If you find our work helpful, please consider citing:
@inproceedings{
liu2025gscpr,
title={{GS}-{CPR}: Efficient Camera Pose Refinement via 3D Gaussian Splatting},
author={Changkun Liu and Shuai Chen and Yash Sanjay Bhalgat and Siyan HU and Ming Cheng and Zirui Wang and Victor Adrian Prisacariu and Tristan Braud},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=mP7uV59iJM}
}
This project builds on several fantastic repos: Scaffold-GS, MASt3R, NeFeS, ACE and Depth for 3DGS. We thank the original authors for their excellent work.