This repository contains the official code and pretrained models for SOFI (multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries).
Camera calibration estimates camera parameters such as the zenith vanishing point and the horizon line. These parameters in turn enable other tasks, including 3D rendering, artificial reality effects, and object insertion in an image. Transformer-based models have produced promising results; however, they lack cross-scale interaction. In this work, we introduce SOFI, a multi-Scale defOrmable transFormer for camera calibratIon with enhanced line queries. SOFI improves the line queries used in CTRL-C and MSCC by using both line content and line geometric features. Moreover, SOFI's line queries allow transformer models to adopt the multi-scale deformable attention mechanism, which promotes cross-scale interaction between the feature maps produced by the backbone. SOFI outperforms existing methods on the Google Street View, Horizon Line in the Wild, and Holicity datasets while keeping a competitive inference speed.
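To make the estimated quantities concrete, the following minimal sketch relates pitch, roll, and vertical field of view to the horizon line and the zenith vanishing point under a pinhole camera model. It is an illustration only, not code from this repository: the function name `horizon_and_zenith`, the axis convention (x right, y down, z forward, origin at the image center), and the sign conventions for pitch and roll are all assumptions.

```python
import math


def horizon_and_zenith(pitch_deg, roll_deg, vfov_deg, image_height):
    """Return (horizon_line, zenith_vp) in pixels, centered on the image.

    horizon_line is (a, b, c) with a*x + b*y + c = 0.
    zenith_vp is (x, y), or None when the camera is level
    (the vanishing point is then at infinity).
    Assumed convention: x right, y down, z forward; positive pitch looks up.
    """
    pitch = math.radians(pitch_deg)
    roll = math.radians(roll_deg)

    # Focal length in pixels, derived from the vertical field of view.
    focal = (image_height / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)

    # World "up" direction expressed in camera coordinates.
    up = (
        math.sin(roll) * math.cos(pitch),
        -math.cos(roll) * math.cos(pitch),
        math.sin(pitch),
    )

    # Horizon: image rays (x, y, focal) that are perpendicular to "up".
    horizon = (up[0], up[1], up[2] * focal)

    # Zenith vanishing point: perspective projection of the "up" direction.
    if abs(up[2]) < 1e-12:
        zenith = None
    else:
        zenith = (focal * up[0] / up[2], focal * up[1] / up[2])
    return horizon, zenith
```

Under these assumed conventions, a 480-pixel-tall image with a 90° vertical field of view and a camera pitched up by 45° places the horizon 240 pixels below the image center and the zenith vanishing point 240 pixels above it.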
Model | Up Dir (°) | Pitch (°) | Roll (°) | FoV (°) | AUC (%) | URL |
---|---|---|---|---|---|---|
Official Implementation | | | | | | |
CTRL-C | 1.80 | 1.58 | 0.66 | 3.59 | 87.29 | |
MSCC | 1.72 | 1.50 | 0.62 | 3.21 | | |
Ours | | | | | | |
CTRL-C | 1.71 | 1.52 | 0.57 | 3.38 | 87.16 | |
MSCC | 1.75 | 1.56 | 0.58 | 3.04 | 87.63 | |
SOFI | 1.64 | 1.51 | 0.54 | 3.09 | 87.87 | checkpoint |
Model | Up Dir (°) | Pitch (°) | Roll (°) | FoV (°) | AUC (%) |
---|---|---|---|---|---|
Ours | | | | | |
CTRL-C | 2.66 | 2.26 | 1.09 | 3.38 | 72.31 |
MSCC | 2.28 | 1.87 | 1.08 | 1.08 | 77.43 |
SOFI | 2.23 | 1.75 | 1.16 | 1.16 | 82.96 |
Model | AUC (%) |
---|---|
Ours | |
CTRL-C | 46.37 |
MSCC | 47.28 |
SOFI | 49.69 |
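The angular columns in the tables above are errors in degrees. As a rough illustration of how an up-direction error of this kind can be measured, the sketch below computes the angle between a predicted and a ground-truth up vector. The function name `angular_error_deg` and the choice of 3D vectors as inputs are assumptions; the repository's own evaluation code may define the metric differently.

```python
import math


def angular_error_deg(pred, target):
    """Angle in degrees between two non-zero vectors of equal length."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    if norm == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

Averaging this value over a test set would give a single number comparable in kind to the "Up Dir" column, assuming the same vector convention is used for predictions and ground truth.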
- Clone this repository.

      git clone https://github.com/SebastianJanampa/SOFI.git
      cd SOFI

- Install PyTorch and torchvision. Follow the instructions at https://pytorch.org/get-started/locally/.

      # an example:
      conda install -c pytorch pytorch torchvision

- Install the other required packages.

      pip install -r requirements.txt

- Compile the CUDA operators.

      cd models/sofi/ops
      python setup.py build install
      # unit test (every check should report True)
      python test.py
      cd ../../..

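After installation, a quick sanity check can confirm that the core packages are visible to the current Python environment. The sketch below only looks up `torch` and `torchvision` (the two packages named in the steps above) without importing them; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util


def missing_packages(names=("torch", "torchvision")):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages found.")
```

This does not verify the compiled CUDA operators; the `python test.py` unit test in the step above remains the check for those.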
- Create the dataset directories.

      mkdir data data_csv

- Download the Google Street View, Horizon Line in the Wild (HLW), and Holicity datasets, and organize them as follows:

      SOFI/
      ├── data/
      │   ├── google_street_view_191210
      │   ├── hlw
      │   └── holicity
      │
      └── data_csv/
          ├── gsv_train_20210313.csv
          ├── gsv_val_20210313.csv
          ├── gsv_test_20210313.csv
          ├── hlw_test.csv
          ├── holicity-test-split.csv
          └── gsv_train_20210313

  For Holicity, download the image, camera, and vanishing point files.
- Then, run the following command:

      python scripts/dataset/extract_segments.py

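Before running the extraction script, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch, assuming it is run from the repository root; the helper name `missing_paths` is hypothetical, and the paths are copied from the directory listing above.

```python
from pathlib import Path

# Paths taken from the directory listing in this README.
# The final listing entry, which has no file extension, is left out here.
EXPECTED = [
    "data/google_street_view_191210",
    "data/hlw",
    "data/holicity",
    "data_csv/gsv_train_20210313.csv",
    "data_csv/gsv_val_20210313.csv",
    "data_csv/gsv_test_20210313.csv",
    "data_csv/hlw_test.csv",
    "data_csv/holicity-test-split.csv",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths():
        print("missing:", rel)
```

An empty result means every listed directory and CSV file was found; it says nothing about whether their contents are complete.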
- Training

      bash scripts/train/sofi.sh

- Testing

      bash scripts/train/sofi.sh dataset

  Here, dataset is one of the supported datasets: gsv, hlw, or holicity.

- Compute metrics

      python results.py --dataset dataset

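For reference, AUC values in horizon-line benchmarks are commonly the area under the cumulative error curve up to a fixed error threshold, normalized so that a perfect detector scores 100%. The sketch below shows one way to compute such a value from a list of per-image errors. It is not taken from `results.py`: the function name `horizon_auc`, the default threshold of 0.25, and the use of the empirical step-shaped cumulative curve (rather than an interpolated one) are assumptions.

```python
def horizon_auc(errors, threshold=0.25):
    """Area under the empirical cumulative error curve on [0, threshold], in percent.

    errors: non-negative per-image horizon errors.
    Each error e <= threshold contributes (threshold - e) to the integral of
    the step-shaped cumulative distribution; larger errors contribute nothing.
    """
    errors = list(errors)
    if not errors:
        raise ValueError("errors must not be empty")
    area = sum(max(0.0, threshold - e) for e in errors) / len(errors)
    return 100.0 * area / threshold
```

With this definition, a set of errors that are all zero scores 100, a set whose errors all exceed the threshold scores 0, and implementations that interpolate the curve may report slightly different numbers.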
If you use this code for your research, please cite our paper:
@InProceedings{Janampa_BMVC2024,
Title = {{SOFI: Multi-Scale Deformable Transformer for Camera Calibration with Enhanced Line Queries}},
Author = {Sebastian Janampa and Marios Pattichis},
Booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
Year = {2024},
}