A feature renderer for robust 3D feature point representation in camera relocalization.
    
Webpage · Report Bug · Request Feature
This is the codebase accompanying the paper FaVoR: Features via Voxel Rendering for Camera Relocalization by Vincenzo Polizzi, Marco Cannici, Davide Scaramuzza, and Jonathan Kelly. Visit the project webpage for an overview.
If you use this code, please cite the following publication:
@InProceedings{Polizzi_2025_WACV,
    author    = {Polizzi, Vincenzo and Cannici, Marco and Scaramuzza, Davide and Kelly, Jonathan},
    title     = {FaVoR: Features via Voxel Rendering for Camera Relocalization},
    booktitle = {2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {44-53},
    doi       = {10.1109/WACV61041.2025.00015}
}

Camera relocalization methods range from dense image alignment to direct camera pose regression from a query image. Among these, sparse feature matching stands out as an efficient, versatile, and generally lightweight approach with numerous applications. However, feature-based methods often struggle with significant viewpoint and appearance changes, leading to matching failures and inaccurate pose estimates. To overcome this limitation, we propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features. By tracking and triangulating landmarks over a sequence of frames, we construct a sparse voxel map optimized to render image patch descriptors observed during tracking. Given an initial pose estimate, we first synthesize descriptors from the voxels using volumetric rendering and then perform feature matching to estimate the camera pose. This methodology enables the generation of descriptors for unseen views, enhancing robustness to view changes. We extensively evaluate our method on the 7-Scenes and Cambridge Landmarks datasets. Our results show that our method significantly outperforms existing state-of-the-art feature representation techniques in indoor environments, achieving up to a 39% improvement in median translation error. Additionally, our approach yields comparable results to other methods for outdoor scenarios while maintaining lower memory and computational costs.
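A common way to implement the final matching-then-pose step described above is nearest-neighbour descriptor matching followed by PnP with RANSAC. The sketch below illustrates only that step on synthetic data; it is not FaVoR's renderer or the actual codebase, and all names, shapes, and values are purely illustrative.

```python
# Illustrative sketch of the matching + pose-estimation step on synthetic data.
# Requires numpy and opencv-python; nothing here is taken from the FaVoR code.
import numpy as np
import cv2

rng = np.random.default_rng(0)

# Hypothetical outputs of a voxel renderer: one 3D landmark and one descriptor
# per tracked feature (descriptor size is illustrative).
num_landmarks, dim = 200, 128
landmarks_3d = rng.uniform(-1.0, 1.0, size=(num_landmarks, 3)).astype(np.float64)
rendered_desc = rng.normal(size=(num_landmarks, dim)).astype(np.float32)
rendered_desc /= np.linalg.norm(rendered_desc, axis=1, keepdims=True)

# Hypothetical query-image detections: reuse the rendered descriptors with
# noise so matching succeeds, and fabricate keypoints by projecting the
# landmarks with a known ground-truth pose.
K = np.array([[525.0, 0.0, 320.0], [0.0, 525.0, 240.0], [0.0, 0.0, 1.0]])
query_desc = rendered_desc + 0.05 * rng.normal(size=rendered_desc.shape).astype(np.float32)
query_desc /= np.linalg.norm(query_desc, axis=1, keepdims=True)
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.3, -0.1, 4.0])
query_kpts, _ = cv2.projectPoints(landmarks_3d, rvec_gt, tvec_gt, K, None)
query_kpts = query_kpts.reshape(-1, 2)

# Descriptor matching by cosine similarity (mutual-nearest-neighbour check omitted).
sim = rendered_desc @ query_desc.T
matches = sim.argmax(axis=1)

# Camera pose from the resulting 3D-2D correspondences via PnP + RANSAC.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    landmarks_3d, query_kpts[matches], K, None, reprojectionError=3.0)
print("pose found:", ok, "inliers:", 0 if inliers is None else len(inliers))
print("translation error:", np.linalg.norm(tvec.ravel() - tvec_gt))
```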
- OS: Ubuntu 22.04
- GPU: RTX 4060 or higher
- Docker (Optional): For containerized environments
- NVIDIA Container Toolkit (if using Docker)
 
If you choose to run our code with Docker, make sure Docker and the NVIDIA Container Toolkit are installed; you can then skip the Requirements and Setup section and go directly to the Dataset Download section.
git clone --recursive https://github.com/utiasSTARS/FaVoR.git
cd FaVoR

Note: If you forget --recursive, initialize the submodules manually:

git submodule update --init --recursive

Note: if you want to use Docker, you can skip these steps and go directly to the Dataset Download section.

Create a Conda environment and install the dependencies:
conda create -n favor python=3.10
conda activate favor
conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit

Install PyTorch and its dependencies:
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install torch-scatter==2.1.1 -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
pip install -r requirements.txt

Compile the CUDA extensions:

cd lib/cuda
./build.sh
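As an optional sanity check (not part of the original instructions), you can confirm from Python that the CUDA-enabled builds are active before proceeding:

```python
# Optional sanity check: confirm the CUDA-enabled PyTorch build is active.
import torch
import torch_scatter  # noqa: F401  # fails here if the wheel does not match the torch build

print(torch.__version__)          # expected: 1.13.1+cu117
print(torch.version.cuda)         # expected: 11.7
print(torch.cuda.is_available())  # expected: True on a machine with an NVIDIA GPU
```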
We used the 7-Scenes and Cambridge Landmarks datasets for our experiments. You need to download the datasets to run the code. Run the script to download them:

bash scripts/download_datasets.sh

To download a specific scene:

bash scripts/download_datasets.sh SCENE_NAME

Note: the script will create the datasets folder and download the datasets, the NetVLAD matches, and the COLMAP ground truth for the 7-Scenes dataset.
Scenes Available:
- 7-Scenes: chess, fire, heads, office, pumpkin, redkitchen, stairs
- Cambridge Landmarks: college, court, hospital, shop, church
 
Ensure Docker and NVIDIA Container Toolkit are installed. Run the visualizer:
xhost +local:docker
docker run --net=host --rm -v ./logs/:/favor/logs -v ./datasets/:/favor/datasets --privileged --gpus all -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -it viciopoli/favor:latest bash /favor/scripts/visualizer.sh SCENE_NAME

Replace SCENE_NAME with one from the dataset list above.
To run the visualizer without Docker, use the Conda environment instead:

conda activate favor
bash scripts/visualize.sh SCENE_NAME

Replace SCENE_NAME with one from the dataset list above.
Run test scripts to reproduce results:
conda activate favor
bash scripts/test_7scenes.sh NETWORK_NAME
bash scripts/test_cambridge.sh NETWORK_NAME

Replace NETWORK_NAME with one of: alike-l, alike-n, alike-s, alike-t, superpoint.
To print the results:
python results.py --logs_dir ./logs/7Scenes --dataset 7scenes --net_model alike-l

Modify the --logs_dir, --dataset, and --net_model arguments as needed.
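For context, the paper reports median translation error per scene. The sketch below shows one standard way such per-scene medians can be computed from estimated and ground-truth poses; it is illustrative only and is not how results.py is implemented.

```python
# Illustrative only: median pose errors from estimated vs. ground-truth poses.
# Function names and inputs are placeholders, not the results.py API.
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Translation error (same unit as t) and rotation error (degrees)."""
    t_err = np.linalg.norm(t_est - t_gt)
    # Angle of the relative rotation R_est^T @ R_gt.
    cos_angle = np.clip((np.trace(R_est.T @ R_gt) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.degrees(np.arccos(cos_angle))
    return t_err, r_err

# Toy example with identical poses: both medians are zero.
R, t = np.eye(3), np.zeros(3)
errs = [pose_errors(R, t, R, t) for _ in range(10)]
t_errs, r_errs = zip(*errs)
print("median translation error:", np.median(t_errs))
print("median rotation error (deg):", np.median(r_errs))
```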
Pretrained models are available on the Hugging Face model hub.
Note: the test scripts will automatically download the models if needed.
Single models can be downloaded using the Hugging Face CLI:
DATASET=7Scenes # or Cambridge
SCENE=chess # or ShopFacade etc.
NETWORK=alike-l # or alike-s, alike-n, alike-t, superpoint
huggingface-cli download viciopoli/FaVoR $DATASET/$SCENE/$NETWORK/model_ckpts/model_last.tar --local-dir-use-symlinks False --local-dir /path/to/your/directory
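If you prefer Python over the CLI, the same checkpoint can be fetched with the huggingface_hub library. The sketch below hardcodes one dataset/scene/network combination, and the local_dir value is only an example.

```python
# Minimal sketch: download a single FaVoR checkpoint via the huggingface_hub API.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="viciopoli/FaVoR",
    filename="7Scenes/chess/alike-l/model_ckpts/model_last.tar",
    local_dir="./logs",  # hypothetical destination; adjust to your setup
)
print("checkpoint downloaded to:", ckpt_path)
```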
The downloaded files are organized as follows:

DATASET_NAME
    ├── SCENE_NAME
    │   ├── NETWORK_NAME
    │   │   ├── model_ckpts
    │   │   └── results
    ...

Example (7-Scenes):
7scenes
    ├── chess_7scenes
    │   ├── alike-n
    │   │   ├── models
    │   │   └── tracks.pkl

Distributed under the Apache 2.0 License. See LICENSE for more information.
Built on DVGO.
Template by Best-README-Template.
