Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation [PDF]
In this work, we establish a task called Modality-Incomplete Scene Segmentation (MISS), which encompasses both system-level modality absence and sensor-level modality errors.
We introduce a Missing-aware Modal Switch (MMS) strategy to proactively manage missing modalities during training, utilizing bit-level batch-wise sampling to enhance the model's performance in both complete and incomplete testing scenarios. Furthermore, we introduce the Fourier Prompt Tuning (FPT) method to incorporate representative spectral information into a limited number of learnable prompts that maintain robustness against all MISS scenarios. To harness the benefits of adapters and minimize the number of parameters, we integrate our Fourier Prompt module into the AdaptFormer framework. FPT outperforms AdaptFormer for any number of bottleneck channels. Additionally, our MMS maintains the performance of both methods when all modalities are present and significantly enhances performance under conditions with missing modalities.
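The bit-level batch-wise sampling mentioned above can be illustrated with a small sketch. This is only one plausible reading of the description, not the repository's implementation: it assumes one bit per modality is drawn once per batch, that at least one modality is always kept, and that a dropped modality is replaced by zeros (analogous to the black images used for missing inputs). The function names `sample_modality_bits` and `switch_modalities` are hypothetical, and plain Python lists stand in for tensors.

```python
import random
from typing import Dict, List, Sequence, Tuple


def sample_modality_bits(num_modalities: int, rng: random.Random) -> Tuple[int, ...]:
    """Draw one bit per modality (1 = keep, 0 = drop) for a whole batch.

    Assumption: the all-zero code is excluded so at least one modality remains.
    """
    code = rng.randint(1, 2 ** num_modalities - 1)  # 0 is skipped on purpose
    return tuple((code >> i) & 1 for i in range(num_modalities))


def switch_modalities(
    batch: Dict[str, List[List[float]]], bits: Sequence[int]
) -> Dict[str, List[List[float]]]:
    """Replace every sample of a dropped modality with zeros.

    `batch` maps a modality name to a list of flattened samples; the order of
    the dict keys is matched against the order of `bits`.
    """
    switched = {}
    for bit, (name, samples) in zip(bits, batch.items()):
        if bit:
            switched[name] = samples
        else:
            switched[name] = [[0.0] * len(sample) for sample in samples]
    return switched
```

In a training loop, one would call `sample_modality_bits` once per batch and pass the result to `switch_modalities` before the forward pass, so that all samples in the batch share the same missing-modality pattern.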
Refer to DeLiVER for environment installation and for downloading the DeLiVER dataset.
Calculate the depth maps for the Cityscapes dataset according to the instructions in RGBD_Semantic_Segmentation_PyTorch and store them as `.npy` files.
The `dataset` folder should be structured as:
DELIVER
├── img
├── depth
├── missing
└── semantic
cityscapes
├── leftImg8bit
├── depth
├── missing
└── gtFine
The `missing` folders contain black images with the same resolution and filenames as those in the other modalities.
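If the `missing` folders need to be created locally, a sketch along the following lines could generate them. It is not part of the repository: it assumes the source modality is stored as PNG files, that the black placeholders should be 8-bit RGB, and that filenames are mirrored unchanged. The helpers `black_png`, `png_size`, and `mirror_as_black` are hypothetical names, and only the Python standard library is used.

```python
import struct
import zlib
from pathlib import Path
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def black_png(width: int, height: int) -> bytes:
    """Encode an all-black 8-bit RGB PNG of the given size."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    row = b"\x00" + b"\x00" * (width * 3)  # filter byte + black RGB pixels
    idat = zlib.compress(row * height)
    return PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"IDAT", idat) + _chunk(b"IEND", b"")


def png_size(path: Path) -> Tuple[int, int]:
    """Read (width, height) from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"not a PNG file: {path}")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def mirror_as_black(src_dir, dst_dir) -> int:
    """Write a black PNG to dst_dir for every PNG under src_dir.

    Relative paths and filenames are preserved; returns the number of files written.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    count = 0
    for src in sorted(src_dir.rglob("*.png")):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_bytes(black_png(*png_size(src)))
        count += 1
    return count
```

For example, `mirror_as_black("dataset/DELIVER/img", "dataset/DELIVER/missing")` would populate the DeLiVER `missing` folder from the RGB images, assuming the dataset lives under `dataset/` relative to the working directory.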
Please download the MultiMAE pretrained weights to the folder `checkpoints/pretrained/`.
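Before launching training, it may help to confirm that the folders described above are in place. The following is a minimal sketch rather than a script shipped with the repository; `find_missing_dirs` and the two constants are hypothetical names, and the sub-folder lists are copied from the layout shown earlier.

```python
from pathlib import Path
from typing import Iterable, List

# Sub-folders expected under the dataset root, taken from the layout above.
DATASET_DIRS = [
    "DELIVER/img", "DELIVER/depth", "DELIVER/missing", "DELIVER/semantic",
    "cityscapes/leftImg8bit", "cityscapes/depth", "cityscapes/missing", "cityscapes/gtFine",
]

# Folder expected under the repository root for the MultiMAE weights.
CHECKPOINT_DIRS = ["checkpoints/pretrained"]


def find_missing_dirs(root, relative_dirs: Iterable[str]) -> List[str]:
    """Return the entries of relative_dirs that are not directories under root."""
    root = Path(root)
    return [rel for rel in relative_dirs if not (root / rel).is_dir()]
```

Calling `find_missing_dirs("dataset", DATASET_DIRS)` and `find_missing_dirs(".", CHECKPOINT_DIRS)` from the repository root returns the folders that still need to be created; an empty list means the layout matches. The location of the dataset root relative to the repository is an assumption here and should be adjusted to the paths used in the configuration files.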
When training with MMS, change `MISS` in the configuration files from False to True.
cd path/to/MISS
conda activate cmnext
export PYTHONPATH="path/to/MISS"
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_prompt.py --cfg configs/config_fpt_deliver.yaml
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_prompt.py --cfg configs/config_fpt_cityscapes.yaml
Please download the following weights to the folders `checkpoints/fpt/` and `checkpoints/fpt_mms/`.
DeLiVER:
Model | mIoU(%) | weight |
---|---|---|
FPT | 57.81 | Google Drive |
FPT(MMS) | 57.38 | Google Drive |
Cityscapes:
Model | mIoU(%) | weight |
---|---|---|
FPT | 75.16 | Google Drive |
FPT(MMS) | 75.47 | Google Drive |
cd path/to/MISS
conda activate cmnext
export PYTHONPATH="path/to/MISS"
CUDA_VISIBLE_DEVICES=0 python tools/val_mm.py --cfg configs/config_fpt_deliver.yaml
CUDA_VISIBLE_DEVICES=0 python tools/val_mm.py --cfg configs/config_fpt_cityscapes.yaml
If you use our method in your project, please consider citing:
@INPROCEEDINGS{10588722,
author={Liu, Ruiping and Zhang, Jiaming and Peng, Kunyu and Chen, Yufan and Cao, Ke and Zheng, Junwei and Sarfraz, M. Saquib and Yang, Kailun and Stiefelhagen, Rainer},
booktitle={2024 IEEE Intelligent Vehicles Symposium (IV)},
title={Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation},
year={2024},
volume={},
number={},
pages={961-968},
keywords={Training;Rain;Source coding;Semantic segmentation;Semantics;Switches;Benchmark testing},
doi={10.1109/IV55156.2024.10588722}}