Jinyuan Liu, Xin Fan*, Zhanbo Huang, Guanyao Wu, Risheng Liu, Wei Zhong, Zhongxuan Luo, “Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection”, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral)
A preview of our dataset is as follows.
- Sensor: A synchronized system containing one binocular optical camera and one binocular infrared sensor. More details are available in the paper.
- Main scenes:
  - Campus of Dalian University of Technology.
  - State Tourism Holiday Resort at the Golden Stone Beach in Dalian, China.
  - Main roads in Jinzhou District, Dalian, China.
- Total number of images:
  - 8400 (for fusion, detection, and fusion-based detection)
  - 600 (independent scenes for fusion)
- Total number of image pairs:
  - 4200 (for fusion, detection, and fusion-based detection)
  - 300 (independent scenes for fusion)
- Format of images:
  - [Infrared] 24-bit grayscale bitmap
  - [Visible] 24-bit color bitmap
- Image size: 1024 x 768 pixels (mostly)
- Registration: All image pairs are registered. The visible images are calibrated using the internal parameters of our synchronized system, and the infrared images are warped onto the visible view with a homography matrix (see the sketch after this list).
- Labeling: 34,407 labels have been manually annotated, covering 6 kinds of targets: {People, Car, Bus, Motorcycle, Lamp, Truck}. (Due to limited manpower, some targets may be mislabeled or missed. We would appreciate it if you pointed out wrong or missing labels to help us improve the dataset.)
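To make the registration step concrete, below is a minimal, illustrative sketch of warping an infrared frame onto the visible image plane with a precomputed homography. The file names and the 3x3 matrix are placeholders, not values shipped with M3FD.

```python
import cv2
import numpy as np

# Hypothetical example: align one infrared frame to its visible counterpart.
# H is a placeholder 3x3 homography; the real matrices come from our
# calibration pipeline and are not distributed with the dataset.
ir = cv2.imread('ir_raw.png', cv2.IMREAD_GRAYSCALE)
vis = cv2.imread('vis.png', cv2.IMREAD_COLOR)

H = np.array([[1.0, 0.0, 12.0],
              [0.0, 1.0, -8.0],
              [0.0, 0.0, 1.0]], dtype=np.float64)

# Warp the infrared image into the visible image plane so that the pair
# shares the same pixel grid (1024 x 768 for most scenes).
ir_registered = cv2.warpPerspective(ir, H, (vis.shape[1], vis.shape[0]))
cv2.imwrite('ir_registered.png', ir_registered)
```

The dataset is organized as follows: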
```
M3FD
├── Challenge
│   ├── Beach
│   │   ├── Annotation
│   │   │   ├── 01863.xml
│   │   │   └── ...
│   │   ├── Ir
│   │   │   ├── 01863.png
│   │   │   └── ...
│   │   └── Vis
│   │       ├── 01863.png
│   │       └── ...
│   ├── Crossroads
│   └── ...
├── Daytime
│   ├── Alley
│   └── ...
├── Night
│   ├── Basement
│   └── ...
└── Overcast
    ├── Atrium
    └── ...
```
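For reference, here is a small sketch of how an image pair and its annotation might be loaded. It assumes the annotations follow a Pascal VOC-style XML layout (object/name/bndbox); the scene folder and file name below are examples taken from the tree above.

```python
import xml.etree.ElementTree as ET
import cv2

# Hypothetical paths based on the directory layout shown above.
scene = 'M3FD/Challenge/Beach'
stem = '01863'

ir = cv2.imread(f'{scene}/Ir/{stem}.png', cv2.IMREAD_GRAYSCALE)
vis = cv2.imread(f'{scene}/Vis/{stem}.png', cv2.IMREAD_COLOR)

# Parse the annotation, assuming a Pascal VOC-style XML structure.
root = ET.parse(f'{scene}/Annotation/{stem}.xml').getroot()
boxes = []
for obj in root.iter('object'):
    name = obj.find('name').text  # one of People, Car, Bus, Motorcycle, Lamp, Truck
    bbox = obj.find('bndbox')
    xmin, ymin, xmax, ymax = (int(float(bbox.find(tag).text))
                              for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
    boxes.append((name, xmin, ymin, xmax, ymax))

print(f'{stem}: {len(boxes)} labeled targets')
```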
If you have any questions or suggestions about the dataset, please email Guanyao Wu or Jinyuan Liu.
- AUIF (IEEE TCSVT 2021)
- DDcGAN (IJCAI 2019)
- DenseFuse (IEEE TIP 2019)
- DIDFuse (IJCAI 2020)
- FusionGAN (Information Fusion 2019)
- GANMcC (IEEE TIM 2021)
- MFEIF (IEEE TCSVT 2021)
- RFN-Nest (Information Fusion 2021)
- SDNet (IJCV 2021)
- U2Fusion (IEEE TPAMI 2020)
You can try our method online (for free) in Colab.
We recommend using conda to manage the environment.
```shell
conda create -n tardal python=3.8
conda activate tardal
pip install -r requirements.txt
```
We offer three pre-trained models.
| Name     | Description                                                      |
| -------- | ---------------------------------------------------------------- |
| TarDAL   | Optimized for human vision. (Default)                             |
| TarDAL+  | Optimized for object detection.                                   |
| TarDAL++ | Optimal solution for joint human vision and detection accuracy.   |
```shell
python fuse.py --src data/sample/s1 --dst runs/sample/tardal --weights weights/tardal.pt --color
python fuse.py --src data/sample/s1 --dst runs/sample/tardal+ --weights weights/tardal+.pt --color --eval
python fuse.py --src data/sample/s1 --dst runs/sample/tardal++ --weights weights/tardal++.pt --color --eval
```
The `--color` flag colorizes the fused images with the corresponding visible color space.
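For intuition, the snippet below sketches one common way such colorization is done in infrared-visible fusion: the fused grayscale result replaces the luminance channel of the visible image while the chrominance channels are kept. This only illustrates the idea; the actual implementation inside fuse.py may differ, and the paths are placeholders.

```python
import cv2

# Illustrative colorization: inject the fused grayscale image as the luma (Y)
# channel of the visible image's YCrCb representation.
fused_gray = cv2.imread('runs/sample/tardal/fused.png', cv2.IMREAD_GRAYSCALE)
vis_bgr = cv2.imread('data/sample/s1/vis.png', cv2.IMREAD_COLOR)

ycrcb = cv2.cvtColor(vis_bgr, cv2.COLOR_BGR2YCrCb)
ycrcb[:, :, 0] = fused_gray  # keep Cr/Cb from the visible image
fused_color = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
cv2.imwrite('fused_color.png', fused_color)
```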
If you have any questions about the code, please email Zhanbo Huang.
```bibtex
@inproceedings{liu2022target,
  title={Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection},
  author={Liu, Jinyuan and Fan, Xin and Huang, Zhanbo and Wu, Guanyao and Liu, Risheng and Zhong, Wei and Luo, Zhongxuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5802--5811},
  year={2022}
}
```