# [Novel Scenes & Classes: Towards Adaptive Open-set Object Detection (ICCV-23 ORAL)](assets/paper.pdf)

By [Wuyang Li](https://wymancv.github.io/wuyang.github.io/)

The paper link will be updated once the CVF open-access version is available.

<div align=center>
<img src="./assets/mot.png" width="400">
</div>

Domain Adaptive Object Detection (DAOD) makes the strong assumption that the source and target domains share the same class space.

This work breaks that assumption and formulates Adaptive Open-set Object Detection (AOOD), which allows the target domain to contain novel-class objects.

The object detector is trained with base-class labels in the source domain, and aims to detect base-class objects and identify novel-class objects as unknown in the target domain.

If you have any ideas or problems you would like to discuss, feel free to reach me via [e-mail](mailto:wuyangli2-c@my.cityu.edu.hk).

# 💡 Preparation

## Step 1: Clone and Install the Project

### Clone the repository

```bash
git clone https://github.com/CityU-AIM-Group/SOMA.git
```

### Install the project following [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR)

Note that the following is in line with our experimental environment, which is slightly different from the official one.

```bash
# Linux, CUDA>=9.2, GCC>=5.4
# (ours) CUDA=10.2, GCC=8.4, NVIDIA V100
# Establish the conda environment

conda create -n aood python=3.7 pip
conda activate aood
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

# Compile the project
cd ./models/ops
sh ./make.sh

# Unit test (all checks should return True)
python test.py

# NOTE: if you meet a permission-denied issue when starting the training
cd ../../
chmod -R 777 ./
```
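
As an optional sanity check (our suggestion, not part of the official setup), you can confirm that the installed PyTorch build can see the GPU:

```bash
# Optional: print the PyTorch version and whether CUDA is available
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```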

## Step 2: Download Necessary Resources

### Download pre-processed datasets (VOC format) from the following links

|                | (Foggy) Cityscapes | Pascal VOC | Clipart | BDD100K |
| :------------: | :----------------: | :--------: | :-----: | :-----: |
| Official Links | [Imgs](https://www.cityscapes-dataset.com/login/) | [Imgs+Labels](https://pjreddie.com/projects/pascal-voc-dataset-mirror/) | - | - |
| Our Links | [Labels](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/EVNAjK2JkG9ChREzzqdqJkYBLoZ_VOqkMdhWasN_BETGWw?e=fP9Ae4) | - | [Imgs+Labels](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/Edz2YcXHuStIqwM_NA7k8FMBGLeyAGQcSjdSR-vYaVx_vw?e=es6KDW) | [Imgs+Labels](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/EeiO6O36QgZKnTcUZMInACIB0dfWEg4OFyoEZnZCkibKHA?e=6byqBX) |

### Download the DINO-pretrained ResNet-50 from this [link](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/EVnK9IPi91ZPuNmwpeSWGHABqhSFQK52I7xGzroXKeuyzA?e=EnlwgO)

## Step 3: Change the Paths

### Organize the datasets as follows

```
[DATASET_PATH]
└─ Cityscapes
   └─ AOOD_Annotations
   └─ AOOD_Main
      └─ train_source.txt
      └─ train_target.txt
      └─ val_source.txt
      └─ val_target.txt
   └─ leftImg8bit
      └─ train
      └─ val
   └─ leftImg8bit_foggy
      └─ train
      └─ val
└─ bdd_daytime
   └─ Annotations
   └─ ImageSets
   └─ JPEGImages
└─ clipart
   └─ Annotations
   └─ ImageSets
   └─ JPEGImages
└─ VOCdevkit
   └─ VOC2007
   └─ VOC2012
```
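
As a quick sanity check (a hypothetical snippet; the path below is a placeholder for your actual root), you can list a few of the expected entries:

```bash
# Hypothetical check that the layout above is in place
DATASET_PATH=/path/to/your/data   # placeholder: set to your data root
ls $DATASET_PATH/Cityscapes/AOOD_Annotations/AOOD_Main/
ls $DATASET_PATH/VOCdevkit/
```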

### Change the data root folder in the config files

Replace `DATASET.COCO_PATH` in all yaml files under [configs](configs) with your data root `$DATASET_PATH`, e.g., Line 22 of [soma_aood_city_to_foggy_r50.yaml](configs/soma_aood_city_to_foggy_r50.yaml).
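
If you prefer to patch every config at once, a one-liner such as the following should work (a sketch; `/data/aood` is a placeholder, and it assumes the key appears as `COCO_PATH:` in the yaml files):

```bash
# Sketch: point COCO_PATH in every config at your data root (placeholder path)
sed -i 's#\(COCO_PATH:\).*#\1 /data/aood#' configs/*.yaml
```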

### Change the path of the DINO-pretrained backbone

Replace the backbone loading path at Line 107 of [backbone.py](models/backbone.py).
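
To locate the exact line to edit, printing the surrounding region may help:

```bash
# Show Lines 100-115 of models/backbone.py, which include the loading path
sed -n '100,115p' models/backbone.py
```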

# 🔥 Start Training

We use two GPUs for training, with 2 source images and 2 target images per batch.

```bash
GPUS_PER_NODE=2
./tools/run_dist_launch.sh 2 python main.py --config_file {CONFIG_FILE} --opts DATASET.AOOD_SETTING 1
```

We provide the scripts used in our experiments in [run.sh](./run.sh). The settings after `--opts` override the default config file, following the maskrcnn-benchmark convention.
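
For instance, a concrete City-to-Foggy run following the template above might look like this (a sketch using the config file mentioned in Step 3):

```bash
# Sketch: adaptive open-set training on Cityscapes -> Foggy Cityscapes
GPUS_PER_NODE=2
./tools/run_dist_launch.sh 2 python main.py \
    --config_file configs/soma_aood_city_to_foggy_r50.yaml \
    --opts DATASET.AOOD_SETTING 1
```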

# 📦 Well-trained models

Will be provided later.

<!-- | Source| Target| Task | mAP $_b$ | AR $_n$ | WI | AOSE | AP@75 | checkpoint |
| :-----:| :-----:| :-----:| :-----:| :-----:| :-----:| :-----:| :-----:| :-----: |
| City |Foggy | het-sem |
| City |Foggy | het-sem |
| City |Foggy | het-sem |
| City |Foggy | het-sem | -->

# 💬 Notification

- The core idea is to select informative motifs (which can be treated as a mix-up of object queries) for self-training.
- You can try the DA version of [OW-DETR](https://github.com/akshitac8/OW-DETR) in this repository by setting (see the full command sketch after this list):
```
--opts AOOD.OW_DETR_ON True
```
- Adopting SAM to address AOOD may be a good direction.
- To visualize unknown boxes, post-processing is needed at Line 736 of [PostProcess](models/motif_detr.py).
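
Putting the OW-DETR switch together with the training template, a full command might look like this (a sketch; the extra key is simply appended after `--opts`):

```bash
# Sketch: train the DA version of OW-DETR by toggling the config switch
GPUS_PER_NODE=2
./tools/run_dist_launch.sh 2 python main.py \
    --config_file configs/soma_aood_city_to_foggy_r50.yaml \
    --opts DATASET.AOOD_SETTING 1 AOOD.OW_DETR_ON True
```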

# 📝 Citation

If you think this work is helpful for your project, please give it a star and a citation. We sincerely appreciate your acknowledgment.

```BibTeX
@InProceedings{li2023novel,
    title={Novel Scenes \& Classes: Towards Adaptive Open-set Object Detection},
    author={Li, Wuyang and Guo, Xiaoqing and Yuan, Yixuan},
    booktitle={ICCV},
    year={2023}
}
```

Relevant project:

Exploring a similar issue for the classification task. [[link]](https://openaccess.thecvf.com/content/CVPR2023/html/Li_Adjustment_and_Alignment_for_Unbiased_Open_Set_Domain_Adaptation_CVPR_2023_paper.html)

```BibTeX
@InProceedings{Li_2023_CVPR,
    author = {Li, Wuyang and Liu, Jie and Han, Bo and Yuan, Yixuan},
    title = {Adjustment and Alignment for Unbiased Open Set Domain Adaptation},
    booktitle = {CVPR},
    year = {2023},
}
```

# 🤞 Acknowledgements

We greatly appreciate the tremendous effort behind the following works.

- This work is based on the DAOD framework [AQT](https://github.com/weii41392/AQT).
- Our work is highly inspired by [OW-DETR](https://github.com/akshitac8/OW-DETR) and [OpenDet](https://github.com/csuhan/opendet2).
- The implementation of the basic detector is based on [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR).

# 📒 Abstract

Domain Adaptive Object Detection (DAOD) transfers an object detector to a novel domain free of labels. However, in the real world, besides encountering novel scenes, novel domains always contain novel-class objects de facto, which are ignored in existing research. Thus, we formulate and study a more practical setting, Adaptive Open-set Object Detection (AOOD), considering both novel scenes and classes. Directly combining off-the-shelf cross-domain and open-set approaches is sub-optimal since their low-order dependence, such as the confidence score, is insufficient for AOOD with two dimensions of novel information. To address this, we propose a novel Structured Motif Matching (SOMA) framework for AOOD, which models the high-order relation with motifs, i.e., statistically significant subgraphs, and formulates the AOOD solution as motif matching to learn with high-order patterns. In a nutshell, SOMA consists of Structure-aware Novel-class Learning (SNL) and Structure-aware Transfer Learning (STL). As for SNL, we establish an instance-oriented graph to capture the class-independent object features hidden in different base classes. Then, a high-order metric is proposed to match the most significant motif as high-order patterns, serving for motif-guided novel-class learning. In STL, we set up a semantic-oriented graph to model the class-dependent relation across domains, and match unlabelled objects with high-order motifs to align the cross-domain distribution with structural awareness. Extensive experiments demonstrate that the proposed SOMA achieves state-of-the-art performance.