
Commit 41c11cb

initial commit

1 parent: 0ff774b

60 files changed

Lines changed: 11153 additions & 2 deletions


README.md

Lines changed: 177 additions & 2 deletions

The previous two-line README ("# SOMA" / "[ICCV' 23] Novel Scenes & Classes: Towards Adaptive Open-set Object Detection") is replaced by the following.
# [Novel Scenes & Classes: Towards Adaptive Open-set Object Detection (ICCV-23 ORAL)](assets/paper.pdf)

By [Wuyang Li](https://wymancv.github.io/wuyang.github.io/)

The paper link will be updated after CVF open access.

<div align=center>
<img src="./assets/mot.png" width="400">
</div>

Domain Adaptive Object Detection (DAOD) strongly assumes a shared class space between the two domains.

This work breaks that assumption and formulates Adaptive Open-set Object Detection (AOOD), allowing the target domain to contain novel-class objects.

The object detector is trained with base-class labels in the source domain, and aims to detect base-class objects and identify novel-class objects as unknown in the target domain.

If you have any ideas or problems you would like to discuss, feel free to reach me via [E-mail](mailto:wuyangli2-c@my.cityu.edu.hk).
# 💡 Preparation

## Step 1: Clone and Install the Project

### Clone the repository

```bash
git clone https://github.com/CityU-AIM-Group/SOMA.git
```

### Install the project following [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR)

Note that the following matches our experimental environment, which is slightly different from the official one.

```bash
# Linux, CUDA>=9.2, GCC>=5.4
# (ours) CUDA=10.2, GCC=8.4, NVIDIA V100

# Establish the conda environment
conda create -n aood python=3.7 pip
conda activate aood
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

# Compile the project
cd ./models/ops
sh ./make.sh

# Unit test (should see True for all checks)
python test.py

# NOTE: if you meet a permission-denied issue when starting the training
cd ../../
chmod -R 777 ./
```
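As an optional sanity check before moving on, you can confirm that the installed PyTorch build sees the GPU (a generic check, not specific to this repository):

```bash
# Optional: print the PyTorch version and whether CUDA is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```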
## Step 2: Download Necessary Resources

### Download pre-processed datasets (VOC format) from the following links

|                | (Foggy) Cityscapes | Pascal VOC | Clipart | BDD100K |
| :------------: | :----------------: | :--------: | :-----: | :-----: |
| Official Links | [Imgs](https://www.cityscapes-dataset.com/login/) | [Imgs+Labels](https://pjreddie.com/projects/pascal-voc-dataset-mirror/) | - | - |
| Our Links | [Labels](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/EVNAjK2JkG9ChREzzqdqJkYBLoZ_VOqkMdhWasN_BETGWw?e=fP9Ae4) | - | [Imgs+Labels](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/Edz2YcXHuStIqwM_NA7k8FMBGLeyAGQcSjdSR-vYaVx_vw?e=es6KDW) | [Imgs+Labels](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/EeiO6O36QgZKnTcUZMInACIB0dfWEg4OFyoEZnZCkibKHA?e=6byqBX) |

### Download the DINO-pretrained ResNet-50 from this [link](https://portland-my.sharepoint.com/:u:/g/personal/wuyangli2-c_my_cityu_edu_hk/EVnK9IPi91ZPuNmwpeSWGHABqhSFQK52I7xGzroXKeuyzA?e=EnlwgO)
## Step 3: Change the Paths

### Organize the data as follows

```
[DATASET_PATH]
└─ Cityscapes
   └─ AOOD_Annotations
   └─ AOOD_Main
      └─ train_source.txt
      └─ train_target.txt
      └─ val_source.txt
      └─ val_target.txt
   └─ leftImg8bit
      └─ train
      └─ val
   └─ leftImg8bit_foggy
      └─ train
      └─ val
└─ bdd_daytime
   └─ Annotations
   └─ ImageSets
   └─ JPEGImages
└─ clipart
   └─ Annotations
   └─ ImageSets
   └─ JPEGImages
└─ VOCdevkit
   └─ VOC2007
   └─ VOC2012
```
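If your datasets already live elsewhere on disk, one way to assemble this layout without copying data is via symlinks (a sketch; every path below is a placeholder):

```bash
# Assemble the expected layout with symlinks (all paths are placeholders)
mkdir -p /your/data/root
ln -s /path/to/Cityscapes  /your/data/root/Cityscapes
ln -s /path/to/bdd_daytime /your/data/root/bdd_daytime
ln -s /path/to/clipart     /your/data/root/clipart
ln -s /path/to/VOCdevkit   /your/data/root/VOCdevkit
```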
### Change the data root folder in the config files

Replace DATASET.COCO_PATH in all YAML files under [configs](configs) with your data root $DATASET_PATH, e.g., Line 22 of [soma_aood_city_to_foggy_r50.yaml](configs/soma_aood_city_to_foggy_r50.yaml).
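If you prefer to do this from the shell, something like the following can rewrite the key in every config at once (a sketch, assuming each YAML stores the path under a literal `COCO_PATH:` key):

```bash
# Point every config at your data root (assumes a literal "COCO_PATH:" key)
find configs -name '*.yaml' -exec sed -i 's|COCO_PATH:.*|COCO_PATH: /your/data/root|' {} +
```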
### Change the path of the DINO-pretrained backbone

Replace the backbone loading path at Line 107 of [backbone.py](models/backbone.py).
# 🔥 Start Training

We use two GPUs for training, with 2 source images and 2 target images per batch.

```bash
GPUS_PER_NODE=2
./tools/run_dist_launch.sh 2 python main.py --config_file {CONFIG_FILE} --opts DATASET.AOOD_SETTING 1
```

We provide the scripts used in our experiments in [run.sh](./run.sh). Settings listed after "--opts" overwrite the default config file, following the maskrcnn-benchmark convention; an example is shown below.
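For instance, pointing training at a different data root while selecting the open-set setting might look like this (a sketch; the config file and keys are the ones referenced elsewhere in this README):

```bash
# Override config keys from the command line (maskrcnn-benchmark style --opts)
GPUS_PER_NODE=2
./tools/run_dist_launch.sh 2 python main.py \
    --config_file configs/soma_aood_city_to_foggy_r50.yaml \
    --opts DATASET.AOOD_SETTING 1 DATASET.COCO_PATH /your/data/root
```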
# 📦 Well-trained models

Will be provided later.

<!-- | Source | Target | Task | mAP $_b$ | AR $_n$ | WI | AOSE | AP@75 | checkpoint |
| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
| City | Foggy | het-sem |
| City | Foggy | het-sem |
| City | Foggy | het-sem |
| City | Foggy | het-sem | -->
# 💬 Notification

- The core idea is to select informative motifs (which can be treated as a mix-up of object queries) for self-training.
- You can try the DA version of [OW-DETR](https://github.com/akshitac8/OW-DETR) in this repository by setting:
```
--opts AOOD.OW_DETR_ON True
```
- Adopting SAM to address AOOD may be a good direction.
- To visualize unknown boxes, post-processing is needed at Line 736 of [PostProcess](models/motif_detr.py).
# 📝 Citation

If you find this work helpful for your project, please consider giving it a star and a citation. We sincerely appreciate your acknowledgment.

```BibTeX
@InProceedings{li2023novel,
    title={Novel Scenes & Classes: Towards Adaptive Open-set Object Detection},
    author={Li, Wuyang and Guo, Xiaoqing and Yuan, Yixuan},
    booktitle={ICCV},
    year={2023}
}
```

Relevant project:

A related work exploring a similar issue for the classification task. [[link]](https://openaccess.thecvf.com/content/CVPR2023/html/Li_Adjustment_and_Alignment_for_Unbiased_Open_Set_Domain_Adaptation_CVPR_2023_paper.html)

```BibTeX
@InProceedings{Li_2023_CVPR,
    author = {Li, Wuyang and Liu, Jie and Han, Bo and Yuan, Yixuan},
    title = {Adjustment and Alignment for Unbiased Open Set Domain Adaptation},
    booktitle = {CVPR},
    year = {2023},
}
```
# 🤞 Acknowledgements

We greatly appreciate the tremendous effort behind the following works.

- This work is based on the DAOD framework [AQT](https://github.com/weii41392/AQT).
- Our work is highly inspired by [OW-DETR](https://github.com/akshitac8/OW-DETR) and [OpenDet](https://github.com/csuhan/opendet2).
- The implementation of the basic detector is based on [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR).
# 📒 Abstract

Domain Adaptive Object Detection (DAOD) transfers an object detector to a novel domain free of labels. However, in the real world, besides encountering novel scenes, novel domains always contain novel-class objects de facto, which are ignored in existing research. Thus, we formulate and study a more practical setting, Adaptive Open-set Object Detection (AOOD), considering both novel scenes and classes. Directly combining off-the-shelf cross-domain and open-set approaches is sub-optimal since their low-order dependence, such as the confidence score, is insufficient for AOOD, which involves two dimensions of novel information. To address this, we propose a novel Structured Motif Matching (SOMA) framework for AOOD, which models the high-order relation with motifs, i.e., statistically significant subgraphs, and formulates the AOOD solution as motif matching to learn with high-order patterns. In a nutshell, SOMA consists of Structure-aware Novel-class Learning (SNL) and Structure-aware Transfer Learning (STL). In SNL, we establish an instance-oriented graph to capture the class-independent object features hidden in different base classes. Then, a high-order metric is proposed to match the most significant motif as high-order patterns, supporting motif-guided novel-class learning. In STL, we set up a semantic-oriented graph to model the class-dependent relation across domains, and match unlabelled objects with high-order motifs to align the cross-domain distribution with structural awareness. Extensive experiments demonstrate that the proposed SOMA achieves state-of-the-art performance.

![image](./assets/overall.png)

assets/mot.png

2.49 MB (binary file)

assets/overall.png

1010 KB (binary file)

assets/paper.pdf

9.45 MB (binary file)

benchmark.py

Lines changed: 69 additions & 0 deletions
```python
# ------------------------------------------------------------------------
# Modified by Wei-Jie Huang
# ------------------------------------------------------------------------
# Deformable DETR
# Copyright (c) 2020 SenseTime. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 [see LICENSE for details]
# ------------------------------------------------------------------------

"""
Benchmark inference speed of Deformable DETR.
"""
import os
import time
import argparse

import torch

from main import get_args_parser as get_main_args_parser
from models import build_model
from datasets import build_dataset
from util.misc import nested_tensor_from_tensor_list


def get_benchmark_arg_parser():
    parser = argparse.ArgumentParser('Benchmark inference speed of Deformable DETR.')
    parser.add_argument('--num_iters', type=int, default=300, help='total iters to benchmark speed')
    parser.add_argument('--warm_iters', type=int, default=5, help='ignore first several iters that are very slow')
    parser.add_argument('--batch_size', type=int, default=1, help='batch size in inference')
    parser.add_argument('--resume', type=str, help='load the pre-trained checkpoint')
    return parser


@torch.no_grad()
def measure_average_inference_time(model, inputs, num_iters=100, warm_iters=5):
    ts = []
    for iter_ in range(num_iters):
        # synchronize before and after the forward pass so the timing
        # covers the full GPU execution, not just the kernel launch
        torch.cuda.synchronize()
        t_ = time.perf_counter()
        model(inputs)
        torch.cuda.synchronize()
        t = time.perf_counter() - t_
        # discard the first few warm-up iterations, which are very slow
        if iter_ >= warm_iters:
            ts.append(t)
    print(ts)
    return sum(ts) / len(ts)


def benchmark():
    # benchmark-specific flags; any remaining args go to the main parser
    args, _ = get_benchmark_arg_parser().parse_known_args()
    main_args = get_main_args_parser().parse_args(_)
    assert args.warm_iters < args.num_iters and args.num_iters > 0 and args.warm_iters >= 0
    assert args.batch_size > 0
    assert args.resume is None or os.path.exists(args.resume)
    dataset = build_dataset('val', main_args)
    model, _, _ = build_model(main_args)
    model.cuda()
    model.eval()
    if args.resume is not None:
        ckpt = torch.load(args.resume, map_location=lambda storage, loc: storage)
        model.load_state_dict(ckpt['model'])
    # build a fixed batch from the first validation image
    inputs = nested_tensor_from_tensor_list([dataset.__getitem__(0)[0].cuda() for _ in range(args.batch_size)])
    t = measure_average_inference_time(model, inputs, args.num_iters, args.warm_iters)
    return 1.0 / t * args.batch_size


if __name__ == '__main__':
    fps = benchmark()
    print(f'Inference Speed: {fps:.1f} FPS')
```
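Given the flags defined above, a typical invocation might look like this (the checkpoint path is a placeholder; unrecognized flags are forwarded to the main parser):

```bash
# Benchmark inference speed with a trained checkpoint (path is a placeholder)
python benchmark.py --num_iters 300 --warm_iters 5 --batch_size 1 --resume ./exps/checkpoint.pth
```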
