# [CVPR 2025] OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
This is an official PyTorch implementation of "OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels".
In the human vision system, top-down attention plays a crucial role in perception: the brain first performs an overall but rough scene analysis to extract salient cues (i.e., overview first), and then carries out a finer-grained examination to make more accurate judgments (i.e., look closely next). However, recent efforts in ConvNet design have primarily focused on increasing kernel size to obtain a larger receptive field, without considering this crucial biomimetic mechanism as a way to further improve performance. To this end, we propose a novel pure ConvNet vision backbone, termed OverLoCK, which is carefully devised from both the architecture and mixer perspectives. Specifically, we introduce a biomimetic Deep-stage Decomposition Strategy (DDS) that fuses semantically meaningful context representations into middle and deep layers by providing dynamic top-down context guidance at both the feature and kernel weight levels. To fully unleash the power of top-down context guidance, we further propose a novel Context-Mixing Dynamic Convolution (ContMix) that effectively models long-range dependencies while preserving inherent local inductive biases even when the input resolution increases, properties that are absent in previous convolutions. With support from both DDS and ContMix, OverLoCK exhibits notable performance improvements over existing methods.
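For the paper's actual ContMix, please refer to the model code. Purely as an illustration of the general idea, namely depthwise kernels generated dynamically from top-down context, here is a minimal toy sketch. Every name, shape, and design choice below (average-pooled context, a linear kernel generator, softmax-normalized kernels) is an assumption for exposition, not the repository's implementation:

```python
# Toy sketch of a context-conditioned dynamic depthwise convolution.
# NOT the paper's ContMix; all design choices here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicContextConv(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 7):
        super().__init__()
        self.k = kernel_size
        # Generate one depthwise k x k kernel per channel from pooled context.
        self.kernel_gen = nn.Linear(dim, dim * kernel_size * kernel_size)

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) local features; context: (B, C, Hc, Wc) top-down guidance
        b, c, h, w = x.shape
        ctx = F.adaptive_avg_pool2d(context, 1).flatten(1)          # (B, C)
        weight = self.kernel_gen(ctx).view(b * c, self.k * self.k)  # per-sample kernels
        weight = weight.softmax(dim=1).view(b * c, 1, self.k, self.k)
        # Fold the batch into the channel dim so each sample uses its own kernels.
        out = F.conv2d(x.reshape(1, b * c, h, w), weight,
                       padding=self.k // 2, groups=b * c)
        return out.view(b, c, h, w)
```

For example, `DynamicContextConv(64)(torch.randn(2, 64, 56, 56), torch.randn(2, 64, 14, 14))` returns a `(2, 64, 56, 56)` tensor whose kernels depend on the context input.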
## Requirements

We strongly suggest using the dependencies listed below to ensure reproducibility:
```
# Environments:
cuda==12.1
python==3.10

# Packages:
torch==2.3.1
timm==0.6.12
natten==0.17.1
depthwise_conv2d_implicit_gemm==0.0.0
```
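Before training, it can help to verify that the pinned versions above are actually active in your environment; a quick, illustrative sanity check:

```python
# Illustrative sanity check for the pinned dependencies above.
import torch
import timm
import natten  # should import cleanly if installed per the versions above

print(torch.__version__)          # expect 2.3.1
print(timm.__version__)           # expect 0.6.12
print(torch.version.cuda)         # expect 12.1
print(torch.cuda.is_available())  # training and evaluation assume a CUDA GPU
```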
## Data Preparation

Prepare ImageNet with the following folder structure. You can extract ImageNet with this script.
```
│imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......
```
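To confirm the layout matches what the data loaders expect, you can point torchvision's `ImageFolder` at each split (an illustrative check, not part of this repo):

```python
# Illustrative check that the folder layout above is readable as a dataset.
from torchvision.datasets import ImageFolder

train_set = ImageFolder('/path/to/imagenet/train')
val_set = ImageFolder('/path/to/imagenet/val')
print(len(train_set), len(train_set.classes))  # expect 1281167 images, 1000 classes
print(len(val_set), len(val_set.classes))      # expect 50000 images, 1000 classes
```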
## Main Results on ImageNet-1K

| Model | Input Size | FLOPs (G) | Params (M) | Top-1 (%) | Download |
|---|---|---|---|---|---|
| OverLoCK-XT | 224×224 | 2.6 | 16 | 82.7 | model |
| OverLoCK-T | 224×224 | 5.5 | 33 | 84.2 | model |
| OverLoCK-S | 224×224 | 9.7 | 56 | 84.8 | model |
| OverLoCK-B | 224×224 | 16.7 | 95 | 85.1 | model |
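Since `validate.py` accepts `--model overlock_{xt, t, s, b}`, the variants are presumably registered with timm. A hypothetical loading sketch, in which the `models` module name and the checkpoint layout are assumptions:

```python
# Hypothetical loading sketch; assumes importing the repo's model definitions
# registers the OverLoCK variants with timm, as validate.py's --model flag suggests.
import torch
import timm
import models  # assumed name of the repo module that defines/registers OverLoCK

model = timm.create_model('overlock_xt', num_classes=1000)
state = torch.load('/path/to/checkpoint', map_location='cpu')
model.load_state_dict(state)  # the checkpoint may instead nest weights under a key such as 'model'
model.eval()
print(sum(p.numel() for p in model.parameters()) / 1e6)  # ~16 (M) for OverLoCK-XT
```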
## Training

To train OverLoCK models on ImageNet-1K with 8 GPUs (single node), run:
```bash
bash scripts/train_xt_model.sh  # train OverLoCK-XT
bash scripts/train_t_model.sh   # train OverLoCK-T
bash scripts/train_s_model.sh   # train OverLoCK-S
bash scripts/train_b_model.sh   # train OverLoCK-B
```
## Evaluation

To evaluate OverLoCK on ImageNet-1K, run:
```bash
MODEL=overlock_xt  # overlock_{xt, t, s, b}
python3 validate.py \
  /path/to/imagenet \
  --model $MODEL -b 128 \
  --pretrained  # or --checkpoint /path/to/checkpoint
```
Code is currently being cleaned up. Please stay tuned.
## Citation

If you find this project useful for your research, please consider citing:
```bibtex
@inproceedings{lou2025overlock,
  title={OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels},
  author={Meng Lou and Yizhou Yu},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}
```
## Acknowledgment

Our implementation is mainly based on the following codebases. We gratefully thank the authors for their wonderful work.
## Contact

If you have any questions, please feel free to open an issue ❓ or contact me 📧.