[TNNLS 2025] TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
This is an official PyTorch implementation of "TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition".
TransXNet is a CNN-Transformer hybrid vision backbone that can model both global and local dynamics with a Dual Dynamic Token Mixer (D-Mixer), achieving superior performance over both CNN- and Transformer-based models.
We strongly recommend using the dependency versions listed below to ensure reproducibility:
# Environments:
cuda==11.6
python==3.8.15
# Packages:
mmcv==1.7.1
timm==0.6.12
torch==1.13.1
torchvision==0.14.1
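To confirm that an environment matches the pinned package versions above, a small check like the following may help. It is a minimal sketch and not part of this repository: the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical, and it assumes each package is installed under the distribution name listed above (for example, mmcv may instead be installed as `mmcv-full`, in which case the key would need adjusting).

```python
from importlib import metadata

# Package versions pinned in this README.
PINNED = {
    "mmcv": "1.7.1",
    "timm": "0.6.12",
    "torch": "1.13.1",
    "torchvision": "0.14.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu116" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result from `find_mismatches(PINNED)` means every listed package matches its pin; the CUDA and Python versions are not checked by this sketch.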
Prepare ImageNet with the following folder structure. You can extract ImageNet using this script.
│imagenet/
├──train/
│ ├── n01440764
│ │ ├── n01440764_10026.JPEG
│ │ ├── n01440764_10027.JPEG
│ │ ├── ......
│ ├── ......
├──val/
│ ├── n01440764
│ │ ├── ILSVRC2012_val_00000293.JPEG
│ │ ├── ILSVRC2012_val_00002138.JPEG
│ │ ├── ......
│ ├── ......
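Before launching training or evaluation, it can be useful to verify that the dataset directory follows the layout shown above. The helper below is a minimal sketch under that assumption; `check_imagenet_layout` is a hypothetical name, not a function provided by this repository, and it only checks that `train/` and `val/` exist and that each class folder holds at least one `.JPEG` file.

```python
from pathlib import Path


def check_imagenet_layout(root):
    """Return a list of problems found under an ImageNet root directory."""
    root = Path(root)
    problems = []
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split}/")
            continue
        class_dirs = [d for d in sorted(split_dir.iterdir()) if d.is_dir()]
        if not class_dirs:
            problems.append(f"no class folders in {split}/")
        for class_dir in class_dirs:
            # Each class folder (e.g. n01440764) should hold .JPEG images.
            if not any(p.suffix.upper() == ".JPEG" for p in class_dir.iterdir()):
                problems.append(f"no JPEG images in {split}/{class_dir.name}/")
    return problems


if __name__ == "__main__":
    # Placeholder path; replace with the real dataset location.
    for problem in check_imagenet_layout("/path/to/imagenet"):
        print(problem)
```

An empty list means the basic structure is in place; the sketch does not count classes or images.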
Models | Input Size | FLOPs (G) | Params (M) | Top-1 Acc.(%) | Download |
---|---|---|---|---|---|
TransXNet-T | 224x224 | 1.8 | 12.8 | 81.6 | model |
TransXNet-S | 224x224 | 4.5 | 26.9 | 83.8 | model |
TransXNet-B | 224x224 | 8.3 | 48.0 | 84.6 | model |
To train TransXNet models on ImageNet-1K with 8 GPUs (single node), run:
bash scripts/train_tiny.sh # train TransXNet-T
bash scripts/train_small.sh # train TransXNet-S
bash scripts/train_base.sh # train TransXNet-B
To evaluate TransXNet on ImageNet-1K, run:
MODEL=transxnet_t # transxnet_{t, s, b}
python3 validate.py \
/path/to/imagenet \
--model $MODEL -b 128 \
--pretrained # or --checkpoint /path/to/checkpoint
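For readers unfamiliar with the metric in the results table, Top-1 accuracy is the percentage of validation images whose highest-scoring class equals the ground-truth label. The plain-Python function below is an illustrative sketch of that definition only; `top1_accuracy` is a hypothetical name and this is not the code used by `validate.py`.

```python
def top1_accuracy(logits, labels):
    """Percentage of samples whose arg-max class equals the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)
```

Here `logits` is a list of per-class score lists, one per image, and `labels` is the matching list of integer class indices.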
If you find this project useful for your research, please consider citing:
@article{lou2023transxnet,
title={TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition},
author={Meng Lou and Shu Zhang and Hong-Yu Zhou and Chuan Wu and Sibei Yang and Yizhou Yu},
journal={IEEE Transactions on Neural Networks and Learning Systems},
year={2025}
}
Our implementation is mainly based on the following codebases. We sincerely thank the authors for their wonderful work.
If you have any questions, please feel free to create issues or contact me at [email protected].