
Strategy B3 #12

Open
june6423 opened this issue Aug 6, 2024 · 5 comments

Comments

@june6423

june6423 commented Aug 6, 2024

Greetings!

I read your paper with great interest and am trying to reproduce some of your experiments.

I want to reproduce your vanilla KD setting using strategies B1, B2, and B3 from your DIST_KD paper.

I found the B1 and B2 strategies in your strategies folder, but I couldn't find the B3 setting.

configs/strategies/deit/deit_tiny.yaml appears to be B3, but I'm not sure, which leaves me with a question.

Could you share the B3 setting for vanilla KD with temperature 4?

@hunto
Owner

hunto commented Aug 6, 2024

Hi @june6423 ,

Our B3 experiment on Swin Transformer was implemented on the original Swin-Transformer training code, so there's no B3 config in this repo.

Alternatively, if you want to implement B3 in this repo, the strategy is similar to deit_tiny; you can use the following config for KD (T=4):

```yaml
aa: rand-m9-mstd0.5
batch_size: 128 # x 8 gpus = 1024bs
color_jitter: 0.4
decay_by_epoch: false
decay_epochs: 3
decay_rate: 0.967
# dropout
drop: 0.0
drop_path_rate: 0.2

epochs: 300
log_interval: 50
lr: 1.e-3
min_lr: 5.0e-06
model_ema: False
model_ema_decay: 0.999
momentum: 0.9
opt: adamw
opt_betas: null
opt_eps: 1.0e-08
clip_grad_norm: true
clip_grad_max_norm: 5.0

interpolation: 'bicubic'

# random erase
remode: pixel
reprob: 0.25

# mixup
mixup: 0.8
cutmix: 1.0
mixup_prob: 1.0
mixup_switch_prob: 0.5
mixup_mode: 'batch'

sched: cosine
seed: 42
warmup_epochs: 20
warmup_lr: 5.e-7
weight_decay: 0.04
workers: 16

# kd
kd: 'kd'
ori_loss_weight: 1.
kd_loss_weight: 1.
teacher_model: 'timm_swin_large_patch4_window7_224'
teacher_pretrained: True
```
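For reference, vanilla KD with temperature 4 (the `kd: 'kd'` setting above) conventionally refers to the Hinton et al. softened-softmax KL loss. A minimal PyTorch sketch of that loss (the function name is illustrative, not the repo's actual API):

```python
import torch
import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, T=4.0):
    """Hinton-style KD loss: KL divergence between temperature-softened
    distributions, scaled by T^2 to keep gradient magnitudes comparable."""
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```

With `kd_loss_weight: 1.` and `ori_loss_weight: 1.`, the total objective would be this KD term plus the ordinary cross-entropy on the labels, each weighted by 1.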

@june6423
Author

june6423 commented Aug 7, 2024

Thanks a lot!

Now I want to reproduce the results of other KD methods, including RKD and CRD (I am working on Table 5 of the DIST_KD paper, CIFAR-100).

However, I couldn't find the training configs and code for training from scratch or for the other KD methods.

I am working with image_classification_sota at commit d9662f7.

I am wondering whether code for these settings has already been published, or whether I should implement them myself.

Thanks for your effort.

@june6423 june6423 closed this as completed Aug 7, 2024
@june6423 june6423 reopened this Aug 7, 2024
@Malaika68

Hi @june6423 , how did you manage to get the data for the meta folder for ImageNet?

@june6423
Author

june6423 commented Feb 3, 2025

> Hi @june6423 , how did you manage to get the data from meta folder for ImageNet?

I made the metadata files myself.

Create train.txt and val.txt in the data/imagenet/meta folder.

Here's an example of train.txt (file path and class index):

```
image/n01440764/n01440764_10026.JPEG 0
image/n01440764/n01440764_10027.JPEG 0
```

@Malaika68

Malaika68 commented Feb 4, 2025

Hi @june6423, thanks for the help. I did that, but my validation accuracy is 0. Did you also face this issue? I am trying to distill knowledge from ResNet-34 to ResNet-18.
