Commit
Merge pull request #134 from RobustBench/add_models_2
Add models from Singh2023Revisiting
fra31 authored Mar 28, 2023
2 parents a3b71ff + 1685530 commit 9585c1b
Showing 11 changed files with 308 additions and 11 deletions.
24 changes: 15 additions & 9 deletions README.md
@@ -418,15 +418,21 @@ In order to use the models from the Model Zoo, you can find all available model

| <sub>#</sub> | <sub>Model ID</sub> | <sub>Paper</sub> | <sub>Clean accuracy</sub> | <sub>Robust accuracy</sub> | <sub>Architecture</sub> | <sub>Venue</sub> |
|:---:|---|---|:---:|:---:|:---:|:---:|
| <sub>**1**</sub> | <sub><sup>**Singh2023Revisiting_ConvNeXt-L-ConvStem**</sup></sub> | <sub>*[Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models](https://arxiv.org/abs/2303.01870)*</sub> | <sub>77.00%</sub> | <sub>57.70%</sub> | <sub>ConvNeXt-L + ConvStem</sub> | <sub>arXiv, Mar 2023</sub> |
| <sub>**2**</sub> | <sub><sup>**Singh2023Revisiting_ConvNeXt-B-ConvStem**</sup></sub> | <sub>*[Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models](https://arxiv.org/abs/2303.01870)*</sub> | <sub>75.90%</sub> | <sub>56.14%</sub> | <sub>ConvNeXt-B + ConvStem</sub> | <sub>arXiv, Mar 2023</sub> |
| <sub>**3**</sub> | <sub><sup>**Singh2023Revisiting_ViT-B-ConvStem**</sup></sub> | <sub>*[Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models](https://arxiv.org/abs/2303.01870)*</sub> | <sub>76.30%</sub> | <sub>54.66%</sub> | <sub>ViT-B + ConvStem</sub> | <sub>arXiv, Mar 2023</sub> |
| <sub>**4**</sub> | <sub><sup>**Singh2023Revisiting_ConvNeXt-S-ConvStem**</sup></sub> | <sub>*[Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models](https://arxiv.org/abs/2303.01870)*</sub> | <sub>74.10%</sub> | <sub>52.42%</sub> | <sub>ConvNeXt-S + ConvStem</sub> | <sub>arXiv, Mar 2023</sub> |
| <sub>**5**</sub> | <sub><sup>**Singh2023Revisiting_ConvNeXt-T-ConvStem**</sup></sub> | <sub>*[Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models](https://arxiv.org/abs/2303.01870)*</sub> | <sub>72.72%</sub> | <sub>49.46%</sub> | <sub>ConvNeXt-T + ConvStem</sub> | <sub>arXiv, Mar 2023</sub> |
| <sub>**6**</sub> | <sub><sup>**Singh2023Revisiting_ViT-S-ConvStem**</sup></sub> | <sub>*[Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models](https://arxiv.org/abs/2303.01870)*</sub> | <sub>72.56%</sub> | <sub>48.08%</sub> | <sub>ViT-S + ConvStem</sub> | <sub>arXiv, Mar 2023</sub> |
| <sub>**7**</sub> | <sub><sup>**Debenedetti2022Light_XCiT-L12**</sup></sub> | <sub>*[A Light Recipe to Train Robust Vision Transformers](https://arxiv.org/abs/2209.07399)*</sub> | <sub>73.76%</sub> | <sub>47.60%</sub> | <sub>XCiT-L12</sub> | <sub>arXiv, Sep 2022</sub> |
| <sub>**8**</sub> | <sub><sup>**Debenedetti2022Light_XCiT-M12**</sup></sub> | <sub>*[A Light Recipe to Train Robust Vision Transformers](https://arxiv.org/abs/2209.07399)*</sub> | <sub>74.04%</sub> | <sub>45.24%</sub> | <sub>XCiT-M12</sub> | <sub>arXiv, Sep 2022</sub> |
| <sub>**9**</sub> | <sub><sup>**Debenedetti2022Light_XCiT-S12**</sup></sub> | <sub>*[A Light Recipe to Train Robust Vision Transformers](https://arxiv.org/abs/2209.07399)*</sub> | <sub>72.34%</sub> | <sub>41.78%</sub> | <sub>XCiT-S12</sub> | <sub>arXiv, Sep 2022</sub> |
| <sub>**10**</sub> | <sub><sup>**Salman2020Do_50_2**</sup></sub> | <sub>*[Do Adversarially Robust ImageNet Models Transfer Better?](https://arxiv.org/abs/2007.08489)*</sub> | <sub>68.46%</sub> | <sub>38.14%</sub> | <sub>WideResNet-50-2</sub> | <sub>NeurIPS 2020</sub> |
| <sub>**11**</sub> | <sub><sup>**Salman2020Do_R50**</sup></sub> | <sub>*[Do Adversarially Robust ImageNet Models Transfer Better?](https://arxiv.org/abs/2007.08489)*</sub> | <sub>64.02%</sub> | <sub>34.96%</sub> | <sub>ResNet-50</sub> | <sub>NeurIPS 2020</sub> |
| <sub>**12**</sub> | <sub><sup>**Engstrom2019Robustness**</sup></sub> | <sub>*[Robustness library](https://github.com/MadryLab/robustness)*</sub> | <sub>62.56%</sub> | <sub>29.22%</sub> | <sub>ResNet-50</sub> | <sub>GitHub,<br>Oct 2019</sub> |
| <sub>**13**</sub> | <sub><sup>**Wong2020Fast**</sup></sub> | <sub>*[Fast is better than free: Revisiting adversarial training](https://arxiv.org/abs/2001.03994)*</sub> | <sub>55.62%</sub> | <sub>26.24%</sub> | <sub>ResNet-50</sub> | <sub>ICLR 2020</sub> |
| <sub>**14**</sub> | <sub><sup>**Salman2020Do_R18**</sup></sub> | <sub>*[Do Adversarially Robust ImageNet Models Transfer Better?](https://arxiv.org/abs/2007.08489)*</sub> | <sub>52.92%</sub> | <sub>25.32%</sub> | <sub>ResNet-18</sub> | <sub>NeurIPS 2020</sub> |
| <sub>**15**</sub> | <sub><sup>**Standard_R50**</sup></sub> | <sub>*[Standardly trained model](https://github.com/RobustBench/robustbench/)*</sub> | <sub>76.52%</sub> | <sub>0.00%</sub> | <sub>ResNet-50</sub> | <sub>N/A</sub> |

#### Corruptions (ImageNet-C & ImageNet-3DCC)

@@ -0,0 +1,15 @@
{
"link": "https://arxiv.org/abs/2303.01870",
"name": "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models",
"authors": "Naman D Singh, Francesco Croce, Matthias Hein",
"additional_data": false,
"number_forward_passes": 1,
"dataset": "imagenet",
"venue": "arXiv, Mar 2023",
"architecture": "ConvNeXt-B + ConvStem",
"eps": "4/255",
"clean_acc": "75.90",
"reported": "56.14",
"autoattack_acc": "56.14",
"unreliable": false
}
@@ -0,0 +1,15 @@
{
"link": "https://arxiv.org/abs/2303.01870",
"name": "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models",
"authors": "Naman D Singh, Francesco Croce, Matthias Hein",
"additional_data": false,
"number_forward_passes": 1,
"dataset": "imagenet",
"venue": "arXiv, Mar 2023",
"architecture": "ConvNeXt-L + ConvStem",
"eps": "4/255",
"clean_acc": "77.00",
"reported": "57.70",
"autoattack_acc": "57.70",
"unreliable": false
}
@@ -0,0 +1,15 @@
{
"link": "https://arxiv.org/abs/2303.01870",
"name": "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models",
"authors": "Naman D Singh, Francesco Croce, Matthias Hein",
"additional_data": false,
"number_forward_passes": 1,
"dataset": "imagenet",
"venue": "arXiv, Mar 2023",
"architecture": "ConvNeXt-S + ConvStem",
"eps": "4/255",
"clean_acc": "74.10",
"reported": "52.42",
"autoattack_acc": "52.42",
"unreliable": false
}
@@ -0,0 +1,15 @@
{
"link": "https://arxiv.org/abs/2303.01870",
"name": "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models",
"authors": "Naman D Singh, Francesco Croce, Matthias Hein",
"additional_data": false,
"number_forward_passes": 1,
"dataset": "imagenet",
"venue": "arXiv, Mar 2023",
"architecture": "ConvNeXt-T + ConvStem",
"eps": "4/255",
"clean_acc": "72.72",
"reported": "49.46",
"autoattack_acc": "49.46",
"unreliable": false
}
15 changes: 15 additions & 0 deletions model_info/imagenet/Linf/Singh2023Revisiting_ViT-B-ConvStem.json
@@ -0,0 +1,15 @@
{
"link": "https://arxiv.org/abs/2303.01870",
"name": "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models",
"authors": "Naman D Singh, Francesco Croce, Matthias Hein",
"additional_data": false,
"number_forward_passes": 1,
"dataset": "imagenet",
"venue": "arXiv, Mar 2023",
"architecture": "ViT-B + ConvStem",
"eps": "4/255",
"clean_acc": "76.30",
"reported": "54.66",
"autoattack_acc": "54.66",
"unreliable": false
}
15 changes: 15 additions & 0 deletions model_info/imagenet/Linf/Singh2023Revisiting_ViT-S-ConvStem.json
@@ -0,0 +1,15 @@
{
"link": "https://arxiv.org/abs/2303.01870",
"name": "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models",
"authors": "Naman D Singh, Francesco Croce, Matthias Hein",
"additional_data": false,
"number_forward_passes": 1,
"dataset": "imagenet",
"venue": "arXiv, Mar 2023",
"architecture": "ViT-S + ConvStem",
"eps": "4/255",
"clean_acc": "72.56",
"reported": "48.08",
"autoattack_acc": "48.08",
"unreliable": false
}
8 changes: 8 additions & 0 deletions robustbench/data.py
@@ -28,6 +28,14 @@
transforms.ToTensor()]),
None:
transforms.Compose([transforms.ToTensor()]),
'BicubicRes256Crop224':
transforms.Compose([
transforms.Resize(
256,
interpolation=transforms.InterpolationMode("bicubic")),
transforms.CenterCrop(224),
transforms.ToTensor()
])
}


155 changes: 155 additions & 0 deletions robustbench/model_zoo/architectures/convstem_models.py
@@ -0,0 +1,155 @@
"""Definition of ConvStem models as in https://arxiv.org/abs/2303.01870."""

import torch
import torch.nn as nn
import torch.nn.functional as F

import timm
from timm.models import create_model

from robustbench.model_zoo.architectures.utils_architectures import normalize_model


IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


class LayerNorm(nn.Module):
r"""LayerNorm supporting two data formats: channels_last or channels_first
(the default here). channels_last expects inputs of shape
(batch_size, height, width, channels), while channels_first expects
(batch_size, channels, height, width).
Adapted from https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py.
"""
def __init__(self, normalized_shape, eps=1e-6, data_format="channels_first"):
super().__init__()
self.weight = nn.Parameter(torch.ones(normalized_shape))
self.bias = nn.Parameter(torch.zeros(normalized_shape))
self.eps = eps
self.data_format = data_format
if self.data_format not in ["channels_last", "channels_first"]:
raise NotImplementedError
self.normalized_shape = (normalized_shape, )

def forward(self, x):
if self.data_format == "channels_last":
return F.layer_norm(x, self.normalized_shape, self.weight, self.bias, self.eps)
elif self.data_format == "channels_first":
u = x.mean(1, keepdim=True)
s = (x - u).pow(2).mean(1, keepdim=True)
x = (x - u) / torch.sqrt(s + self.eps)
x = self.weight[:, None, None] * x + self.bias[:, None, None]
return x
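To see that the `channels_first` branch is simply a per-pixel layer norm over the channel dimension, here is a small equivalence check (a sketch using a standalone copy of that branch) against `F.layer_norm` applied in `channels_last` layout:

```python
import torch
import torch.nn.functional as F

def ln_channels_first(x, weight, bias, eps=1e-6):
    # standalone copy of the channels_first branch above
    u = x.mean(1, keepdim=True)
    s = (x - u).pow(2).mean(1, keepdim=True)
    x = (x - u) / torch.sqrt(s + eps)
    return weight[:, None, None] * x + bias[:, None, None]

torch.manual_seed(0)
x = torch.randn(2, 8, 4, 4)                # (batch, channels, height, width)
w, b = torch.rand(8) + 0.5, torch.rand(8)
out = ln_channels_first(x, w, b)
# reference: move channels last, normalize over them, move back
ref = F.layer_norm(x.permute(0, 2, 3, 1), (8,), w, b, 1e-6).permute(0, 3, 1, 2)
assert torch.allclose(out, ref, atol=1e-5)
```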


class ConvBlock(nn.Module):
expansion = 1
def __init__(self, siz=48, end_siz=8, fin_dim=384):
super(ConvBlock, self).__init__()
self.planes = siz
fin_dim = self.planes * end_siz if fin_dim != 432 else 432
self.stem = nn.Sequential(nn.Conv2d(3, self.planes, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes, data_format="channels_first"),
nn.GELU(),
nn.Conv2d(self.planes, self.planes*2, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes*2, data_format="channels_first"),
nn.GELU(),
nn.Conv2d(self.planes*2, self.planes*4, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes*4, data_format="channels_first"),
nn.GELU(),
nn.Conv2d(self.planes*4, self.planes*8, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes*8, data_format="channels_first"),
nn.GELU(),
nn.Conv2d(self.planes*8, fin_dim, kernel_size=1, stride=1, padding=0)
)
def forward(self, x):
out = self.stem(x)
return out
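The four stride-2 convolutions in `ConvBlock.stem` downsample a 224×224 input by 16× overall, making this stem a drop-in replacement for a ViT 16×16 patch embedding. The arithmetic can be checked without instantiating the model (assuming the default 224×224 input resolution):

```python
def conv2d_out(size, kernel=3, stride=2, padding=1):
    """Output spatial size of a square Conv2d."""
    return (size + 2 * padding - kernel) // stride + 1

size = 224
for _ in range(4):           # four stride-2 3x3 convs in ConvBlock.stem
    size = conv2d_out(size)
# the final 1x1 stride-1 conv leaves the spatial size unchanged
assert size == 14            # 14 * 14 = 196 tokens, matching a /16 patch grid
```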


class ConvBlock3(nn.Module):
# expansion = 1
def __init__(self, siz=64):
super(ConvBlock3, self).__init__()
self.planes = siz

self.stem = nn.Sequential(nn.Conv2d(3, self.planes, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes, data_format="channels_first"),
nn.GELU(),
nn.Conv2d(self.planes, int(self.planes*1.5), kernel_size=3, stride=2, padding=1),
LayerNorm(int(self.planes*1.5), data_format="channels_first"),
nn.GELU(),
nn.Conv2d(int(self.planes*1.5), self.planes*2, kernel_size=3, stride=1, padding=1),
LayerNorm(self.planes*2, data_format="channels_first"),
nn.GELU()
)

def forward(self, x):
out = self.stem(x)
return out
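By contrast, `ConvBlock3` only downsamples by 4× (two stride-2 convs, then stride 1), matching a ConvNeXt stem that feeds 56×56 features into the first stage; with `siz=64` the channel progression ends at the 128 channels ConvNeXt-B's stage 0 expects. A quick check, under the same 224×224 assumption as above:

```python
siz = 64
channels = [3, siz, int(siz * 1.5), siz * 2]  # conv widths in ConvBlock3.stem
strides = [2, 2, 1]

size = 224
for s in strides:                             # 3x3 convs with padding 1
    size = (size + 2 * 1 - 3) // s + 1
assert size == 56                             # only 4x spatial downsampling
assert channels[-1] == 128                    # ConvNeXt-B stage-0 width
```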


class ConvBlock1(nn.Module):
def __init__(self, siz=48, end_siz=8, fin_dim=384):
super(ConvBlock1, self).__init__()
self.planes = siz

fin_dim = self.planes * end_siz if fin_dim is None else 432  # note: unused by this stem
self.stem = nn.Sequential(nn.Conv2d(3, self.planes, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes, data_format="channels_first"),
nn.GELU(),
nn.Conv2d(self.planes, self.planes*2, kernel_size=3, stride=2, padding=1),
LayerNorm(self.planes*2, data_format="channels_first"),
nn.GELU()
)

def forward(self, x):
out = self.stem(x)
return out


def get_convstem_models(modelname, pretrained=False):
"""Initialize models with ConvStem."""

if modelname == 'convnext_t_cvst':
model = timm.models.convnext.convnext_tiny(pretrained=pretrained)
model.stem = ConvBlock1(48, end_siz=8)

elif modelname == "convnext_s_cvst":
model = timm.models.convnext.convnext_small(pretrained=pretrained)
model.stem = ConvBlock1(48, end_siz=8)

elif modelname == "convnext_b_cvst":
model_args = dict(depths=[3, 3, 27, 3], dims=[128, 256, 512, 1024])
model = timm.models.convnext._create_convnext(
'convnext_base.fb_in1k', pretrained=pretrained, **model_args)
model.stem = ConvBlock3(64)

elif modelname == "convnext_l_cvst":
model = timm.models.convnext_large(pretrained=pretrained)
model.stem = ConvBlock3(96)

elif modelname == 'vit_s_cvst':
model = create_model('deit_small_patch16_224', pretrained=pretrained)
model.patch_embed.proj = ConvBlock(48, end_siz=8)
model = normalize_model(model, IMAGENET_MEAN, IMAGENET_STD)

elif modelname == 'vit_b_cvst':
model = timm.models.vision_transformer.vit_base_patch16_224(pretrained=pretrained)
model.patch_embed.proj = ConvBlock(48, end_siz=16, fin_dim=None)

else:
raise ValueError(f'Invalid model name: {modelname}.')

return model

31 changes: 31 additions & 0 deletions robustbench/model_zoo/imagenet.py
@@ -6,6 +6,7 @@
from robustbench.model_zoo.enums import ThreatModel
from robustbench.model_zoo.architectures.utils_architectures import normalize_model
from robustbench.model_zoo.architectures import xcit
from robustbench.model_zoo.architectures.convstem_models import get_convstem_models


mu = (0.485, 0.456, 0.406)
@@ -62,6 +63,36 @@
'gdrive_id':
None
}),
('Singh2023Revisiting_ViT-S-ConvStem', {
'model': lambda: get_convstem_models('vit_s_cvst'),
'gdrive_id': '1-1sUYXnj6bDXacIKI3KKqn4rlkmL-ZI2',
'preprocessing': 'BicubicRes256Crop224'
}),
('Singh2023Revisiting_ViT-B-ConvStem', {
'model': lambda: get_convstem_models('vit_b_cvst'),
'gdrive_id': '1-JBbfi_eH3tKMXObvPPHprrZae0RiQGT',
'preprocessing': 'BicubicRes256Crop224'
}),
('Singh2023Revisiting_ConvNeXt-T-ConvStem', {
'model': lambda: get_convstem_models('convnext_t_cvst'),
'gdrive_id': '1-FjtOF6LJ3-bf4VezsmWwncCxYSx-USP',
'preprocessing': 'BicubicRes256Crop224'
}),
('Singh2023Revisiting_ConvNeXt-S-ConvStem', {
'model': lambda: get_convstem_models('convnext_s_cvst'),
'gdrive_id': '1-ZrMYajCCnrtV4oT0wa3qJJoQy1nUSnL',
'preprocessing': 'BicubicRes256Crop224'
}),
('Singh2023Revisiting_ConvNeXt-B-ConvStem', {
'model': lambda: get_convstem_models('convnext_b_cvst'),
'gdrive_id': '1-lE-waaVvfL7lgBrydmZIM9UJimmHnVe',
'preprocessing': 'BicubicRes256Crop224'
}),
('Singh2023Revisiting_ConvNeXt-L-ConvStem', {
'model': lambda: get_convstem_models('convnext_l_cvst'),
'gdrive_id': '10-YOVdM2EQjHemSi9x2H44qKRSOXVQmh',
'preprocessing': 'BicubicRes256Crop224'
}),
])
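The new entries follow the zoo's lazy-construction pattern: each `'model'` value is a zero-argument lambda, so nothing is built (and no checkpoint is fetched via the `gdrive_id`) until the entry is actually looked up. A minimal standalone sketch of the pattern, with a hypothetical stand-in constructor in place of `get_convstem_models`:

```python
from collections import OrderedDict

calls = []

def build_vit_s_cvst():
    # hypothetical stand-in for get_convstem_models('vit_s_cvst')
    calls.append("built")
    return {"arch": "ViT-S + ConvStem"}

zoo = OrderedDict([
    ("Singh2023Revisiting_ViT-S-ConvStem", {
        "model": build_vit_s_cvst,
        "preprocessing": "BicubicRes256Crop224",
    }),
])

assert calls == []                    # registering the entry builds nothing
model = zoo["Singh2023Revisiting_ViT-S-ConvStem"]["model"]()
assert calls == ["built"]             # construction deferred to first use
assert model["arch"] == "ViT-S + ConvStem"
```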

common_corruptions = OrderedDict(