
the summary of AlexNet and VGGNet #14

Open
maiff opened this issue Nov 21, 2018 · 0 comments

maiff commented Nov 21, 2018

the summary of AlexNet and VGGNet

AlexNet VGGNet

AlexNet

AlexNet has five convolutional layers followed by three fully connected layers, roughly like this:
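
A rough PyTorch sketch of that layout (my own single-tower illustration, not the paper's two-GPU implementation; LRN layers are omitted and the padding of the first conv is adjusted so a 224x224 input works):

```python
import torch.nn as nn

# Sketch of the AlexNet layout: 5 conv layers + 3 fully connected layers.
alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(inplace=True),
    nn.Linear(4096, 1000),
)
```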

using an activation function called ReLU, max(0, x), which is non-saturating but not zero-centered; the initialization is crucial.
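
A tiny NumPy illustration of both points (my own example, not from the paper): ReLU does not saturate for large positive inputs, but its outputs are never negative, so they cannot be zero-mean.

```python
import numpy as np

relu = lambda x: np.maximum(0.0, x)   # max(0, x)

x = np.random.randn(100_000)          # zero-mean inputs
y = relu(x)

print(y.max())    # tracks the largest input: no saturation on the positive side
print(y.mean())   # ~0.4, strictly positive: ReLU outputs are not zero-centered
```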

using multiple GPUs by splitting the net across them, as in the pic

using local response normalization
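
In PyTorch this corresponds to `nn.LocalResponseNorm`; the constants below are the ones reported in the AlexNet paper (n = 5, alpha = 1e-4, beta = 0.75, k = 2):

```python
import torch
import torch.nn as nn

# Local response normalization across neighbouring channels,
# with the hyperparameters used in the AlexNet paper.
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)

x = torch.randn(1, 96, 55, 55)   # activations of the first conv layer
print(lrn(x).shape)              # same shape, values rescaled per channel neighbourhood
```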

using overlapping pooling
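
Overlapping pooling just means the pooling window is larger than the stride (3x3 window with stride 2 in AlexNet), for example:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=3, stride=2)   # window 3 > stride 2, so windows overlap

x = torch.randn(1, 96, 55, 55)
print(pool(x).shape)   # torch.Size([1, 96, 27, 27])
```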

Reducing Overfitting

Data Augmentation

  1. horizontal reflections
  2. random 224*224 patches from the 256*256 images (note: (256-224)^2 * 2 = 2048 possible patches; see the sketch after this list)
  3. altering the intensities of the RGB channels
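
A torchvision sketch of the first two augmentations (my own example; the PCA-based RGB intensity change from point 3 has no built-in transform, so it is left out here):

```python
from torchvision import transforms

# Crop a random 224x224 patch out of the 256x256 training image
# and flip it horizontally half of the time.
train_augment = transforms.Compose([
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
```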

Dropout

The output of each hidden neuron is set to zero with probability 0.5.

Dropout roughly doubles the number of iterations required to converge.
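
A small sketch of what "set to zero with probability 0.5" looks like in practice (note that PyTorch uses inverted dropout, which scales the surviving units by 1/(1-p) during training):

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
drop.train()            # dropout is only active in training mode

x = torch.ones(10)
print(drop(x))          # roughly half the entries are zeroed;
                        # the survivors are scaled by 1/(1-p) = 2 (inverted dropout)
```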

Training

using SGD with a momentum of 0.9
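
In PyTorch terms this is just the SGD optimizer; the learning rate and weight decay below are the values reported in the AlexNet paper, so treat this as a sketch:

```python
import torch
import torch.nn as nn

model = nn.Linear(4096, 1000)   # stand-in; use the AlexNet sketch above in practice

# SGD with momentum 0.9; lr=0.01 and weight_decay=5e-4 follow the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
```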

VGGNet

VGGNet has several network configurations, as shown in the pic

using smaller convolutional kernels (3*3) and a deeper architecture than AlexNet
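
The point of the smaller kernels: two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution, with fewer parameters and an extra non-linearity. A quick check (my own example):

```python
import torch.nn as nn

C = 256  # channels in and out

# One 5x5 convolution vs. two stacked 3x3 convolutions (same 5x5 receptive field).
single_5x5 = nn.Conv2d(C, C, kernel_size=5, padding=2, bias=False)
stacked_3x3 = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False), nn.ReLU(inplace=True),
    nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False), nn.ReLU(inplace=True),
)

params = lambda m: sum(p.numel() for p in m.parameters())
print(params(single_5x5))   # 25 * C^2 = 1,638,400
print(params(stacked_3x3))  # 18 * C^2 = 1,179,648
```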

The training images are rescaled so that the smallest side equals the training scale S, and the training crops are randomly sampled from the rescaled image.

We consider two approaches for setting the training scale S. The first is to fix S, which corresponds to single-scale training (note that image content within the sampled crops can still represent multi-scale image statistics). In our experiments, we evaluated models trained at two fixed scales: S = 256 (which has been widely used in the prior art (Krizhevsky et al., 2012; Zeiler & Fergus, 2013; Sermanet et al., 2014)) and S = 384. Given a ConvNet configuration, we first trained the network using S = 256. To speed-up training of the S = 384 network, it was initialised with the weights pre-trained with S = 256, and we used a smaller initial learning rate of 10^-3.

The second approach to setting S is multi-scale training, where each training image is individually rescaled by randomly sampling S from a certain range [Smin, Smax] (we used Smin = 256 and Smax = 512). Since objects in images can be of different size, it is beneficial to take this into account during training. This can also be seen as training set augmentation by scale jittering, where a single model is trained to recognise objects over a wide range of scales. For speed reasons, we trained multi-scale models by fine-tuning all layers of a single-scale model with the same configuration, pre-trained with fixed S = 384.
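
A sketch of that multi-scale pre-processing (my own torchvision example, not the authors' code): per image, sample S uniformly from [256, 512], rescale so the shorter side equals S, then take a random 224x224 crop.

```python
import random
from torchvision import transforms

class RandomRescale:
    """Rescale so the shorter side equals a random S in [smin, smax] (scale jittering)."""
    def __init__(self, smin=256, smax=512):
        self.smin, self.smax = smin, smax

    def __call__(self, img):
        s = random.randint(self.smin, self.smax)      # sample the training scale S per image
        return transforms.functional.resize(img, s)   # shorter side -> S, aspect ratio kept

vgg_train_transform = transforms.Compose([
    RandomRescale(256, 512),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
```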

@maiff maiff added the blog label Nov 21, 2018