
Applying the output size formula, (W - K + 2P) / S + 1, the output size of the image should be 5x5 after max-pooling in the second convolution block.
Hence, when you pass it to nn.Linear after flattening in self.classifier, in_features should equal hidden_units * 5 * 5 instead of hidden_units * 7 * 7. Otherwise, the linear layer will raise a shape-mismatch error during training.
_Output size = (W - K + 2P) / S + 1
Where:
- W = input width/height
- K = kernel size
- P = padding
- S = stride_
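
To check the arithmetic by hand, here is a minimal sketch that applies the formula above layer by layer. The layer stack, the 28x28 input size, and hidden_units = 10 are assumptions chosen for illustration, not necessarily the model in question; substitute your own kernel size, padding, and stride values.

```python
def conv_output_size(w: int, k: int, p: int = 0, s: int = 1) -> int:
    """Output width/height of a conv or pooling layer: floor((W - K + 2P) / S) + 1."""
    return (w - k + 2 * p) // s + 1


# Hypothetical layer stack as (kernel_size, padding, stride) tuples.
# These values are assumptions for illustration only.
layers = [
    (3, 0, 1),  # conv, block 1
    (3, 0, 1),  # conv, block 1
    (2, 0, 2),  # max-pool, block 1
    (3, 0, 1),  # conv, block 2
    (2, 0, 2),  # max-pool, block 2
]

size = 28  # assumed input width/height
for k, p, s in layers:
    size = conv_output_size(size, k, p, s)

hidden_units = 10  # assumed number of channels leaving the last block
in_features = hidden_units * size * size

print(size, in_features)
```

With these assumed values the spatial size goes 28, 26, 24, 12, 10, 5, so in_features comes out as hidden_units * 5 * 5. Changing the padding to 1 on the conv layers keeps the size unchanged through each conv, which is how a 7x7 result arises instead.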