
Commit

Update README.md
sgrvinod authored Feb 13, 2020
1 parent 6e61fc6 commit 71dd0ca
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -360,15 +360,15 @@ An important distinction to make here is that I'm still supplying the ground-tru

Since I'm teacher-forcing during validation, the BLEU score measured above on the resulting captions _does not_ reflect real performance. In fact, the BLEU score is a metric designed for comparing naturally generated captions to ground-truth captions of differing length. Once batched inference is implemented, i.e. no Teacher Forcing, early-stopping with the BLEU score will be truly 'proper'.
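To make the distinction concrete, here is a minimal sketch of greedy decoding with and without Teacher Forcing. The names (`decoder.step`, `decoder.init_hidden_state`, `decoder.embedding`) are hypothetical stand-ins for illustration, not the repository's actual validation code:

```python
import torch

def decode(decoder, encoder_out, target_caption, teacher_force=True, max_len=50):
    """Greedy decoding, optionally teacher-forced.

    With teacher_force=True (as during validation here), the ground-truth
    word is fed at every step, so later predictions are conditioned on a
    perfect prefix and the resulting caption scores unrealistically well.
    """
    h, c = decoder.init_hidden_state(encoder_out)  # assumed helper
    word = target_caption[:, 0]                    # <start> token
    output = [word]
    for t in range(1, max_len):
        scores, h, c = decoder.step(decoder.embedding(word), encoder_out, h, c)
        predicted = scores.argmax(dim=1)
        # Teacher forcing: next input is the ground truth, not the prediction
        word = target_caption[:, t] if teacher_force else predicted
        output.append(predicted)
    return torch.stack(output, dim=1)
```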

-With this in mind, I used [`eval.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning/blob/master/eval.py) to compute the correct BLEU-4 scores of this model checkpoint on the validation set _without_ Teacher Forcing, at different beam sizes –
+With this in mind, I used [`eval.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning/blob/master/eval.py) to compute the correct BLEU-4 scores of this model checkpoint on the validation and test sets _without_ Teacher Forcing, at different beam sizes –

-Beam Size | Validation BLEU-4
-:---: | :---:
-1 | 29.98
-3 | 32.95
-5 | 33.17
+Beam Size | Validation BLEU-4 | Test BLEU-4 |
+:---: | :---: | :---: |
+1 | 29.98 | 30.28 |
+3 | 32.95 | 33.06 |
+5 | 33.17 | 33.29 |
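In case the role of the beam size is unclear, here is a minimal, generic beam-search sketch. It is not the repository's actual implementation; `step_fn`, which maps a partial caption to next-word log-probabilities, is an assumed interface (a real decoder would also carry hidden state and attention):

```python
def beam_search(step_fn, start_token, end_token, beam_size=3, max_len=50):
    beams = [([start_token], 0.0)]  # (token sequence, cumulative log-prob)
    completed = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            log_probs = step_fn(seq)                 # tensor of shape (vocab_size,)
            top_lp, top_ix = log_probs.topk(beam_size)
            for lp, ix in zip(top_lp.tolist(), top_ix.tolist()):
                candidates.append((seq + [ix], score + lp))
        # Keep only the beam_size best partial captions
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            (completed if seq[-1] == end_token else beams).append((seq, score))
        if not beams:
            break
    completed.extend(beams)
    return max(completed, key=lambda c: c[1])[0]  # highest-scoring caption
```

A beam size of 1 reduces to greedy decoding; larger beams keep more candidate prefixes alive, which is why the BLEU-4 scores above improve from beam size 1 to 5.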

-This is higher than the result in the paper, and could be because of how our BLEU calculators are parameterized, the fact that I used a ResNet encoder, and actually fine-tuned the encoder – even if just a little.
+The test score is higher than the result in the paper, and could be because of how our BLEU calculators are parameterized, the fact that I used a ResNet encoder, and actually fine-tuned the encoder – even if just a little.
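On the parameterization point: BLEU implementations differ in n-gram weights, smoothing, and tokenization, any of which can shift scores noticeably. A minimal corpus-level BLEU-4 computation with NLTK's `corpus_bleu` looks like this (the toy captions are made up for illustration):

```python
from nltk.translate.bleu_score import corpus_bleu

# references: list over images, each a list of tokenized ground-truth captions
# hypotheses: list over images, each one tokenized generated caption
references = [[['a', 'dog', 'runs', 'on', 'the', 'grass'],
               ['the', 'dog', 'is', 'running', 'outside']]]
hypotheses = [['a', 'dog', 'runs', 'on', 'grass']]

# Default weights (0.25, 0.25, 0.25, 0.25) give BLEU-4; changing the weights,
# smoothing function, or tokenization changes the score – hence "parameterization"
bleu4 = corpus_bleu(references, hypotheses)
print(bleu4)
```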

Also, remember – when fine-tuning during Transfer Learning, it's always better to use a learning rate considerably smaller than what was originally used to train the borrowed model. This is because the model is already quite optimized, and we don't want to change anything too quickly. I used `Adam()` for the Encoder as well, but with a learning rate of `1e-4`, which is a tenth of the default value for this optimizer.
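To make this concrete, here is a sketch of giving the two modules separate optimizers; `encoder`, `decoder`, and the decoder's learning rate are assumptions for illustration:

```python
import torch

# Fine-tune only the encoder parameters that were unfrozen
encoder_optimizer = torch.optim.Adam(
    params=filter(lambda p: p.requires_grad, encoder.parameters()),
    lr=1e-4)  # a tenth of Adam's default 1e-3, for gentle fine-tuning

decoder_optimizer = torch.optim.Adam(
    params=filter(lambda p: p.requires_grad, decoder.parameters()),
    lr=4e-4)  # trained from scratch, so a larger rate; this value is an assumption
```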

