Sepedi to English #194

Open · wants to merge 13 commits into master
1,306 changes: 0 additions & 1,306 deletions .ipynb_checkpoints/starter_notebook_into_English_training-checkpoint.ipynb

This file was deleted.

40 changes: 40 additions & 0 deletions benchmarks/nso-en/jw300-baseline/README.md
@@ -0,0 +1,40 @@
# Sepedi to English

Author: Matsobane Neo

## Data

- The JW300 Sepedi-English parallel corpus (see the download sketch below).
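
A minimal sketch of how this corpus can be fetched, using the `opus_read` tool from `opustools-pkg` (the tool the Masakhane starter notebooks use); the corpus and language codes are the standard OPUS ones, and the output file names are illustrative:

```sh
# Install the OPUS reader (assumes a pip environment).
pip install opustools-pkg

# Download the JW300 Sepedi-English bitext and write it in Moses
# format (plain text, one sentence per line, two aligned files).
opus_read -d JW300 -s nso -t en -wm moses -w jw300.nso jw300.en -q
```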

## Model

- Default Masakhane Transformer translation model.
- [Google Drive folder with the trained model](https://drive.google.com/drive/folders/1-Evp0Srf3U9LaRihNg6Rm5qM81Lq9JoT?usp=sharing)
> **Review comment (Collaborator):** Please make it accessible to everyone :)
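
As a sketch of how this baseline would be trained (not part of the PR diff itself): with JoeyNMT installed, training is launched against the config added below. The repo-relative config path is an assumption:

```sh
# Install JoeyNMT; the results log below reports version 1.3.
pip install joeynmt

# Train the transformer described by this benchmark's config.
python -m joeynmt train benchmarks/nso-en/jw300-baseline/config.yaml
```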


## Analysis

Example 1
```sh
Source: Ge ma - Mexico a setlogo a belegetšwego moo a be a sa amogele bodumedi bjo bofsa , a be a tšewa e le barapedi ba diswantšho gomme a tlaišwa o šoro .
Reference: If native-born Mexicans did not embrace the new religion , they were regarded as idolaters and were severely persecuted .
Hypothesis: When Mexico was born where he had no new religion , he was taken as a pictures and suffered severe persecution .
```

Example 2
```sh
Source: Moprofesara Rudolf Schenkel , mophologi wa tlhaselo ya tšhukudu yo go boletšwego ka yena pejana o lla ka therešo ya gore motho o itirile lenaba le nnoši leo tšhukudu e nago le lona .
Reference: Professor Rudolf Schenkel , the survivor of the rhino charge described earlier , laments the fact that man has made himself the only enemy the rhino has .
Hypothesis: Professor Rudolf Schenkel , a humble relief mentioned earlier , repeating the truth that a person is a very very river .
```

Example 3
```sh
Source: Go sa šetšwe gore o na le bana ba bakae , o se ke wa langwa go ba thušeng tseleng e išago bophelong bjo bo sa felego .
Reference: No matter how many children you have , never give up in helping them along the path to everlasting life .
Hypothesis: Whether you have a few children , you don ’ t have to help them in a way to everlasting life .
```
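
Hypotheses like the ones above come from decoding the test set, but new input can also be translated interactively. A minimal sketch, assuming the trained checkpoint sits in the config's `model_dir` and that the input has already been segmented with the same BPE model used in training (the config's `level: "bpe"` setting):

```sh
# Translate Sepedi sentences read from stdin with the trained model.
python -m joeynmt translate benchmarks/nso-en/jw300-baseline/config.yaml
```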


## Results
- BLEU dev: 28.62
- BLEU test: 34.67
85 changes: 85 additions & 0 deletions benchmarks/nso-en/jw300-baseline/config.yaml
@@ -0,0 +1,85 @@

name: "nsoen_reverse_transformer"

data:
    src: "nso"
    trg: "en"
    train: "data/nsoen/train.bpe"
    dev: "data/nsoen/dev.bpe"
    test: "data/nsoen/test.bpe"
    level: "bpe"
    lowercase: False
    max_sent_length: 100
    src_vocab: "data/nsoen/vocab.txt"
    trg_vocab: "data/nsoen/vocab.txt"

testing:
    beam_size: 5
    alpha: 1.0

training:
    #load_model: "/content/drive/My Drive/masakhane/nso-en-baseline/models/nsoen_transformer/1.ckpt" # if uncommented, load a pre-trained model from this checkpoint
    random_seed: 42
    optimizer: "adam"
    normalization: "tokens"
    adam_betas: [0.9, 0.999]
    scheduling: "plateau"       # TODO: try switching from plateau to Noam scheduling
    patience: 5                 # For plateau: decrease learning rate by decrease_factor if validation score has not improved for this many validation rounds.
    learning_rate_factor: 0.5   # factor for Noam scheduler (used with Transformer)
    learning_rate_warmup: 1000  # warmup steps for Noam scheduler (used with Transformer)
    decrease_factor: 0.7
    loss: "crossentropy"
    learning_rate: 0.0003
    learning_rate_min: 0.00000001
    weight_decay: 0.0
    label_smoothing: 0.1
    batch_size: 4096
    batch_type: "token"
    eval_batch_size: 3600
    eval_batch_type: "token"
    batch_multiplier: 1
    early_stopping_metric: "ppl"
    epochs: 2                   # TODO: Decrease for when playing around and checking if things work. Around 30 is sufficient to check if it is working at all.
    validation_freq: 1000       # TODO: Set to at least once per epoch.
    logging_freq: 100
    eval_metric: "bleu"
    model_dir: "models/nsoen_reverse_transformer"
    overwrite: True             # TODO: Set to True if you want to overwrite possibly existing models.
    shuffle: True
    use_cuda: True
    max_output_length: 100
    print_valid_sents: [0, 1, 2, 3]
    keep_last_ckpts: 3

model:
    initializer: "xavier"
    bias_initializer: "zeros"
    init_gain: 1.0
    embed_initializer: "xavier"
    embed_init_gain: 1.0
    tied_embeddings: True
    tied_softmax: True
    encoder:
        type: "transformer"
        num_layers: 6
        num_heads: 4            # TODO: Increase to 8 for larger data.
        embeddings:
            embedding_dim: 256  # TODO: Increase to 512 for larger data.
            scale: True
            dropout: 0.2
        # typically ff_size = 4 x hidden_size
        hidden_size: 256        # TODO: Increase to 512 for larger data.
        ff_size: 1024           # TODO: Increase to 2048 for larger data.
        dropout: 0.3
    decoder:
        type: "transformer"
        num_layers: 6
        num_heads: 4            # TODO: Increase to 8 for larger data.
        embeddings:
            embedding_dim: 256  # TODO: Increase to 512 for larger data.
            scale: True
            dropout: 0.2
        # typically ff_size = 4 x hidden_size
        hidden_size: 256        # TODO: Increase to 512 for larger data.
        ff_size: 1024           # TODO: Increase to 2048 for larger data.
        dropout: 0.3
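
A sketch of how the dev/test scores in results.txt (below) would be produced from this config: JoeyNMT's test mode decodes the `dev` and `test` files with the beam settings from the `testing` section. The `--output_path` flag for saving hypotheses is an assumption based on JoeyNMT 1.x's CLI:

```sh
# Decode dev and test with beam size 5 and alpha 1.0 (from the config),
# writing the hypotheses next to the model.
python -m joeynmt test benchmarks/nso-en/jw300-baseline/config.yaml \
    --output_path models/nsoen_reverse_transformer/predictions
```
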
3 changes: 3 additions & 0 deletions benchmarks/nso-en/jw300-baseline/results.txt
@@ -0,0 +1,3 @@
2021-10-14 18:06:07,082 - INFO - root - Hello! This is Joey-NMT (version 1.3).
2021-10-14 18:07:39,312 - INFO - joeynmt.prediction - dev bleu[13a]: 28.62 [Beam search decoding with beam size = 5 and alpha = 1.0]
2021-10-14 18:09:46,494 - INFO - joeynmt.prediction - test bleu[13a]: 34.67 [Beam search decoding with beam size = 5 and alpha = 1.0]
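
The `bleu[13a]` tag in these logs indicates sacrebleu's default `13a` tokenization, so the scores can be sanity-checked outside JoeyNMT. A minimal sketch, assuming detokenized hypothesis and reference files with illustrative names:

```sh
# Re-score the test hypotheses against the references with the same
# 13a tokenization that the JoeyNMT log reports above.
pip install sacrebleu
sacrebleu test.en --tokenize 13a < predictions.test
```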