FAILFINDER
A debugger and error analysis toolbox for machine translation decoders
AUTHORS
Loic Barrault
Ondrej Bojar
Jonathan Clark
Qin Gao
Kenneth Kenneth
David Marecek
Martin Popel
Dan Zeman
STARTING POINT
We don't like the outputs of our MT systems.
QUESTIONS
1) What don't we like?
- This is hard to answer!
2) What sentences are "interesting"?
- In the context of iterative system refinement, those that changed between two systems
- In the context of a system vs references, it's harder to say.
3) Why are my translations bad?
- What are the limits of my training data? (OOV on source and target side and on phrase-tables)
- What better translation could my system have given? (Oracle decoding)
- Why didn't my system translate this sentence as X? (Forced decoding + error categorization)
- Where did it all go wrong? (Search path comparison)
OOV Experiments
- A very rough estimate of reachability.
$MOSES/scripts/analysis/oov.pl
        Training        Test
        Toks    Voc     Toks    Voc
en      75M     595k    66k     9k
cs      66M     856k    56k     15k

n-grams Out of Vocabulary
        1       2       3       4
en      1.5%    13.9%   47.8%   79.1%
cs      2.2%    29.7%   69.2%   90.0%
1-grams OOV in en->cs phrase-table
en (source)   1.6%   Eastronville, ..., Héma-Québec, bed-space
cs (target)   2.3%   -, names, declined nouns
=> most of the corpus actually gets to the phrase table
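
The n-gram OOV figures above can be approximated with a few lines of Python. This is a minimal sketch, not the implementation in oov.pl: the function names are hypothetical, tokenization is plain whitespace splitting, and it assumes the rates are computed over n-gram tokens (occurrences) rather than types.

```python
from typing import Iterable, List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def oov_rate(train: Iterable[str], test: Iterable[str], n: int) -> float:
    """Fraction of test n-gram occurrences never seen in the training data.

    Assumption: whitespace tokenization, token-level (not type-level) rate.
    """
    seen: Set[Tuple[str, ...]] = set()
    for line in train:
        seen.update(ngrams(line.split(), n))
    total = 0
    unseen = 0
    for line in test:
        for gram in ngrams(line.split(), n):
            total += 1
            if gram not in seen:
                unseen += 1
    return unseen / total if total else 0.0
```

Calling oov_rate(train_lines, test_lines, n) for n = 1..4 gives one row of such a table. Higher orders are expected to show sharply higher rates, as in the figures above.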
Reachability with Moses
- Code by Lane Schwartz; in trunk.
moses -i in.txt -constraint ref.txt -f moses.ini
- Search not quite exhaustive, bounded by:
- phrase table filtration; this is OK.
- -ttable-limit; this is OK.
- -distortion-limit; this is OK.
- -translation-option-threshold
- -max-partial-trans-opt (default 10000)
- -persistent-cache-size (default 10000)
- -max-trans-opt-per-coverage (default 50!!)
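
A reachability percentage can be derived from the output of such a constrained run. The sketch below is hedged: it assumes (this README does not state it) that the decoder writes one output line per input sentence and leaves the line empty when the reference could not be produced. The function name is hypothetical.

```python
from typing import Iterable


def reachability(outputs: Iterable[str]) -> float:
    """Percentage of sentences with a non-empty constrained-decoding output.

    Assumption: one output line per input sentence, and an empty line
    means the reference was not reachable.
    """
    lines = [line.strip() for line in outputs]
    if not lines:
        return 0.0
    reached = sum(1 for line in lines if line)
    return 100.0 * reached / len(lines)
```

For example, reachability(open("out.txt")) would summarize one run, where out.txt is a placeholder for wherever the decoder output was saved.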
en<->cs Reachability
Distortion       ->cs    ->cs    ->csLEM   ->en
3                1.4     3.5     2.7       1.7
6                1.9     4.8     3.9       2.2
10               2.2     5.4     4.6       2.3
30               2.4     5.9     5.1       2.4
40               2.4     5.9     5.1       2.4
Parallel sents   126k    6M      126k      126k
BLEU             9.94    15.59   (16.94)   15.38
Tested on 2525 sents.
en->cs Reachability of Training Data
Distortion\TTable Limit   10      20      50      100     500
3                         64.2    65.0    65.7    65.7    65.7
6                         65.2    66.5    67.5    67.5    67.5
10                        73.5    75.4    77.2    77.2    77.2
30                        82.7    85.0    -       -       -
40                        82.9    85.2    -       -       -
60                        82.9    85.2    -       -       -
100                       -       85.2    -       -       -
Tested on 2k of 126k training sentences.
Linguistically Promising Factored Setup
- Two alternative paths:
form -> form(+lemma+tag)
form -> lemma+tag -generate-> form
- Fights target-side sparseness.
- Can use a large monolingual corpus for generation.
- Good:
- OOV of the generation step 1.1%
- Reachability hopefully 5% (estimated on 625 sents)
- Bad:
- BLEU 12.07±0.45 instead of 13.43±0.45
- Reachability takes 7-22 min/sent instead of 0.3 sec/sent
- MERT needed 30 iterations instead of 13
MT-OUTPUT COMPARATOR
- simple tool (written in Perl) for visualisation of differences
- between two MT-systems
- between two versions of one MT-system
- input:
- source sentences,
- reference translations,
- two MT-outputs we want to compare
- output:
- HTML file with highlighted matching sequences
- sentences are sorted according to the difference in number of n-grams matching the reference
/scripts/MToutput_comparator
http://ufal.mff.cuni.cz/~marecek/sample.html
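
The sorting criterion described above (difference in the number of n-grams matching the reference) can be sketched in Python. This is an illustration only, not the Perl tool's code: the function names are hypothetical, and both the maximum n-gram order of 4 and the clipped counting of repeated n-grams are assumptions.

```python
from collections import Counter
from typing import List, Tuple


def matching_ngrams(hyp: str, ref: str, max_n: int = 4) -> int:
    """Count hypothesis n-grams (n = 1..max_n) that also occur in the reference.

    Repeated n-grams are clipped to their count in the reference (assumption).
    """
    h, r = hyp.split(), ref.split()
    total = 0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(tuple(h[i:i + n]) for i in range(len(h) - n + 1))
        ref_counts = Counter(tuple(r[i:i + n]) for i in range(len(r) - n + 1))
        total += sum((hyp_counts & ref_counts).values())
    return total


def rank_by_difference(refs: List[str], out_a: List[str], out_b: List[str],
                       max_n: int = 4) -> List[Tuple[int, int]]:
    """Return (difference, sentence index) pairs, largest gain of B over A first."""
    diffs = [
        (matching_ngrams(b, r, max_n) - matching_ngrams(a, r, max_n), i)
        for i, (r, a, b) in enumerate(zip(refs, out_a, out_b))
    ]
    return sorted(diffs, key=lambda d: d[0], reverse=True)
```

Sentences at the top of the ranking are those where the second output matches the reference much better than the first. Those at the bottom are where it got worse. Both ends are natural places to start reading.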
SEARCH GRAPH VISUALISER
by Loic
$MOSES/scripts/analysis/sg2dot.pl
ERROR CATEGORIZER
Jonathan Clark
N-best list format...
SEARCH PATH ANALYZER
Jonathan Clark
Search Path format...