diff --git a/scripts/AutoTrainer/README.md b/scripts/AutoTrainer/README.md new file mode 100644 index 00000000..09aaf5f5 --- /dev/null +++ b/scripts/AutoTrainer/README.md @@ -0,0 +1,93 @@ +# Sphinx Training Helper +A Bash script designed to make training sphinx4 and pocketsphinx acoustic libraries faster and easier. + +# Installation +Sphinx Training Helper uses the `arecord` command during Readings Mode. Please ensure that ALSA is installed and configured properly on your machine in order to use Readings Mode. + +If you would like to use the random transcript generator, you will need the `hxnormalize`, `hxselect`, and `lynx` commands available. Installing the `html-xml-utils` and `lynx` packages should do the trick (`sudo apt-get install -y html-xml-utils lynx`). + +The CMU Sphinx toolkit should be downloaded and installed on your machine. This includes sphinxbase, pocketsphinx, and sphinx_train. + +In order to train/update the acoustic model, this script will need the following programs in the same directory: +bw, map_adapt, mllr_solve, mllr_transform, mk_s2sendump, word_align.pl + +These programs/binaries can be found where you installed sphinx_train (on Linux this should be `/usr/local/libexec/sphinxtrain`); for more information, see the tutorial on CMU Sphinx's website. Simply copy the needed executables from that directory to the same directory as the Sphinx Training Helper. +Alternatively, the included `copy-training-programs.sh` Bash script can be used to copy the needed programs into the directory. + +Additionally, the `word_align.pl` script is needed to test the effectiveness of the acoustic model adaptation. You will need to copy it from your `sphinx_train/scripts/decode` directory. As of now, the `copy-training-programs.sh` script does not copy the `word_align.pl` script. 
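As a rough sketch of the manual setup described above, the helper below reports which of the required programs are still missing from a directory. The function name `missing_tools` is hypothetical and is not part of this repository; the program names passed to it would be the ones listed above.

```shell
#!/bin/bash
# Hypothetical helper (not part of this repository): print the names of
# required training programs that are not present in the given directory.
missing_tools() {
    local dir="$1"
    shift
    local missing=()
    local name
    for name in "$@"; do
        # A program counts as present if a regular file with that name exists.
        if [ ! -f "$dir/$name" ]; then
            missing+=("$name")
        fi
    done
    # Print the missing names on one line, separated by spaces.
    echo "${missing[*]}"
}
```

For example, `missing_tools . bw map_adapt mllr_solve mllr_transform mk_s2sendump word_align.pl` should print an empty line once everything has been copied into the current directory.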
+ +# Instructions + Usage: trainer [OPTIONS] --type TYPE input_model output_model + input_model : The directory/filename of the acoustic model to base the trained acoustic model on. + output_model : The desired name of the trained acoustic model that will be created using this script. + + TYPE may be any of: + p PTM + c continuous + s semi-continuous + + + OPTIONS may be any of: + -h --help Displays this help text and exits. + -r --readings {yes|no} Enable or disable sentence reading. Disabling sentence reading means that the audio files in the working directory (as referenced by the fileids) will be used to train. + -s --sample_rate {int} Specify the sample rate for the audio files. Expand the value (i.e. 16kHz should be 16000). Default is 16000. + --map {yes|no} Enable or disable MAP weight updating. Supported in pocketsphinx and sphinx4. Default is yes. + --mllr {yes|no} Enable or disable MLLR weight updating. Currently only supported in pocketsphinx. Default is yes. + --transcript {file} Specify the transcript file for readings. (default: arctic20.transcription) + --type TYPE Specify what TYPE of acoustic model is being trained. See above for valid identifiers. + -i --test_initial {yes|no} Specify whether or not to test the initial acoustic model before adaptation. Setting this to no can help save time. Default is yes. + -f --fileids {file} Specify the fileids file for readings. (default: arctic20.fileids) + -p --pocketsphinx {yes|no} Specify whether or not you are training the model for pocketsphinx. Specifying yes will add optimizations. Default is no. Leave as "no" if using for Sphinx4 (Java). + -d --dict Specify the path to the dictionary to use. Default is "cmudict-en-us.dict". + +# Readings Mode +The so-called "Readings Mode" in this script is a simple command line interface that allows the user to read the entire transcript file line by line while recording. 
If Readings Mode is enabled, the user is shown a line from the transcript file that should be read aloud. When the user is done reading the line, they can press any key to stop recording. The user is then prompted to either use the recording or redo the recording. Once all lines from the transcript are read, the script will begin adapting the acoustic model. + +The purpose of Readings Mode is to make recording quick, simple, and painless for the user. No need to open up Audacity or a recording program and splice audio recordings; just read and press the enter key. + +Readings Mode is enabled by default as the `trainer.sh` script assumes that the user wants to record new audio files. To turn Readings Mode off, use the `-r` option. +Example: `./trainer.sh -r no -p yes --type c -f arctic20.fileids --transcript arctic20.trans input-model/ output-model/` + +# Transcription File and File IDs File +In this repository there is the `generate_transcription_file.sh` script that can be used to quickly create a transcription file and a fileids file. This script allows the user to enter sentences one at a time (pressing enter after each), which will append sentences to the generated transcript and fileids file. To change the name of the generated transcript and fileids files, you will need to change the value of the `outputName` variable. Its value is set within the first few lines of the script, so it should be very easy to find. Also, the fileids file contains references to audio file names. The audio file names generated by default look like `audio_0000`. To change the generated audio file name to `example_0000`, change the `audioFilePrefix` variable's value to `"example_"`. The script will always append a 4-character-wide, zero-padded ID number to the audio file name in the generated fileids file and in the transcript file entries. + +The CMU Sphinx website provides examples for writing transcript and file IDs files, but here are the formats anyway. 
+ +### Transcription File +A text file containing the words that will be/are spoken in an audio file. +The words should be grouped into sentences as marked by an XML-like `<s> Your words go here </s>` tag. +Following the `</s>` tag should be a space and a set of parentheses with the audio file name inside (without the extension). +For example: + + <s> hello world this is an example transcription file </s> (audiofile_0001) + <s> this is the second sentence in the transcription file </s> (audiofile_0002) + <s> we can even add a third sentence </s> (audiofile_0003) + <s> just remember to increment the file id in the parentheses </s> (audiofile_0004) + +### File IDs File +The File IDs file is simply a text file where each line contains the file name of an audio file (do not include the extension). +The file names should be listed in the same order as the transcription file. +Remember to increment any numbers identifying the audio files. +Example File IDs file: + + audiofile_0001 + audiofile_0002 + audiofile_0003 + audiofile_0004 + audiofile_0005 + +## `random_trainer.sh` +The `random_trainer.sh` script is a bit of a hack, but it is a very useful script. +The script scrapes a website that generates random sentences and uses each sentence as part of the transcript that the user reads. This allows the user to adapt the acoustic model without manually creating a fileids file or a transcript file. The script will automatically generate the fileids file and transcript file as the user goes through Readings Mode. +You will need an internet connection. + +The command also takes a third argument after the input and output model directories. +Example: `./random_trainer.sh -p yes -r yes -i no --type c input-model/ output-model/ 21` +The above example command will scrape a website for 21 random sentences and generate a transcript and fileids file using those sentences. Notice that the transcript and fileids arguments do not have to be supplied as the files are generated automatically and used immediately afterwards. 
By default the generated files will be `random.transcription` and `random.fileids`, but these can be changed by using the `--transcript` and `-f` options. + +### `-i` Option +By default, before running the adaptation commands, the trainer script will test the initial acoustic model. This is to show the user in the end how much accuracy has (or hasn't) increased. This is useful for checking that adaptation is working, but it is also a very long process and is sometimes unnecessary (sometimes you know that your initial accuracy is terrible, so why test it again?). To save some time, pass the trainer script the `-i no` option. +Example: `./trainer.sh -i no -p yes -r no --transcript arctic20.trans -f arctic20.fileids --type c input-model/ output-model/` + + +Author: Tyler Sengia (ExpandingDev, tylersengia@gmail.com) diff --git a/scripts/AutoTrainer/clean-up.sh b/scripts/AutoTrainer/clean-up.sh new file mode 100755 index 00000000..3b4b8039 --- /dev/null +++ b/scripts/AutoTrainer/clean-up.sh @@ -0,0 +1,10 @@ +#!/bin/bash +echo "This file is designed to delete un-needed files from the current directory to get a new iteration of training ready." +read -n 1 -s -p "Press any key to start..." +rm *.mfc +rm *.hyp +rm tmat_counts +rm mixw_counts +rm gauden_counts +rm mllr_matrix +echo "DONE" diff --git a/scripts/AutoTrainer/copy-training-programs.sh b/scripts/AutoTrainer/copy-training-programs.sh new file mode 100755 index 00000000..99f4684d --- /dev/null +++ b/scripts/AutoTrainer/copy-training-programs.sh @@ -0,0 +1,12 @@ +#!/bin/bash +clear +echo "Please note, this is a very hackish way of copying over the needed programs. If this program fails to find the needed files, you should really do it yourself." +echo "Press any key to continue..." 
+read -n 1 -s +cp /usr/local/libexec/sphinxtrain/bw ./bw +cp /usr/local/libexec/sphinxtrain/map_adapt ./map_adapt +cp /usr/local/libexec/sphinxtrain/mllr_transform ./mllr_transform +cp /usr/local/libexec/sphinxtrain/mllr_solve ./mllr_solve +cp /usr/local/libexec/sphinxtrain/mk_s2sendump ./mk_s2sendump +echo "DONE" +echo " " diff --git a/scripts/AutoTrainer/generate_transcription_file.sh b/scripts/AutoTrainer/generate_transcription_file.sh new file mode 100755 index 00000000..a87d78ff --- /dev/null +++ b/scripts/AutoTrainer/generate_transcription_file.sh @@ -0,0 +1,30 @@ +#!/bin/bash +# Author: Tyler Sengia (ExpandingDev, tylersengia@gmail.com) +# https://github.com/ExpandingDev + +#Filename of the generated transcription file. By default it is "generated.trans". Feel free to change this. +outputName="generated" + +# String that prefixes the audio file ID number. See below for example. Recommended to match your outputName and include an _ (underscore) to separate it from the number +# <s> blah blah blah this is a sentence </s> (audioFilePrefix0002) +audioFilePrefix="audio_" + +echo "This is the generate_transcription_file.sh program. It is a file to help create the transcription file and fileids file to train CMU Sphinx acoustic models with." +echo "You will need to enter each transcription sentence by sentence when you are prompted." +echo "Close/exit the script (Ctrl + c) when you are done entering the transcription sentences." +echo "Output will be written to $outputName.trans and $outputName.fileids" + +#Starting number for our audio files +index=0 + +while : ; do +read -e sentence + +#This is setup to pad zeros to the number until it reaches 4 chars long (Ex: 0000 0001 0002 ... 
0145) Change the %04d to %05d to pad the number to 5 chars long +f=`printf "%04d" $index` +echo "<s> $sentence </s> ($audioFilePrefix$f)" >> $outputName.trans +echo "$audioFilePrefix$f" >> $outputName.fileids +index=$((index + 1)) +done + +echo "Exited" diff --git a/scripts/AutoTrainer/random_trainer.sh b/scripts/AutoTrainer/random_trainer.sh new file mode 100755 index 00000000..ce0f8f45 --- /dev/null +++ b/scripts/AutoTrainer/random_trainer.sh @@ -0,0 +1,632 @@ +#!/bin/bash +################################################################################################ +# Sphinx Acoustic Model Trainer script +# This script is to assist in training an acoustic model for +# pocketsphinx and sphinx4. Continuous or batch models may be made. +# Different training methods may be used. +# Use trainer.sh --help for more information. +# +# This script is not associated with, or created by the creators of Sphinx CMU. +# The author of this script/program/file is Tyler Sengia. +# +# Any damage, modifications, errors, side effects, etc that this script/program/file causes +# is at the liability of the user and not the author. +# By downloading/installing/running this script you agree to these terms. +# +############################################################################################### +# +# Based on the instructions from: https://cmusphinx.github.io/wiki/tutorialadapt/ +# And instructions from: https://cmusphinx.github.io/wiki/tutorialtuning/ +# + +lineCount=21 # Number of lines to be read for the readings + +#Variables and their default values +PROMPT_FOR_READINGS="yes" +DO_READINGS="yes" +DO_TESTS="yes" +DO_TEST_INITIAL="yes" +transcriptionFile="random.transcription" +fileidsFile="random.fileids" +CREATE_SENDUMP="no" +DO_MAP="yes" +DO_MLLR="yes" +SAMPLE_RATE=16000 +LANGUAGE_MODEL="en-us.lm.bin" +DICTIONARY_FILE="cmudict-en-us.dict" +POCKET_SPHINX="no" +transcriptionLine="Error retrieving random sentence!" 
+ +confirm() { + # call with a prompt string or use a default + echo " " + read -r -p "${1:-Would you like to keep this recording? (No will start the recording over again.) [y/N]}" response + case "$response" in + [yY][eE][sS]|[yY]) + return 0 + ;; + *) + return 1 + ;; + esac +} + +doReadings() { +#clean our old transcript and fileids +rm $transcriptionFile +rm $fileidsFile + +echo "Sphinx 4 Acoustic Library Auto Trainer" +echo "--------------------------------------" +echo "INSTRUCTIONS" +echo "A series of text will be displayed. Please recite the sentences to the best of your ability." +echo "When you are finished reciting the sentence, press any key." +echo "Continue reading the sentences until you have gone through all of them." +echo "Once all the sentences have been read, please wait a few moments for the trainer to run." +echo "" +echo "There are $lineCount sentences to be read." + +for ((i=1; i < lineCount+1; i++)) +do + readSentence + clear + cd recordings + mv ../output.wav ${i}_random.wav + echo "<s> $transcriptionLine </s> (${i}_random)" >> $transcriptionFile + echo "${i}_random" >> $fileidsFile + cd ../ +done +mv recordings/$fileidsFile ./ +mv recordings/$transcriptionFile ./ +} + +askForReadings() { + # call with a prompt string or use a default + read -r -p "${1:-Would you like to use the current audio files? (No will start the process of making new recordings.) [y/N]} " response + case "$response" in + [yY][eE][sS]|[yY]) + echo "no" + ;; + *) + echo "yes" + ;; + esac +} + +readSentence() { + echo "Sentence $i" + echo "Press ENTER when ready...." + echo " " + read + v=$(curl --data "quantity=1&count=10" https://www.randomwordgenerator.org/Random/sentence_generator 2>/dev/null | hxnormalize -x | hxselect -i "p.result" | hxselect -i "b" | hxnormalize -x | lynx -stdin -dump) + transcriptionLine=${v:6} + echo $transcriptionLine + sleep 0.2 + + { arecord -c 1 -V mono -r $SAMPLE_RATE -f S16_LE output.wav & } 2>/dev/null + childId=$! 
+ echo " " + read -n 1 -s -r -p "[Press any key to finish recording]" + echo " " + { kill -9 $childId && wait; } 2>/dev/null + sleep 0.5 + if confirm ; then + return 1 + else + clear + readSentence + fi +} + +doContObservationCounts() { +echo "Accumulating observation counts..." +if [ ! -f $OUTPUT_MODEL/feature_transform ]; then +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .cont. \ + -feat 1s_c_d_dd \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . +else +echo "Found feature-transform file! Using -lda option" +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .cont. \ + -feat 1s_c_d_dd \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . \ + -lda $OUTPUT_MODEL/feature_transform +fi + +echo " " +echo "Done accumulting observation counts." +} + +doPtmObservationCounts() { +echo "Accumulating observation counts..." +if [ ! -f $OUTPUT_MODEL/feature_transform ]; then +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .ptm. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . +else +echo "Found feature-transform file! Using -lda option" +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .ptm. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . \ + -lda $OUTPUT_MODEL/feature_transform +fi + +echo " " +echo "Done accumulting observation counts." +} + +doSemiObservationCounts() { +echo "Accumulating observation counts..." +if [ ! -f $OUTPUT_MODEL/feature_transform ]; then +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .semi. 
\ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . +else +echo "Found feature-transform file! Using -lda option" +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .semi. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . \ + -lda $OUTPUT_MODEL/feature_transform +fi + +echo " " +echo "Done accumulting observation counts." +} + +makesendump() { +echo "Creating sendump file..." +./mk_s2sendump \ + -pocketsphinx $POCKET_SPHINX \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -sendumpfn $OUTPUT_MODEL/sendump +echo " " +echo "Done creating sendump file." +} + +doContMapUpdate() { +echo "Updating acoustic model files with MAP..." +./map_adapt \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .cont. \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -tmatfn $OUTPUT_MODEL/transition_matrices \ + -accumdir . \ + -mapmeanfn $OUTPUT_MODEL/means \ + -mapvarfn $OUTPUT_MODEL/variances \ + -mapmixwfn $OUTPUT_MODEL/mixture_weights \ + -maptmatfn $OUTPUT_MODEL/transition_matrices +echo " " +echo "Done updating with MAP." +} + +doPtmMapUpdate() { +echo "Updating acoustic model files with MAP..." +./map_adapt \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .ptm. \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -tmatfn $OUTPUT_MODEL/transition_matrices \ + -accumdir . \ + -mapmeanfn $OUTPUT_MODEL/means \ + -mapvarfn $OUTPUT_MODEL/variances \ + -mapmixwfn $OUTPUT_MODEL/mixture_weights \ + -maptmatfn $OUTPUT_MODEL/transition_matrices +echo " " +echo "Done updating with MAP." +} + +doSemiMapUpdate() { +echo "Updating acoustic model files with MAP..." 
+./map_adapt \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .semi. \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -tmatfn $OUTPUT_MODEL/transition_matrices \ + -accumdir . \ + -mapmeanfn $OUTPUT_MODEL/means \ + -mapvarfn $OUTPUT_MODEL/variances \ + -mapmixwfn $OUTPUT_MODEL/mixture_weights \ + -maptmatfn $OUTPUT_MODEL/transition_matrices +echo " " +echo "Done updating with MAP." +} + +domllrupdate() { +echo "Updating acoustic model with MLLR..." +./mllr_solve \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -outmllrfn mllr_matrix -accumdir . +echo "" +echo "Done updating with MLLR." +} + +convertmdef() { +echo "Converting mdef into text format..." +pocketsphinx_mdef_convert -text $OUTPUT_MODEL/mdef $OUTPUT_MODEL/mdef.txt +echo " " +echo "Done converting mdef." +} + +createAcousticFeatures() { +echo " " +echo "Creating acoustic feature files..." +sphinx_fe -argfile $OUTPUT_MODEL/feat.params -samprate $SAMPLE_RATE -c $fileidsFile -di ./recordings -do . -ei wav -eo mfc -mswav yes +echo "Done creating acoustic feature files." +echo " " +} + +testInitialModel() { +echo "Testing acoustic model before adaptations..." +pocketsphinx_batch \ + -adcin yes \ + -cepdir ./recordings \ + -cepext .wav \ + -ctl $fileidsFile \ + -lm $LANGUAGE_MODEL \ + -dict $DICTIONARY_FILE \ + -hmm $OUTPUT_MODEL \ + -hyp initial_test.hyp +echo "Finished testing acoustic model before adaptations." +} + +testFinalModel() { +echo "Testing acoustic model after adaptations..." 
+if [ $DO_MLLR = "no" ]; then +pocketsphinx_batch \ + -adcin yes \ + -cepdir ./recordings \ + -cepext .wav \ + -ctl $fileidsFile \ + -lm $LANGUAGE_MODEL \ + -dict $DICTIONARY_FILE \ + -hmm $OUTPUT_MODEL \ + -hyp final_test.hyp +else +pocketsphinx_batch \ + -adcin yes \ + -cepdir ./recordings \ + -cepext .wav \ + -ctl $fileidsFile \ + -lm $LANGUAGE_MODEL \ + -dict $DICTIONARY_FILE \ + -hmm $OUTPUT_MODEL \ + -hyp final_test.hyp \ + -mllr $OUTPUT_MODEL/mllr_matrix +fi +echo "Finished testing acoustic model after adaptations." +} + +compareTests() { +echo "================================================================" +echo "BEFORE Adaptation:" +perl -w word_align.pl $transcriptionFile initial_test.hyp +echo "================================================================" +echo "AFTER Adaptation:" +perl -w word_align.pl $transcriptionFile final_test.hyp +echo "================================================================" +echo "" +} + +displayHelp() { +cat << EOF +Sphinx Auto Trainer Script - Randomly Generated Transcript + +Author: tylersengia@gmail.com + +Usage: trainer [OPTIONS] --type TYPE input_model output_model sentence_count + input_model : The directory/filename of the acoustic model to base the trained acoustic model on. + output_model : The desired name of the trained acoustic model that will be created using this script. + sentence_count : The desired number of randomly generated sentences that the user would like to read off. + +TYPE may be any of: + p PTM + c continuous + s semi-continuous + + +OPTIONS may be any of: + -h --help Displays this help text and exits. + -r --readings {yes|no} Enable or disable sentence reading. Disabling sentence reading means that the audio files in the working directory (as referenced by the fileids) will be used to train. + -s --sample-rate {int} Specify the sample rate for the audio files. Expand the value (i.e. 16kHz should be 16000). Default is 16000. + --map {yes|no} Enable or disable MAP weight updating. 
Supported in pocketsphinx and sphinx4. Default is yes. + --mllr {yes|no} Enable or disable MLLR weight updating. Currently only supported in pocketsphinx. Default is yes. + --transcript {file} Specify the transcript file for readings. (default: random.transcription) + --type TYPE Specify what TYPE of acoustic model is being trained. See above for valid identifiers. + -i --test_initial {yes|no} Specify whether or not to test the initial acoustic model before adaptation. Setting this to no can help save time. Default is yes. + -f --fileids {file} Specify the fileids file for readings. (default: random.fileids) + -p --pocketsphinx {yes|no} Specify whether or not you are training the model for pocketsphinx. Specifying yes will add optimizations. Default is no. Leave as "no" if using for Sphinx4 (Java). + -d --dict Specify the path to the dictionary to use. Default is "cmudict-en-us.dict". + +Issues, questions or suggestions: https://github.com/ExpandingDev/SphinxTrainingHelper +EOF +exit 2 +} + +#Parsing arguments +while [[ $# -gt 0 ]] +do +key="$1" + +case $key in + -r|--readings) + PROMPT_FOR_READINGS="no" + DO_READINGS="$2" + shift # past argument + ;; + --map) + DO_MAP="$2" + shift + ;; + --mllr) + DO_MLLR="$2" + shift + ;; + -s|--sample-rate) + SAMPLE_RATE="$2" + shift + ;; + -h|--help) + displayHelp + ;; + -i|--test_initial) + DO_TEST_INITIAL="$2" + shift + ;; + -d|--dict) + DICTIONARY_FILE="$2" + shift + ;; + -p|--pocketsphinx) + POCKET_SPHINX="$2" + shift + ;; + --transcript) + transcriptionFile="$2" + shift + ;; + -f|--fileids) + fileidsFile="$2" + shift + ;; + --type) + inputType="$2" + shift + ;; + *) + INPUT_MODEL="$1" + OUTPUT_MODEL="$2" + lineCount="$3" + break + ;; +esac +shift # past argument or value +done + +#Sanitize the paths. Remove the ending / if there is one +OUTPUT_MODEL=${OUTPUT_MODEL%/} +INPUT_MODEL=${INPUT_MODEL%/} + +#Check to see if we have all of the required programs/files and error out if we don't +if [ ! 
-f bw ]; then + echo "ERROR: You are missing the 'bw' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f map_adapt ]; then + echo "ERROR: You are missing the 'map_adapt' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f mllr_solve ]; then + echo "ERROR: You are missing the 'mllr_solve' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f mllr_transform ]; then + echo "ERROR: You are missing the 'mllr_transform' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f mk_s2sendump ]; then + echo "ERROR: You are missing the 'mk_s2sendump' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f word_align.pl ]; then + echo "ERROR: You are missing the 'word_align.pl' perl script in this directory. Please copy it into this directory from sphinxtrain/scripts/decode (Extracted from the tar file. This script isn't installed, it comes straight from sphinxtrain's source files)." + exit 1 +fi + +if [ ! -f $INPUT_MODEL/mixture_weights ]; then + echo "ERROR: You are missing the 'mixture_weights' file in your input acoustic model. You may have to download the full version of the acoustic model that has the mixture_weights file present." + echo "ERROR: Missing mixture_weights, cannot continue training, stopping..." + exit 1 +fi + +#Check to see if we have all the necessary config/base files. +if [ ! -f $LANGUAGE_MODEL ]; then + echo "ERROR: The language model file ($LANGUAGE_MODEL) is missing!" + exit 1 +fi + +if [ ! 
-f $DICTIONARY_FILE ]; then + echo "ERROR: The dictionary ($DICTIONARY_FILE) is missing!" + exit 1 +fi + +if [ ! -d $INPUT_MODEL ]; then + echo "ERROR: Could not find the specified input/base acoustic model you specified: $INPUT_MODEL" + exit 1 +fi + +# Check to see if the user entered a correct acoustic model type +case "$inputType" in + [pP][tT][mM]|[pP]) + echo "Training a PTM acoustic model." + modelType="PTM" + ;; + cont|[cC]) + echo "Training a continuous acoustic model." + modelType="CONT" + ;; + semi|[sS]) + echo "Training a semi-continuous acoustic model." + modelType="SEMI" + ;; + *) + echo "Invalid model type supplied!!! Please include the --type argument or see --help for more details!" + exit 1 + ;; +esac + +# Test to make sure these aren't the same +if [ $OUTPUT_MODEL/feat.params -ef $INPUT_MODEL/feat.params ]; then + echo "ERROR: Input and Output model paths are the same! This is not allowed!" + exit 1 +fi + +cp -r $INPUT_MODEL $OUTPUT_MODEL + +clear + +#Ask to do the readings if the user didn't specify +if [ $PROMPT_FOR_READINGS = "yes" ]; then + DO_READINGS=$(askForReadings) +fi + +#Do the readings if the user wants to +if [ $DO_READINGS = "yes" ]; then + #Clean up out directory before we begin + rm -rf recordings + rm transcription.txt + mkdir recordings + doReadings +fi + +if [ $DO_TEST_INITIAL = "yes" ]; then + testInitialModel +fi + +createAcousticFeatures + +#Convert the mdef file to txt filetype if it does not exist +if [ ! -f $OUTPUT_MODEL/mdef.txt ]; then + convertmdef +fi + +#Do observation counts, according to what type of acoustic model is being used +case "$modelType" in + PTM) + doPtmObservationCounts + ;; + CONT) + doContObservationCounts + ;; + SEMI) + doSemiObservationCounts + ;; + *) + echo "Invalid model type supplied. This error should never ever happen! Impossible!" + exit 1 + ;; +esac + +# Copy fillerdict to noisedict if noisedict is missing +if [ ! 
-f $OUTPUT_MODEL/noisedict ]; then + if [ -f fillerdict ]; then + echo "Missing the 'noisedict' file, copying the 'fillerdict' file to use as noisedict" + cp fillerdict $OUTPUT_MODEL/noisedict + else + echo "WARNING: Missing the 'noisedict' file as well as the 'fillerdict' file to replace the noisedict file. Press enter to continue." + read + fi +fi + +#MLLR +if [ $DO_MLLR = "yes" ]; then + echo "IMPORTANT: The MLLR Adaptation is supported in pocketsphinx but not sphinx4 (Java). It basically creates another config file to adjust all features. If using sphinx4, you need to use MAP adaptation." + if [ $modelType = "SEMI" ]; then + echo "WARNING: The MLLR Adaptation is not very effective for semi-continuous models because they rely on mixture weights. Press enter to continue." + read + fi + domllrupdate + cp mllr_matrix $OUTPUT_MODEL/mllr_matrix +fi + +#MAP +if [ $DO_MAP = "yes" ]; then + if [ $modelType = "CONT" ]; then + echo "WARNING: The MAP Adaptation requires lots of adaptation data to work effectively on continuous models. Press enter to continue." + read + doContMapUpdate + fi + if [ $modelType = "SEMI" ]; then + doSemiMapUpdate + fi + if [ $modelType = "PTM" ]; then + doPtmMapUpdate + fi +fi +read -n 1 -s -p "Please read the above output thoroughly and then press any key to continue..." + +makesendump + +testFinalModel +echo " " +echo " " +compareTests + +echo "DONE TRAINING." diff --git a/scripts/AutoTrainer/trainer.sh b/scripts/AutoTrainer/trainer.sh new file mode 100755 index 00000000..85116576 --- /dev/null +++ b/scripts/AutoTrainer/trainer.sh @@ -0,0 +1,637 @@ +#!/bin/bash +################################################################################################ +# Sphinx Acoustic Model Trainer script +# This script is to assist in training an acoustic model for +# pocketsphinx and sphinx4. Continuous or batch models may be made. +# Different training methods may be used. +# Use trainer.sh --help for more information. 
+# +# This script is not associated with, or created by the creators of Sphinx CMU. +# This author of this script/program/file is Tyler Sengia. +# +# Any damage, modifications, errors, side effects, etc that this script/program/file causes +# is at the liability of the user and not the author. +# By downloading/installing/running this script you agree to these terms. +# +############################################################################################### +# +# Based on the instructions from: https://cmusphinx.github.io/wiki/tutorialadapt/ +# And instructions from: https://cmusphinx.github.io/wiki/tutorialtuning/ +# + +lineCount=21 # Number of lines to be read for the readings + +#Variables and their default values +PROMPT_FOR_READINGS="yes" +DO_READINGS="yes" +DO_TESTS="yes" +DO_TEST_INITIAL="yes" +transcriptionFile="arctic20.transcription" +fileidsFile="arctic20.fileids" +CREATE_SENDUMP="no" +DO_MAP="yes" +DO_MLLR="yes" +SAMPLE_RATE=16000 +LANGUAGE_MODEL="en-us.lm.bin" +DICTIONARY_FILE="cmudict-en-us.dict" +POCKET_SPHINX="no" + +confirm() { + # call with a prompt string or use a default + echo " " + read -r -p "${1:-Would you like to keep this recording? (No will start the recording over again.) [y/N]}" response + case "$response" in + [yY][eE][sS]|[yY]) + return 0 + ;; + *) + return 1 + ;; + esac +} + +doReadings() { +echo "Sphinx 4 Acoustic Library Auto Trainer" +echo "--------------------------------------" +echo "INSTRUCTIONS" +echo "A series of text will be displayed. Please recite the sentences to the best of your ability." +echo "When you are finished reciting the sentence, press any key." +echo "Continue reading the sentences until you have gone through all of them." +echo "Once all the sentences have been read, please wait a few moments for the trainer to run." 
+echo "" + +#Take the transcription file and strip all of the tags away and put it in a regular txt file to be read line by line +sed -r "s/<s>/ /g; s/<\/s>/ /g; s/\(.+\)/ /g" $transcriptionFile > transcription.txt + + +#Use the word count program (wc) to count the number of lines to be read +lineCount=`wc -l < transcription.txt` +echo "There are $lineCount sentences to be read." + +for ((i=1; i < lineCount+1; i++)) +do +readSentence +clear +cd recordings +mv ../output.wav `sed -n "$i{p;q}" ../$fileidsFile`.wav +cd ../ +done +} + +askForReadings() { + # call with a prompt string or use a default + read -r -p "${1:-Would you like to use the current audio files? (No will start the process of making new recordings.) [y/N]} " response + case "$response" in + [yY][eE][sS]|[yY]) + echo "no" + ;; + *) + echo "yes" + ;; + esac +} + +readSentence() { + echo "Sentence $i" + echo "Press ENTER when ready...." + echo " " + read + sed -n "$i{p;q}" transcription.txt + sleep 0.2 + + { arecord -c 1 -V mono -r $SAMPLE_RATE -f S16_LE output.wav & } 2>/dev/null + + childId=$! + echo " " + read -n 1 -s -r -p "[Press any key to finish recording]" + echo " " + { kill -9 $childId && wait; } 2>/dev/null + sleep 0.5 + if confirm ; then + return 1 + else + clear + readSentence + fi +} + +doContObservationCounts() { +echo "Accumulating observation counts..." +if [ ! -f $OUTPUT_MODEL/feature_transform ]; then +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .cont. \ + -feat 1s_c_d_dd \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . +else +echo "Found feature-transform file! Using -lda option" +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .cont. \ + -feat 1s_c_d_dd \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir .
\ + -lda $OUTPUT_MODEL/feature_transform +fi + +echo " " +echo "Done accumulating observation counts." +} + +doPtmObservationCounts() { +echo "Accumulating observation counts..." +if [ ! -f $OUTPUT_MODEL/feature_transform ]; then +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .ptm. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . +else +echo "Found feature-transform file! Using -lda option" +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .ptm. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . \ + -lda $OUTPUT_MODEL/feature_transform +fi + +echo " " +echo "Done accumulating observation counts." +} + +doSemiObservationCounts() { +echo "Accumulating observation counts..." +if [ ! -f $OUTPUT_MODEL/feature_transform ]; then +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .semi. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . +else +echo "Found feature-transform file! Using -lda option" +./bw -hmmdir $OUTPUT_MODEL \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .semi. \ + -feat 1s_c_d_dd \ + -svspec 0-12/13-25/26-38 \ + -cmn current \ + -agc none \ + -dictfn $DICTIONARY_FILE \ + -ctlfn $fileidsFile \ + -lsnfn $transcriptionFile \ + -accumdir . \ + -lda $OUTPUT_MODEL/feature_transform +fi + +echo " " +echo "Done accumulating observation counts." +} + +makesendump() { +echo "Creating sendump file..."
+./mk_s2sendump \ + -pocketsphinx $POCKET_SPHINX \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -sendumpfn $OUTPUT_MODEL/sendump +echo " " +echo "Done creating sendump file." +} + +doContMapUpdate() { +echo "Updating acoustic model files with MAP..." +./map_adapt \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .cont. \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -tmatfn $OUTPUT_MODEL/transition_matrices \ + -accumdir . \ + -mapmeanfn $OUTPUT_MODEL/means \ + -mapvarfn $OUTPUT_MODEL/variances \ + -mapmixwfn $OUTPUT_MODEL/mixture_weights \ + -maptmatfn $OUTPUT_MODEL/transition_matrices +echo " " +echo "Done updating with MAP." +} + +doPtmMapUpdate() { +echo "Updating acoustic model files with MAP..." +./map_adapt \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .ptm. \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -tmatfn $OUTPUT_MODEL/transition_matrices \ + -accumdir . \ + -mapmeanfn $OUTPUT_MODEL/means \ + -mapvarfn $OUTPUT_MODEL/variances \ + -mapmixwfn $OUTPUT_MODEL/mixture_weights \ + -maptmatfn $OUTPUT_MODEL/transition_matrices +echo " " +echo "Done updating with MAP." +} + +doSemiMapUpdate() { +echo "Updating acoustic model files with MAP..." +./map_adapt \ + -moddeffn $OUTPUT_MODEL/mdef.txt \ + -ts2cbfn .semi. \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -mixwfn $OUTPUT_MODEL/mixture_weights \ + -tmatfn $OUTPUT_MODEL/transition_matrices \ + -accumdir . \ + -mapmeanfn $OUTPUT_MODEL/means \ + -mapvarfn $OUTPUT_MODEL/variances \ + -mapmixwfn $OUTPUT_MODEL/mixture_weights \ + -maptmatfn $OUTPUT_MODEL/transition_matrices +echo " " +echo "Done updating with MAP." +} + +domllrupdate() { +echo "Updating acoustic model with MLLR..." +./mllr_solve \ + -meanfn $OUTPUT_MODEL/means \ + -varfn $OUTPUT_MODEL/variances \ + -outmllrfn mllr_matrix -accumdir . 
+echo "" +echo "Done updating with MLLR." +} + +convertmdef() { +echo "Converting mdef into text format..." +pocketsphinx_mdef_convert -text $OUTPUT_MODEL/mdef $OUTPUT_MODEL/mdef.txt +echo " " +echo "Done converting mdef." +} + +createAcousticFeatures() { +echo " " +echo "Creating acoustic feature files..." +sphinx_fe -argfile $OUTPUT_MODEL/feat.params -samprate $SAMPLE_RATE -c $fileidsFile -di ./recordings -do . -ei wav -eo mfc -mswav yes +echo "Done creating acoustic feature files." +echo " " +} + +testInitialModel() { +echo "Testing acoustic model before adaptations..." +pocketsphinx_batch \ + -adcin yes \ + -cepdir ./recordings \ + -cepext .wav \ + -ctl $fileidsFile \ + -lm $LANGUAGE_MODEL \ + -dict $DICTIONARY_FILE \ + -hmm $OUTPUT_MODEL \ + -hyp initial_test.hyp +echo "Finished testing acoustic model before adaptations." +} + +testFinalModel() { +echo "Testing acoustic model after adaptations..." +if [ $DO_MLLR = "no" ]; then +pocketsphinx_batch \ + -adcin yes \ + -cepdir ./recordings \ + -cepext .wav \ + -ctl $fileidsFile \ + -lm $LANGUAGE_MODEL \ + -dict $DICTIONARY_FILE \ + -hmm $OUTPUT_MODEL \ + -hyp final_test.hyp +else +pocketsphinx_batch \ + -adcin yes \ + -cepdir ./recordings \ + -cepext .wav \ + -ctl $fileidsFile \ + -lm $LANGUAGE_MODEL \ + -dict $DICTIONARY_FILE \ + -hmm $OUTPUT_MODEL \ + -hyp final_test.hyp \ + -mllr $OUTPUT_MODEL/mllr_matrix +fi +echo "Finished testing acoustic model after adaptations." 
+} + +compareTests() { +echo "================================================================" +echo "BEFORE Adaptation:" +perl -w word_align.pl $transcriptionFile initial_test.hyp +echo "================================================================" +echo "AFTER Adaptation:" +perl -w word_align.pl $transcriptionFile final_test.hyp +echo "================================================================" +echo "" +} + +displayHelp() { +cat << EOF +Sphinx Auto Trainer Script + +Author: tylersengia@gmail.com + +Usage: trainer [OPTIONS] --type TYPE input_model output_model + input_model : The directory/filename of acoustic model to create the trained acoustic model off of. + output_model : The desired name of the trained acoustic model that will be created using this script. + +TYPE may be any of: + p PTM + c continuous + s semi-continuous + + +OPTIONS may be any of: + -h --help Displays this help text and exits. + -r --readings {yes|no} Enable or disable sentence reading. Disabling sentence reading means that the audio files in the working directory (as referenced by the fileids) will be used to train. + -s --sample_rate {int} Specify the sample rate for the audio files. Expand the value (i.e. 16 kHz should be 16000). Default is 16000. + --map {yes|no} Enable or disable MAP weight updating. Supported in pocketsphinx and sphinx4. Default is yes. + --mllr {yes|no} Enable or disable MLLR weight updating. Currently only supported in pocketsphinx. Default is yes. + --transcript {file} Specify the transcript file for readings. (default: arctic20.transcription) + --type TYPE Specify what TYPE of acoustic model is being trained. See above for valid identifiers. + -i --test_initial {yes|no} Specify whether or not to test the initial acoustic model before adaptation. Disabling this can save time. Default is yes. + -f --fileids {file} Specify the fileids file for readings.
(default: arctic20.fileids) + -p --pocketsphinx {yes|no} Specify whether or not you are training the model for pocketsphinx. Specifying yes will add optimizations. Default is yes. Set to "no" if using for Sphinx4 (Java). + -d --dict Specify the path to the dictionary to use. Default is "cmudict-en-us.dict" + +Issues, questions or suggestions: https://github.com/ExpandingDev/SphinxTrainingHelper +EOF +exit 2 +} + +#Parsing arguments +while [[ $# -gt 0 ]] +do +key="$1" + +case $key in + -r|--readings) + PROMPT_FOR_READINGS="no" + DO_READINGS="$2" + shift # past argument + ;; + --map) + DO_MAP="$2" + shift + ;; + --mllr) + DO_MLLR="$2" + shift + ;; + -s|--sample_rate|--sample-rate) + SAMPLE_RATE="$2" + shift + ;; + -h|--help) + displayHelp + ;; + -i|--test_initial) + DO_TEST_INITIAL="$2" + shift + ;; + -d|--dict) + DICTIONARY_FILE="$2" + shift + ;; + -p|--pocketsphinx) + POCKET_SPHINX="$2" + shift + ;; + --transcript) + transcriptionFile="$2" + shift + ;; + -f|--fileids) + fileidsFile="$2" + shift + ;; + --type) + inputType="$2" + shift + ;; + *) + INPUT_MODEL="$1" + OUTPUT_MODEL="$2" + break + ;; +esac +shift # past argument or value +done + +#Sanitize the paths. Remove the ending / if there is one +OUTPUT_MODEL=${OUTPUT_MODEL%/} +INPUT_MODEL=${INPUT_MODEL%/} + +#Check to see if we have all of the required programs/files and error out if we don't +if [ ! -f bw ]; then + echo "ERROR: You are missing the 'bw' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f map_adapt ]; then + echo "ERROR: You are missing the 'map_adapt' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f mllr_solve ]; then + echo "ERROR: You are missing the 'mllr_solve' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)."
+ exit 1 +fi + +if [ ! -f mllr_transform ]; then + echo "ERROR: You are missing the 'mllr_transform' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f mk_s2sendump ]; then + echo "ERROR: You are missing the 'mk_s2sendump' program in this directory. Please copy it into this directory from /usr/local/libexec/sphinxtrain (or wherever it is installed)." + exit 1 +fi + +if [ ! -f word_align.pl ]; then + echo "ERROR: You are missing the 'word_align.pl' perl script in this directory. Please copy it into this directory from sphinxtrain/scripts/decode (Extracted from the tar file. This script isn't installed, it comes straight from sphinxtrain's source files)." + exit 1 +fi + +if [ ! -f $INPUT_MODEL/mixture_weights ]; then + echo "ERROR: You are missing the 'mixture_weights' file in your input acoustic model. You may have to download the full version of the acoustic model that has the mixture_weights file present." + echo "ERROR: Missing mixture_weights, cannot continue training, stopping..." + exit 1 +fi + +#Check to see if we have all the necessary config/base files. +if [ ! -f $transcriptionFile ]; then + echo "ERROR: The transcription file ($transcriptionFile) is missing!" + exit 1 +fi + +if [ ! -f $fileidsFile ]; then + echo "ERROR: The File IDs file ($fileidsFile) is missing!" + exit 1 +fi + +if [ ! -f $LANGUAGE_MODEL ]; then + echo "ERROR: The language model file ($LANGUAGE_MODEL) is missing!" + exit 1 +fi + +if [ ! -f $DICTIONARY_FILE ]; then + echo "ERROR: The dictionary ($DICTIONARY_FILE) is missing!" + exit 1 +fi + +if [ ! -d $INPUT_MODEL ]; then + echo "ERROR: Could not find the specified input/base acoustic model you specified: $INPUT_MODEL" + exit 1 +fi + +# Check to see if the user entered a correct acoustic model type +case "$inputType" in + [pP][tT][mM]|[pP]) + echo "Training a PTM acoustic model." 
+ modelType="PTM" + ;; + cont|[cC]) + echo "Training a continuous acoustic model." + modelType="CONT" + ;; + semi|[sS]) + echo "Training a semi-continuous acoustic model." + modelType="SEMI" + ;; + *) + echo "Invalid model type supplied!!! Please include the --type argument or see --help for more details!" + exit 1 + ;; +esac + +# Test to make sure these aren't the same +if [ $OUTPUT_MODEL/feat.params -ef $INPUT_MODEL/feat.params ]; then + echo "ERROR: Input and Output model paths are the same! This is not allowed!" + exit 1 +fi + +cp -r $INPUT_MODEL $OUTPUT_MODEL + +clear + +#Ask to do the readings if the user didn't specify +if [ $PROMPT_FOR_READINGS = "yes" ]; then + DO_READINGS=$(askForReadings) +fi + +#Do the readings if the user wants to +if [ $DO_READINGS = "yes" ]; then + #Clean up our directory before we begin + rm -rf recordings + rm -f transcription.txt + mkdir recordings + doReadings +fi + +if [ $DO_TEST_INITIAL = "yes" ]; then + testInitialModel +fi + +createAcousticFeatures + +#Convert the mdef file to txt filetype if it does not exist +if [ ! -f $OUTPUT_MODEL/mdef.txt ]; then + convertmdef +fi + +#Do observation counts, according to what type of acoustic model is being used +case "$modelType" in + PTM) + doPtmObservationCounts + ;; + CONT) + doContObservationCounts + ;; + SEMI) + doSemiObservationCounts + ;; + *) + echo "Invalid model type supplied. This error should never ever happen! Impossible!" + exit 1 + ;; +esac + +# Copy fillerdict to noisedict if noisedict is missing +if [ ! -f $OUTPUT_MODEL/noisedict ]; then + if [ -f fillerdict ]; then + echo "Missing the 'noisedict' file, copying the 'fillerdict' file to use as noisedict" + cp fillerdict $OUTPUT_MODEL/noisedict + else + echo "WARNING: Missing the 'noisedict' file as well as the 'fillerdict' file to replace the noisedict file. Press enter to continue."
+ read + fi +fi + +#MLLR +if [ $DO_MLLR = "yes" ]; then + echo "IMPORTANT: The MLLR Adaptation is supported in pocketsphinx but not sphinx4 (Java). It basically creates another config file to adjust all features. If using sphinx4, you need to use MAP adaptation." + if [ $modelType = "SEMI" ]; then + echo "WARNING: The MLLR Adaptation is not very effective for semi-continuous models because they rely on mixture weights. Press enter to continue." + read + fi + domllrupdate + cp mllr_matrix $OUTPUT_MODEL/mllr_matrix +fi + +#MAP +if [ $DO_MAP = "yes" ]; then + if [ $modelType = "CONT" ]; then + echo "WARNING: The MAP Adaptation requires lots of adaptation data to work effectively on continuous models. Press enter to continue." + read + doContMapUpdate + fi + if [ $modelType = "SEMI" ]; then + doSemiMapUpdate + fi + if [ $modelType = "PTM" ]; then + doPtmMapUpdate + fi +fi +read -n 1 -s -p "Please read the above output thoroughly and then press any key to continue..." + +makesendump + +testFinalModel +echo " " +echo " " +compareTests + +echo "DONE TRAINING."