Convert speech audio to text

This project is an example demonstration based on mozilla/DeepSpeech.
- This demo currently works only with mono 16 kHz WAV files.
- The provided model is only for English speech. See the original project's wiki to train models for other languages.
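Because the demo accepts only mono 16 kHz WAV input, it can help to check a file before transcribing it. The following is a minimal sketch using Python's standard wave module. The helper name check_wav_format and its parameters are hypothetical and are not part of speech_to_text.py. It also assumes a plain PCM WAV file, since the wave module may reject other encodings.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return a list of format problems; an empty list means the file looks usable.

    Hypothetical helper, not part of this repository.
    """
    problems = []
    # Read only the header fields; no audio data is loaded here.
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
    if channels != expected_channels:
        problems.append(f"expected {expected_channels} channel(s), found {channels}")
    if rate != expected_rate:
        problems.append(f"expected {expected_rate} Hz, found {rate} Hz")
    return problems
```

Calling check_wav_format("/path/to/wav/audio") and getting back an empty list suggests the file matches the format the demo expects. Any returned messages describe what would need to be converted first.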
Run pip install -r requirements.txt
This installs deepspeech, numpy and scipy, which are its only dependencies.
Use the SpeechToText class from the speech_to_text.py file.
The class can be used as shown in the following example:
import scipy.io.wavfile as wav
from speech_to_text import SpeechToText
# This variable should point to models/
MODEL_PATH = "/path/to/downloaded/models"
# Remember: only mono 16 kHz WAV files are supported
AUDIO_PATH = "/path/to/wav/audio"
stt = SpeechToText(MODEL_PATH)
fs, audio = wav.read(AUDIO_PATH)
# Speech to text
text = stt.run(audio, fs)
print(text)

The models were taken from their original repo, mozilla/DeepSpeech.