
Deep Speech [EN]

Convert speech audio to text. This work is an example demonstration based on mozilla/DeepSpeech.


Limitations

  • This demo currently works only with mono 16 kHz WAV files.
  • The provided model supports English speech only. See the original mozilla/DeepSpeech wiki to train models for other languages.
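
Since the demo accepts only mono 16 kHz WAV input, it can help to check a file before passing it on. The following is a minimal sketch using Python's standard `wave` module; the helper name `check_wav` is hypothetical and not part of this repository.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems; an empty list means the file looks compatible.

    Hypothetical helper, not part of speech_to_text.py.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        if channels != expected_channels:
            problems.append(
                f"expected {expected_channels} channel(s), found {channels}"
            )
        if rate != expected_rate:
            problems.append(f"expected {expected_rate} Hz, found {rate} Hz")
    return problems
```

Calling `check_wav("/path/to/wav/audio")` returns an empty list when the file is mono at 16 kHz, and otherwise a description of each mismatch.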

Requirements

Run pip install -r requirements.txt

This installs deepspeech, numpy and scipy, which are its only dependencies.

How to run

Use the SpeechToText class from the speech_to_text.py file.

The class can be used as shown in the following example:

from speech_to_text import SpeechToText
import scipy.io.wavfile as wav

# This variable should point to models/
MODEL_PATH = "/path/to/downloaded/models"

# Remember: only mono 16 kHz WAV files are supported
AUDIO_PATH = "/path/to/wav/audio"

stt = SpeechToText(MODEL_PATH)
fs, audio = wav.read(AUDIO_PATH)

# Speech to text
text = stt.run(audio, fs)

print(text)
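
If a recording is stereo, `scipy.io.wavfile.read` returns an array of shape (samples, channels). One possible way to obtain a mono signal is to average the channels, as sketched below. The helper name `to_mono` is hypothetical and not part of this repository, and the sketch does not resample: the audio must already be at 16 kHz.

```python
import numpy as np


def to_mono(audio):
    """Average the channels of a (samples, channels) array into one channel.

    Hypothetical helper, not part of speech_to_text.py. One-dimensional
    input is assumed to be mono already and is returned unchanged.
    """
    audio = np.asarray(audio)
    if audio.ndim == 1:
        return audio
    # Average in floating point to avoid integer overflow, then cast back
    # to the original sample type (for example int16).
    return audio.astype(np.float64).mean(axis=1).astype(audio.dtype)
```

The result could then be passed to the class in place of the raw array, for example `stt.run(to_mono(audio), fs)`.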

Model info

The models come from the original mozilla/DeepSpeech repository.