README

Project Summary

In this tutorial, we'll learn how to use the Google Cloud Speech API to transcribe an audio file. The trickiest part of this API is converting your audio data into the correct format, which we'll do using FFmpeg.

Data

YouTube Audio Library has a number of public domain audio files. The audio file transcribed in this tutorial (John_F_Kennedy_Inaugural_Speech_January_20_1961.mp3) was downloaded from this library.

Data Size Limitations

Audio longer than 1 minute must reside on Google Cloud Storage (GCS), and a single request can process at most 80 minutes of audio (usage limit).

Since most audio files are longer than 1 minute, we'll skip transcribing local files on your laptop and go straight to transcribing files stored on GCS.
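
The two limits above reduce to simple arithmetic. Here is a minimal sketch, assuming the 1-minute and 80-minute figures quoted in this section still apply; the function and constant names are hypothetical and are not part of this repo:

```python
import math

# Limits as quoted in this tutorial (assumption: verify against current quotas).
MAX_LOCAL_SECONDS = 60          # longer audio must be read from GCS
MAX_REQUEST_SECONDS = 80 * 60   # longest audio accepted in one request


def plan_transcription(duration_seconds: float) -> dict:
    """Return whether the audio must live on GCS and how many chunks it needs."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return {
        "needs_gcs": duration_seconds > MAX_LOCAL_SECONDS,
        "chunks": math.ceil(duration_seconds / MAX_REQUEST_SECONDS),
    }


if __name__ == "__main__":
    # A hypothetical 15-minute recording: must be on GCS, fits in one chunk.
    print(plan_transcription(15 * 60))
```

Anything that needs more than one chunk has to be split before it is sent, which is what the FFmpeg step later in this tutorial takes care of.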

Requirements

I. Google Cloud Platform (GCP) credentials

If you haven't already, you may sign up for the free GCP trial credit

Set up your project on GCP

II. GCS bucket

Create a GCS bucket
Upload the JSON file for the service account key that you just created in the step above
Clone this repo on the cloud shell:

$ git clone https://github.com/dsnair/GCP.git
$ cd GCP/speech
$ ls

III. VM

On the cloud shell, run the vm-setup.sh script to create an Ubuntu-based VM instance named instance-1:

$ chmod +x vm-setup.sh
$ ./vm-setup.sh

Alternatively, create the VM manually on the Console:

Create a VM instance
- Select Ubuntu under 'Boot disk' so that the installation commands in this tutorial work
- Select 'Allow full access to all Cloud APIs' under 'Access scopes'
Connect to your VM instance

IV. Install packages

Clone this repo on your VM (a separate machine from the cloud shell)
On your VM, run the install.sh script to install FFmpeg, the Speech API client for Python, and gcsfuse:

$ chmod +x install.sh
$ ./install.sh

V. Mount local directory

gcsfuse mounts a GCS bucket as a directory on your VM. The directory and the bucket then show the same contents and stay in sync when either one changes.

Now, let's mount your GCS bucket at a local directory named local_bucket on your VM.

$ mkdir local_bucket
$ gcsfuse your-bucket-name local_bucket
$ cd local_bucket/
$ ls

Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to the service account key:

$ export GOOGLE_APPLICATION_CREDENTIALS=path_to/service_account_file.json
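
A mistyped path here usually shows up later as an authentication error, so it can help to check the variable before running anything else. The following is a minimal sketch, not part of this repo; `check_credentials` is a hypothetical helper, and it only assumes that a service account key file is JSON with a "type" field set to "service_account":

```python
import json
import os


def check_credentials(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> str:
    """Return the key file path, or raise RuntimeError with a readable message."""
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"{env_var} points to a missing file: {path}")
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    # Service account key files carry a "type" field set to "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise RuntimeError(f"{path} does not look like a service account key")
    return path


if __name__ == "__main__":
    print("Using credentials from", check_credentials())
```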

Transcribe Audio file

$ python transcribe_audio.py gs://your-bucket-name/John_F_Kennedy_Inaugural_Speech_January_20_1961.mp3

Output: transcribe_audio.py formats the audio file to create a .wav file and prints the audio transcription to your shell.
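
For readers who want to see roughly what such a call looks like, here is a minimal sketch of asynchronous recognition on a file stored on GCS. It is not necessarily what transcribe_audio.py does. It assumes the google-cloud-speech client library in a 2.x release, which may differ from the version install.sh sets up, and that the audio has already been converted to a mono LINEAR16 .wav file. The names `to_wav_uri` and `transcribe_gcs` are hypothetical:

```python
import posixpath
from typing import List


def to_wav_uri(gcs_uri: str, chunk: int = 0) -> str:
    """Map gs://bucket/name.mp3 to gs://bucket/name_<chunk>.wav."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    stem, _ext = posixpath.splitext(gcs_uri)
    return f"{stem}_{chunk}.wav"


def transcribe_gcs(wav_uri: str, language_code: str = "en-US") -> List[str]:
    """Run long-running recognition on a LINEAR16 WAV file stored on GCS."""
    # Imported lazily so the helper above works without the client installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(uri=wav_uri)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code=language_code,
    )
    # Audio longer than 1 minute goes through the asynchronous method.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=3600)  # timeout value is an assumption
    return [
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    ]


if __name__ == "__main__":
    uri = to_wav_uri("gs://your-bucket-name/your-audio-file.mp3")
    for line in transcribe_gcs(uri):
        print(line)
```

The credentials are picked up from GOOGLE_APPLICATION_CREDENTIALS, so no key path appears in the code.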

Format Audio file

Here are the details on how the audio file is formatted using FFmpeg:

$ ffmpeg -i John_F_Kennedy_Inaugural_Speech_January_20_1961.mp3 -acodec pcm_s16le -ac 1 -f segment -segment_time 4800 John_F_Kennedy_Inaugural_Speech_January_20_1961_%d.wav

Output: The command line above creates John_F_Kennedy_Inaugural_Speech_January_20_1961_0.wav in local_bucket, which is also visible in your-bucket-name.

-i takes an input audio file
-acodec pcm_s16le sets linear16 audio encoding
-ac 1 sets mono channel
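-f segment writes the output as a series of segment files instead of a single file
-segment_time 4800 splits the input audio file every 4800 seconds (80 minutes); the %d in the output name numbers the chunks filename_0.wav, filename_1.wav, etc.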
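
The same invocation can be driven from Python. This is a minimal sketch that assembles the flags listed above and hands them to FFmpeg through `subprocess`. It is not necessarily how format_audio.py is written; `build_ffmpeg_command` and `format_audio` are hypothetical names:

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(source: str, segment_seconds: int = 4800) -> List[str]:
    """Build the FFmpeg argument list used in the shell command above."""
    stem = Path(source).with_suffix("")  # drop the .mp3 extension
    return [
        "ffmpeg",
        "-i", source,                          # input audio file
        "-acodec", "pcm_s16le",                # linear16 encoding
        "-ac", "1",                            # mono channel
        "-f", "segment",                       # write a series of segment files
        "-segment_time", str(segment_seconds), # chunk length in seconds
        f"{stem}_%d.wav",                      # name_0.wav, name_1.wav, ...
    ]


def format_audio(source: str) -> None:
    """Run FFmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(source), check=True)


if __name__ == "__main__":
    format_audio("John_F_Kennedy_Inaugural_Speech_January_20_1961.mp3")
```

Passing the arguments as a list rather than one shell string avoids quoting problems when a file name contains spaces.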

Clean-up

Unmount your local directory

$ cd
$ fusermount -u local_bucket

Delete your bucket to avoid incurring charges to your account
Delete your VM instance
