Transfer Learning on UrbanSound8K

In this project, we take the pre-trained VGGish model released with Google AudioSet (available at https://github.com/tensorflow/models/tree/master/research/audioset) and use its features to classify the UrbanSound8K dataset.


git clone https://www.github.com/abhyantrika/Transfer_Learning.git
cd Transfer_Learning

Run pip install -r requirements.txt to install the Python dependencies.

Next, download the pre-trained weights and PCA parameters that the VGGish model needs, then run the smoke test:
curl -O https://storage.googleapis.com/audioset/vggish_model.ckpt
curl -O https://storage.googleapis.com/audioset/vggish_pca_params.npz
python vggish_smoke_test.py
The smoke test should finish without raising any errors.
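
If the smoke test does fail, one quick thing to rule out is a missing or empty download. The sketch below is not part of this repo. It only checks that the two files named in the curl commands exist and are non-empty; the function name is hypothetical.

```python
from pathlib import Path

# File names taken from the curl commands in this README.
REQUIRED_FILES = ("vggish_model.ckpt", "vggish_pca_params.npz")


def missing_vggish_files(directory="."):
    """Return the required files that are absent or empty in `directory`."""
    base = Path(directory)
    missing = []
    for name in REQUIRED_FILES:
        path = base / name
        # An interrupted download can leave a zero-byte file behind.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_vggish_files()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("Both VGGish files are present.")
```

A non-empty file can still be truncated, so this check only catches the most obvious failure.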

Visit https://serv.cusp.nyu.edu/projects/urbansounddataset/
and fill in the form to download the UrbanSound8K dataset. Save the tar file in the same directory as the repository files and extract it there.
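
Once the archive is extracted, each clip's class can be read from its file name. According to the dataset's own documentation, clips are named [fsID]-[classID]-[occurrenceID]-[sliceID].wav, and class IDs 0 to 9 map to the ten sound classes. Here is a minimal sketch of that lookup. It is not taken from this repo's scripts, and the function name and example path are illustrative.

```python
from pathlib import Path

# Class names indexed by classID, as listed in the dataset documentation.
CLASS_NAMES = (
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
)


def label_from_filename(wav_path):
    """Return (class_id, class_name) parsed from a clip's file name."""
    parts = Path(wav_path).stem.split("-")
    if len(parts) != 4:
        raise ValueError(f"unexpected file name: {wav_path}")
    class_id = int(parts[1])  # second field is the numeric class ID
    if not 0 <= class_id < len(CLASS_NAMES):
        raise ValueError(f"class ID out of range: {class_id}")
    return class_id, CLASS_NAMES[class_id]


if __name__ == "__main__":
    print(label_from_filename("UrbanSound8K/audio/fold5/100032-3-0-0.wav"))
```

The metadata CSV shipped with the dataset carries the same labels, so reading that file is an equally valid route.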

After extracting the dataset, run
bash make_dirs.sh
to create the directories that store the processed data.

Run python pre_process.py
This feeds the wav files of the UrbanSound8K dataset forward through the pre-trained VGGish network and saves the generated 128x4 embeddings in npz format in the features_npz directory.
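
The 128x4 shape follows from how VGGish slices audio. The sketch below works through the arithmetic under two assumptions: the default parameters in the AudioSet repo's vggish_params.py (16 kHz audio, 25 ms STFT window, 10 ms hop, non-overlapping 0.96 s examples, 128-dimensional embeddings) and a full-length 4-second clip. It mirrors the framing logic rather than calling the VGGish code. VGGish returns one 128-value row per example, so the result is written as (rows, 128).

```python
# Values assumed from the AudioSet repo's vggish_params.py defaults.
SAMPLE_RATE = 16000
STFT_WINDOW_SECONDS = 0.025
STFT_HOP_SECONDS = 0.010
EXAMPLE_WINDOW_SECONDS = 0.96
EXAMPLE_HOP_SECONDS = 0.96
EMBEDDING_SIZE = 128


def count_frames(num_items, window, hop):
    """Number of complete windows of length `window` taken every `hop` items."""
    if num_items < window:
        return 0
    return 1 + (num_items - window) // hop


def embedding_shape(clip_seconds):
    """Return (num_examples, EMBEDDING_SIZE) for a clip of the given length."""
    num_samples = int(round(clip_seconds * SAMPLE_RATE))
    stft_window = int(round(STFT_WINDOW_SECONDS * SAMPLE_RATE))  # 400 samples
    stft_hop = int(round(STFT_HOP_SECONDS * SAMPLE_RATE))        # 160 samples
    spectrogram_frames = count_frames(num_samples, stft_window, stft_hop)

    frames_per_second = 1.0 / STFT_HOP_SECONDS                   # 100 frames/s
    example_window = int(round(EXAMPLE_WINDOW_SECONDS * frames_per_second))
    example_hop = int(round(EXAMPLE_HOP_SECONDS * frames_per_second))
    num_examples = count_frames(spectrogram_frames, example_window, example_hop)
    return (num_examples, EMBEDDING_SIZE)


if __name__ == "__main__":
    print(embedding_shape(4.0))
```

Clips shorter than 4 seconds yield fewer rows under these assumptions. How pre_process.py handles them is not stated here.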

Now run python train_dnn.py to train a simple 3-layer network that classifies the npz features.
Run python test_dnn.py to test the model. This repo includes the trained weights and the model description in the model.h5 and model.json files, respectively.
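
To inspect the shipped model without retraining, the two files can be combined by hand. The sketch below assumes that model.json was written with Keras's to_json(), that model.h5 holds the matching weights, and that the installed TensorFlow/Keras version can still read both. The function name is hypothetical and test_dnn.py may load the model differently.

```python
import os


def load_trained_model(json_path="model.json", weights_path="model.h5"):
    """Rebuild the network from its JSON description and load its weights."""
    for path in (json_path, weights_path):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)

    # Imported lazily so the file check above works even without TensorFlow.
    from tensorflow.keras.models import model_from_json

    with open(json_path) as handle:
        model = model_from_json(handle.read())  # architecture only
    model.load_weights(weights_path)            # trained parameters
    return model
```

Calling load_trained_model().summary() from the repository directory would then print the layer structure, provided the assumptions above hold.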

Contributors: @codeass @vishwajit123