The VU sound corpus

Emiel van Miltenburg, Benjamin Timmermans, and Lora Aroyo (2015)

Vrije Universiteit Amsterdam

This repository contains all the data and code that was used to annotate sounds from the Freesound.org database. If you use this data, please cite our paper. Also consider a donation to www.freesound.org :)

A tool to browse the VU Sound Corpus is available here.

Data

The folder ./steps/4-results/ contains all of our results, including results.xml, the XML file that contains all the annotation data, and soundcollection.dtd, which specifies the structure of our resource. There are also four subfolders:

Frequencies: this folder contains CSV files with frequency counts for all (author, raw, clustered, search) tags.
Search_matches_per_sound: this folder contains a CSV file with the results from our search experiment.
typical_normalized and typical_raw: these folders contain lists of typical keywords for the original authors and for the crowd annotations, that is, words that each group is biased towards using in its annotations.

XML format

The diagram below shows the XML structure of our resource. We represent our data as a collection of sounds. Tags that may occur multiple times are marked with an asterisk.

Sounds have the following attributes: id, batch, name, type, samplerate, duration, channels, bitrate and bitdepth. The id and name attributes correspond to the ID and name in the Freesound.org database. The batch attribute corresponds to the task batch in the crowdsourcing process, and is included for full transparency about the data collection.

Sounds also have a number of elements. The file, uri, descriptions, webrating and author-tags elements correspond to the Freesound.org metadata (with file elements linking to high-quality MP3 and OGG files). The crowd-tags element contains the normalized tags as tag elements, which in turn contain the raw tags that they subsume. The ratings element provides information about the quality of the sound: webrating contains the user rating from Freesound.org, and clarity contains the automatically generated clarity rating (based on the clustered tags).
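To make the nesting of normalized and raw crowd tags concrete, here is a minimal sketch that walks one sound and maps each normalized tag to the raw tags it subsumes. The sample XML is invented for illustration: the root element name, the label attribute on crowd tag elements, and all values are assumptions, so check soundcollection.dtd for the actual structure. The sketch uses the standard library's ElementTree so that it runs without the corpus file; the same find, findall and get calls also work on elements loaded with lxml.

```python
import xml.etree.ElementTree as ET

# Invented sample data. The root element name and the label attribute on
# the crowd <tag> elements are assumptions; see soundcollection.dtd.
SAMPLE = """
<soundcollection>
  <sound id="1" batch="1" name="door.wav" duration="0.8" bitdepth="16">
    <crowd-tags>
      <tag label="bang">
        <raw label="bang"/>
        <raw label="banging"/>
      </tag>
      <tag label="door">
        <raw label="door"/>
      </tag>
    </crowd-tags>
  </sound>
</soundcollection>
"""


def crowd_tag_clusters(sound):
    """Map each normalized crowd tag of a sound to the raw tags it subsumes."""
    clusters = {}
    for tag in sound.findall('./crowd-tags/tag'):
        raw_labels = [raw.get('label') for raw in tag.findall('./raw')]
        clusters[tag.get('label')] = raw_labels
    return clusters


root = ET.fromstring(SAMPLE)
first_sound = root.find('./sound')
clusters = crowd_tag_clusters(first_sound)
print(first_sound.get('id'), clusters)
```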

How to load the sound data

Loading the data in Python is very simple: first import the etree module from lxml, and then parse the results.xml file.

# Import lxml:
from lxml import etree

# Load the data:
xml  = etree.parse('./steps/4-results/results.xml')
root = xml.getroot()

Selecting sounds with particular properties

We can use XPath expressions to find sounds with particular properties, e.g. with a certain duration or bitdepth.

short_sounds = root.xpath('./sound[starts-with(@duration,"0.")]')
bitdepth_24  = root.xpath('./sound[@bitdepth="24"]')
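The starts-with test above selects sounds shorter than one second by looking at the text of the duration attribute. For other thresholds, a numeric comparison in Python may be easier. The following is a minimal sketch, assuming that duration holds a decimal number of seconds; the helper name sounds_shorter_than and the sample data are made up for illustration. It works on any root element, whether loaded with lxml or with the standard library.

```python
import xml.etree.ElementTree as ET


def sounds_shorter_than(root, max_seconds):
    """Return the sound elements whose duration is below max_seconds.

    Assumes the duration attribute is a decimal number of seconds.
    Sounds with a missing or non-numeric duration are skipped.
    """
    selected = []
    for sound in root.findall('./sound'):
        try:
            duration = float(sound.get('duration'))
        except (TypeError, ValueError):
            continue
        if duration < max_seconds:
            selected.append(sound)
    return selected


# Invented sample data, only here to make the sketch runnable.
SAMPLE = """
<soundcollection>
  <sound id="101" duration="0.45"/>
  <sound id="102" duration="12.3"/>
  <sound id="103" duration="2.75"/>
  <sound id="104"/>
</soundcollection>
"""

root = ET.fromstring(SAMPLE)
short_ids = [s.get('id') for s in sounds_shorter_than(root, 1.0)]
under_three_ids = [s.get('id') for s in sounds_shorter_than(root, 3.0)]
print(short_ids, under_three_ids)
```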

By crowd tag

Here is how to find all sounds that the crowd annotated with the raw tag 'bang'.

bang_sounds = root.xpath('./sound[crowd-tags/tag/raw[@label="bang"]]')
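To search for several raw tags at once, the same path can be walked in Python. This is a sketch rather than part of the original scripts: the helper name sounds_with_any_raw_tag and the sample data are hypothetical, and only the crowd-tags/tag/raw path with its label attribute is taken from the expression above.

```python
import xml.etree.ElementTree as ET


def sounds_with_any_raw_tag(root, labels):
    """Return sounds that have at least one raw crowd tag in labels."""
    wanted = set(labels)
    matches = []
    for sound in root.findall('./sound'):
        raw_labels = {raw.get('label')
                      for raw in sound.findall('./crowd-tags/tag/raw')}
        if raw_labels & wanted:
            matches.append(sound)
    return matches


# Invented sample data, only here to make the sketch runnable.
SAMPLE = """
<soundcollection>
  <sound id="1">
    <crowd-tags><tag><raw label="bang"/><raw label="banging"/></tag></crowd-tags>
  </sound>
  <sound id="2">
    <crowd-tags><tag><raw label="slam"/></tag></crowd-tags>
  </sound>
  <sound id="3">
    <crowd-tags><tag><raw label="rain"/></tag></crowd-tags>
  </sound>
</soundcollection>
"""

root = ET.fromstring(SAMPLE)
impact_ids = [s.get('id') for s in sounds_with_any_raw_tag(root, ['bang', 'slam'])]
print(impact_ids)
```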

By their description

We can also use the metadata of the sounds to find particular recordings. Here is some code to get all the sounds that have a particular word in their description (e.g. 'synth'):

sounds_synth = root.xpath('./sound[description[contains(.,"synth")]]')
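Note that the XPath contains function is case-sensitive, so the expression above would miss a description that only mentions 'Synth'. One way around this is to compare lowercased text in Python. The sketch below assumes the same description element as the expression above; the helper name sounds_mentioning and the sample descriptions are invented.

```python
import xml.etree.ElementTree as ET


def sounds_mentioning(root, word):
    """Return sounds whose description contains word, ignoring case."""
    needle = word.lower()
    matches = []
    for sound in root.findall('./sound'):
        for description in sound.findall('./description'):
            # itertext() gathers all nested text, like "." does in XPath.
            text = ''.join(description.itertext())
            if needle in text.lower():
                matches.append(sound)
                break  # one matching description is enough
    return matches


# Invented sample data, only here to make the sketch runnable.
SAMPLE = """
<soundcollection>
  <sound id="1"><description>Analog Synth pad, recorded dry.</description></sound>
  <sound id="2"><description>A synthesizer arpeggio.</description></sound>
  <sound id="3"><description>Rain on a window.</description></sound>
</soundcollection>
"""

root = ET.fromstring(SAMPLE)
synth_ids = [s.get('id') for s in sounds_mentioning(root, 'synth')]
print(synth_ids)
```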

By original tag

Let's look for sounds that the original author tagged 'vintage':

vintage_sounds = root.xpath('./sound[author-tags/tag[@label="vintage"]]')
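The author tags and the raw crowd tags can also be compared for a single sound, for example to see which words both groups used. The following sketch relies only on the two paths used in the expressions above (author-tags/tag and crowd-tags/tag/raw, each with a label attribute); the helper name compare_tags and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def compare_tags(sound):
    """Split the tags of one sound into shared, author-only and crowd-only."""
    author = {t.get('label') for t in sound.findall('./author-tags/tag')}
    crowd = {r.get('label') for r in sound.findall('./crowd-tags/tag/raw')}
    return {
        'shared': author & crowd,
        'author_only': author - crowd,
        'crowd_only': crowd - author,
    }


# Invented sample data, only here to make the sketch runnable.
SAMPLE = """
<soundcollection>
  <sound id="1">
    <author-tags>
      <tag label="vintage"/>
      <tag label="door"/>
    </author-tags>
    <crowd-tags>
      <tag><raw label="door"/></tag>
      <tag><raw label="slam"/><raw label="bang"/></tag>
    </crowd-tags>
  </sound>
</soundcollection>
"""

root = ET.fromstring(SAMPLE)
comparison = compare_tags(root.find('./sound'))
print(comparison)
```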

Code & Replication

Please find our code in the scripts folder. Our code was written in a combination of Python 2 (files 0-4) and Python 3 (files 5-8). To replicate our work, run the scripts in order.

Requirements

Files 0 and 1 require unicodecsv to be installed.
File 1 requires a distributional model in Word2Vec format. We used the GoogleNews model from here, which was trained on 100bn words.
Files 2-4 require the CrowdTruth framework to be installed.
Files 2-3 require the requests library to interface with the CrowdTruth framework.
Files 3-8 require the lxml library to parse/generate XML.
File 6 requires the tabulate library.
File 8 requires matplotlib-venn to be installed.
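
Before running the scripts, it can help to check which of these Python libraries are importable. The sketch below is not part of the original scripts. It needs Python 3, only reports on the interpreter that runs it (so it says nothing about the Python 2 environment used for files 0-4), and does not cover the CrowdTruth framework or the Word2Vec model. The helper name missing_modules is made up.

```python
import importlib.util

# Import names of the libraries listed above.
# Note that matplotlib-venn is imported as matplotlib_venn.
REQUIRED_MODULES = [
    'unicodecsv',
    'requests',
    'lxml',
    'tabulate',
    'matplotlib_venn',
]


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


print('Missing:', missing_modules(REQUIRED_MODULES))
```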
