This repository contains code for synthesizing audio from gesture data, using user-defined paired gesture/audio data.
- Clone the repository:
  ```
  git clone https://github.com/mhrice/gesture-to-audio.git
  cd gesture-to-audio
  ```
- Create and activate a virtual environment:
  ```
  python3 -m venv env
  source env/bin/activate
  ```
- Install the required packages:
  ```
  pip install -e .
  ```
If you run into issues from accidentally installing GPU-specific torch builds, recreate the environment and install torch explicitly first:
```
deactivate
rm -rf env
python3 -m venv env
source env/bin/activate
pip install torch torchaudio torchvision
pip install -e .
```
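To confirm the environment ended up in the state you expect, a quick version check like the one below can help; it only uses standard torch/torchaudio attributes and reports whether CUDA is visible to this install.

```python
# Sanity check for the torch installation: print versions and CUDA visibility.
import torch
import torchaudio

print("torch:", torch.__version__)
print("torchaudio:", torchaudio.__version__)
print("cuda available:", torch.cuda.is_available())
```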
For training logs, you'll need a free Weights & Biases account. Set up your API key:
```
wandb login
```
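Once you are logged in, the training script should pick up your credentials automatically; what it actually logs is defined in `scripts/train.py`. Purely for orientation, metric logging with the `wandb` library generally looks like the sketch below. The project name, run name, and metric key are placeholders, not values taken from this repository.

```python
import wandb

# Placeholder project/run names; the names used by scripts/train.py may differ.
run = wandb.init(project="gesture-to-audio", name="example-run")

for step in range(100):
    loss = 1.0 / (step + 1)        # dummy value standing in for a real training loss
    wandb.log({"train/loss": loss})

run.finish()
```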
- Prepare your paired gesture/audio dataset. To capture it with the microphone/camera, run the `scripts/record_dataset.py` script. This will save the dataset in a folder named `recorded_dataset/` by default. The `--duration` flag is in milliseconds. (A rough sketch of this kind of capture loop appears after this list.)
  ```
  python scripts/record_dataset.py --duration 200
  ```
- Preprocess the dataset (an illustration of this kind of step also follows the list):
  ```
  python scripts/process_dataset.py recorded_dataset
  ```
- Train the model:
  ```
  python scripts/train.py recorded_dataset --duration 200
  ```
  This will save model checkpoints in the `gesture-to-audio/` directory.
- Synthesize audio from new gesture data (a checkpoint-inspection sketch follows this list):
  ```
  python scripts/non_realtime_test.py /path/to/your/checkpoint.ckpt --duration 200
  ```
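The capture details (camera settings, clip length, file naming) are handled by `scripts/record_dataset.py` itself. Purely as an illustration of one way to capture paired gesture/audio clips, the sketch below grabs a webcam frame with OpenCV while recording a short microphone clip with sounddevice; the output layout and file names here are assumptions, not the format the repository's script actually writes.

```python
import os
import time

import cv2                 # pip install opencv-python
import sounddevice as sd   # pip install sounddevice
import soundfile as sf     # pip install soundfile

DURATION_S = 0.2              # 200 ms clips, mirroring --duration 200
SAMPLE_RATE = 16000
OUT_DIR = "recorded_dataset"  # assumed output folder; the real script may differ

def record_pair(cam: cv2.VideoCapture, index: int) -> None:
    """Capture one webcam frame and one short audio clip, saved side by side."""
    # Start the (non-blocking) audio recording, grab a frame while it runs,
    # then wait for the audio buffer to fill.
    audio = sd.rec(int(DURATION_S * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
    ok, frame = cam.read()
    sd.wait()
    if not ok:
        raise RuntimeError("could not read a frame from the webcam")
    cv2.imwrite(os.path.join(OUT_DIR, f"gesture_{index:04d}.png"), frame)
    sf.write(os.path.join(OUT_DIR, f"audio_{index:04d}.wav"), audio, SAMPLE_RATE)

if __name__ == "__main__":
    os.makedirs(OUT_DIR, exist_ok=True)
    cam = cv2.VideoCapture(0)
    try:
        for i in range(10):     # record ten example pairs
            record_pair(cam, i)
            time.sleep(0.5)     # brief pause between clips
    finally:
        cam.release()
```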
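What `scripts/process_dataset.py` computes is specific to this project. The snippet below only illustrates a common audio-side preprocessing step for this kind of model (resampling plus a log-mel spectrogram via torchaudio); it is not taken from the repository's pipeline, and the sample rate and mel settings are assumptions.

```python
import torch
import torchaudio

TARGET_SR = 16000  # assumed target sample rate

# Illustrative transform: resample to a fixed rate, then take a log-mel spectrogram.
mel = torchaudio.transforms.MelSpectrogram(sample_rate=TARGET_SR, n_mels=64)

def preprocess_clip(path: str) -> torch.Tensor:
    waveform, sr = torchaudio.load(path)
    if sr != TARGET_SR:
        waveform = torchaudio.functional.resample(waveform, sr, TARGET_SR)
    spec = mel(waveform)
    return torch.log(spec + 1e-6)  # log-compress for a more stable training target
```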
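How a checkpoint is consumed is determined by the model code in this repository, and `scripts/non_realtime_test.py` handles that for you. If you just want to inspect a `.ckpt` before running it (the extension suggests a PyTorch Lightning checkpoint, though the training code is the authority), it can be opened as an ordinary torch pickle, as sketched below.

```python
import torch

CKPT_PATH = "path/to/your/checkpoint.ckpt"  # replace with a real checkpoint path

# Lightning-style .ckpt files are regular torch pickles; weights_only=False is
# needed on recent torch versions because the file may contain more than tensors.
ckpt = torch.load(CKPT_PATH, map_location="cpu", weights_only=False)
print(list(ckpt.keys()))                      # e.g. "state_dict", "epoch", ...
state_dict = ckpt.get("state_dict", ckpt)
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))          # peek at the first few parameter shapes
```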
Existing checkpoints for my paired audio/gesture data can be found here.
More project details can be found here.