This repository packages the melody
model from MusicGen as a Truss.
MusicGen is a simple and controllable suite of models for music generation developed by Facebook AI Research. The melody
model accepts both text and audio to condition its outputs.
Utilizing this model for inference can be challenging given the hardware requirements. With Baseten and Truss, inference is dead simple.
First, clone this repository:
```sh
git clone https://github.com/basetenlabs/truss-examples/
cd musicgen-melody-truss
```
Before deployment:
- Make sure you have a Baseten account and API key.
- Install the latest version of Truss: `pip install --upgrade truss`
With `musicgen-melody-truss` as your working directory, you can deploy the model with:

```sh
truss push
```
Paste your Baseten API key if prompted.
For more information, see Truss documentation.
We found this model runs reasonably fast on A10Gs; you can configure the hardware you'd like in `config.yaml`:

```yaml
resources:
  cpu: "3"
  memory: 14Gi
  use_gpu: true
  accelerator: A10G
```
MusicGen takes a list of prompts and a duration in seconds. You can optionally provide a base64-encoded WAV file as a melody to condition on. The model generates one clip per prompt and returns each clip as a base64-encoded WAV file.
```sh
truss predict -d '{"prompts": ["happy rock", "energetic EDM", "sad jazz"], "melody": "b64_encoded_melody", "duration": 8}'
```
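The `b64_encoded_melody` placeholder stands for a base64-encoded WAV file. Here is a minimal sketch of producing that string in Python, assuming a local audio file (the `melody.wav` name is just an example):

```python
import base64
import json

# Encode a local WAV file as a base64 string for the "melody" field.
# "melody.wav" is a placeholder for your own conditioning audio.
with open("melody.wav", "rb") as f:
    b64_encoded_melody = base64.b64encode(f.read()).decode("utf-8")

# Build the same payload used in the command above.
payload = json.dumps(
    {
        "prompts": ["happy rock", "energetic EDM", "sad jazz"],
        "melody": b64_encoded_melody,
        "duration": 8,
    }
)
```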
You'll want to pipe your results into a script such as:
```python
import base64
import json
import sys

# Read the JSON model output from stdin and write each clip to disk.
model_output = json.loads(sys.stdin.read())
for idx, clip in enumerate(model_output["data"]):
    # Each clip is a base64-encoded WAV file.
    with open(f"clip_{idx}.wav", "wb") as f:
        f.write(base64.b64decode(clip))
```
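For example, if you save the script above as `save_clips.py` (the filename is arbitrary), you can run the `truss predict` command shown earlier and pipe its output straight into `python save_clips.py`.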
You can also invoke your model via a REST API:

```sh
curl -X POST "https://app.baseten.co/models/YOUR_MODEL_ID/predict" \
     -H "Content-Type: application/json" \
     -H 'Authorization: Api-Key {YOUR_API_KEY}' \
     -d '{
           "prompts": ["happy rock", "energetic EDM", "sad jazz"],
           "melody": "b64_encoded_melody",
           "duration": 8
         }'
```
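If you prefer calling the endpoint from Python, here is a minimal sketch using the `requests` library. It assumes the response body carries the same `data` field as the `truss predict` output handled above; the model ID and API key are placeholders:

```python
import base64

import requests

# Placeholders: substitute your deployed model ID and Baseten API key.
url = "https://app.baseten.co/models/YOUR_MODEL_ID/predict"
headers = {"Authorization": "Api-Key YOUR_API_KEY"}
payload = {
    "prompts": ["happy rock", "energetic EDM", "sad jazz"],
    # Optionally include "melody": "<base64-encoded WAV>" to condition on audio.
    "duration": 8,
}

resp = requests.post(url, headers=headers, json=payload)
resp.raise_for_status()

# Assumes the response JSON exposes the clips under "data", matching the
# truss predict output above; adjust if your response shape differs.
for idx, clip in enumerate(resp.json()["data"]):
    with open(f"clip_{idx}.wav", "wb") as f:
        f.write(base64.b64decode(clip))
```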