audio_common

This repository provides a set of ROS 2 packages for audio. It includes a C++ implementation for capturing and playing audio data using PortAudio.

ROS 2 Distro	Branch
Foxy	`main`
Galactic	`main`
Humble	`main`
Iron	`main`
Jazzy	`main`
Kilted	`main`
Rolling	`main`

Installation

cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build

Docker

You can build a Docker image to test audio_common. Run the following command from the root of the audio_common directory.

docker build -t audio_common .

After the image is created, run a docker container with the following command.

docker run -it --rm --device /dev/snd audio_common

Nodes

audio_capturer_node

Node to capture audio data from a microphone and publish it to the audio topic.

Parameters

format: Specifies the audio format to be used for capturing. Possible values are:
- 1 (paFloat32 - 32-bit floating point)
- 2 (paInt32 - 32-bit integer)
- 8 (paInt16 - 16-bit integer)
- 16 (paInt8 - 8-bit integer)
- 32 (paUInt8 - 8-bit unsigned integer)
Default: 8 (paInt16)

The integer values correspond to PortAudio sample format flags.
channels: The number of audio channels to capture. Typically, 1 for mono and 2 for stereo. Default: 1
rate: The sample rate, i.e., the number of samples captured per second. Default: 16000
chunk: The size of each audio frame. Default: 512
device: The ID of the audio input device. A value of -1 indicates that the default audio input device should be used. Default: -1
frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
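
To make the relationship between these parameters concrete, the following Python sketch maps each format value to its sample width and estimates how much audio one chunk holds. It assumes that chunk counts frames per channel, as PortAudio's frames-per-buffer argument does; the names SAMPLE_FORMATS, chunk_duration, and chunk_bytes are illustrative and not part of audio_common.

```python
# PortAudio sample format flags (values from portaudio.h) and bytes per sample.
SAMPLE_FORMATS = {
    1: ("paFloat32", 4),
    2: ("paInt32", 4),
    8: ("paInt16", 2),
    16: ("paInt8", 1),
    32: ("paUInt8", 1),
}


def chunk_duration(chunk: int, rate: int) -> float:
    """Seconds of audio in one chunk (assumes chunk counts frames per channel)."""
    return chunk / rate


def chunk_bytes(chunk: int, channels: int, fmt: int) -> int:
    """Raw size in bytes of one chunk for the given format flag."""
    _, width = SAMPLE_FORMATS[fmt]
    return chunk * channels * width


if __name__ == "__main__":
    # Defaults of audio_capturer_node: format=8, channels=1, rate=16000, chunk=512
    print(chunk_duration(512, 16000))  # 0.032
    print(chunk_bytes(512, 1, 8))      # 1024
```

Under that assumption, the default configuration yields one chunk roughly every 32 ms.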

ROS 2 Interfaces

audio: Topic to publish the audio data captured from the microphone. Type: audio_common_msgs/msg/AudioStamped

audio_player_node

Node to play the audio data obtained from the audio topic.

Parameters

channels: The number of audio channels to play. Typically, 1 for mono and 2 for stereo. Default: 2
- The node automatically handles conversion between mono and stereo formats if needed.
device: The ID of the audio output device. A value of -1 indicates that the default audio output device should be used. Default: -1
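
As an illustration of what a mono/stereo conversion can look like, here is a minimal Python sketch that works on interleaved integer samples. It is not the node's actual implementation (the node is written in C++); the function names and the choice of averaging the two channels when downmixing are assumptions made for this example.

```python
from typing import List


def mono_to_stereo(samples: List[int]) -> List[int]:
    """Duplicate each mono sample into an interleaved left/right pair."""
    out: List[int] = []
    for s in samples:
        out.extend((s, s))
    return out


def stereo_to_mono(samples: List[int]) -> List[int]:
    """Average each interleaved left/right pair into one sample (floor division)."""
    if len(samples) % 2 != 0:
        raise ValueError("interleaved stereo data needs an even sample count")
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]


if __name__ == "__main__":
    print(mono_to_stereo([1, 2]))              # [1, 1, 2, 2]
    print(stereo_to_mono([100, 200, -50, 50]))  # [150, 0]
```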

ROS 2 Interfaces

audio: Topic subscriber to get the audio data to be played. Type: audio_common_msgs/msg/AudioStamped

music_node

Node to play music from audio files in WAV format.

Parameters

chunk: The size of each audio frame. Default: 2048
frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
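
The idea of streaming a WAV file in fixed-size pieces can be sketched with Python's standard wave module. This is only an illustration of chunked reading, assuming chunk is measured in frames; read_wav_chunks is a hypothetical helper and does not reflect how music_node is implemented.

```python
import wave
from typing import Iterator


def read_wav_chunks(path: str, chunk: int = 2048) -> Iterator[bytes]:
    """Yield raw PCM data from a WAV file, at most `chunk` frames at a time."""
    with wave.open(path, "rb") as wav:
        while True:
            data = wav.readframes(chunk)
            if not data:
                # End of file: no more frames to read
                break
            yield data
```

Each yielded block holds up to chunk frames; the last one is shorter when the file length is not a multiple of the chunk size.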

ROS 2 Interfaces

audio: Topic to publish the audio data from the files. Type: audio_common_msgs/msg/AudioStamped
music_play: Service to play audio files. Type: audio_common_msgs/srv/MusicPlay
- Parameters:
  - audio: Name of a built-in audio sample (e.g., "elevator")
  - file_path: Path to a custom WAV file (ignored if audio is specified)
  - loop: Boolean to indicate if the audio should loop. Default: false
music_stop: Service to stop the currently playing music. Type: std_srvs/srv/Trigger
music_pause: Service to pause the currently playing music. Type: std_srvs/srv/Trigger
music_resume: Service to resume paused music. Type: std_srvs/srv/Trigger
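
The precedence between audio and file_path in a music_play request can be sketched as follows. The helper music_play_request is hypothetical; it builds a JSON string, which is also valid YAML flow syntax and so should be usable as the request argument of ros2 service call.

```python
import json


def music_play_request(audio: str = "", file_path: str = "", loop: bool = False) -> str:
    """Build the request argument for `ros2 service call /music_play ...`."""
    if not audio and not file_path:
        raise ValueError("set either audio or file_path")
    # file_path is ignored when audio is given, so it is left out in that case
    request = {"audio": audio} if audio else {"file_path": file_path}
    request["loop"] = loop
    return json.dumps(request)


if __name__ == "__main__":
    print(music_play_request(audio="elevator", loop=True))
    # {"audio": "elevator", "loop": true}
```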

tts_node

Node to generate audio from text (TTS) using espeak.

Parameters

chunk: The size of each audio frame. Default: 4096
frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""

ROS 2 Interfaces

audio: Topic publisher to send the audio data generated by the TTS. Type: audio_common_msgs/msg/AudioStamped
say: Action to generate audio data from a text. Type: audio_common_msgs/action/TTS
- Goal:
  - text: The text to convert to speech
  - language: The language to use for speech synthesis. Default: "en"
  - volume: The volume of the generated speech (0.0-1.0). Default: 1.0
  - rate: The speech rate (1.0 is normal speed). Default: 1.0
- Feedback:
  - audio: The audio being currently played
- Result:
  - text: The text that was converted to speech

Demos

Audio Capturer/Player

ros2 run audio_common audio_capturer_node

ros2 run audio_common audio_player_node
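
Both commands above start the nodes with their default parameters. To override a parameter, ROS 2 accepts --ros-args -p name:=value after the executable name. The following Python sketch builds such a command line; ros2_run_command is a hypothetical helper written for this example, and the parameter values are arbitrary.

```python
import shlex
from typing import Dict, Optional, Union

ParamValue = Union[bool, int, float, str]


def ros2_run_command(package: str, executable: str,
                     params: Optional[Dict[str, ParamValue]] = None) -> str:
    """Build a `ros2 run` command line with optional parameter overrides."""
    argv = ["ros2", "run", package, executable]
    if params:
        argv.append("--ros-args")
        for name, value in params.items():
            if isinstance(value, bool):
                # Parameter values are parsed as YAML, so booleans are lowercase
                value = str(value).lower()
            argv += ["-p", f"{name}:={value}"]
    return shlex.join(argv)


if __name__ == "__main__":
    print(ros2_run_command("audio_common", "audio_capturer_node",
                           {"rate": 44100, "channels": 2}))
    # ros2 run audio_common audio_capturer_node --ros-args -p rate:=44100 -p channels:=2
```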

TTS

ros2 run audio_common tts_node

ros2 run audio_common audio_player_node

ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"

Advanced TTS example with additional parameters:

ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World', 'language': 'en-us', 'volume': 0.8, 'rate': 1.2}"

Music Player

ros2 run audio_common music_node

ros2 run audio_common audio_player_node

Play a built-in sample:

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"

Play a custom WAV file:

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{file_path: '/path/to/your/file.wav'}"

Play with looping enabled:

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator', loop: true}"

Control playback:

ros2 service call /music_pause std_srvs/srv/Trigger "{}"
ros2 service call /music_resume std_srvs/srv/Trigger "{}"
ros2 service call /music_stop std_srvs/srv/Trigger "{}"
