OpenDino toy ESP32 push-to-talk with GPT-4o mini Realtime over raw WebSockets #1930
+211
−1
Summary
This PR adds an example that runs the GPT-4o mini Realtime API directly on an ESP32.
Unlike existing tutorials, it streams audio over raw WebSockets only: no local bridge, Raspberry Pi, or PC is required. Because an ESP32 has very limited RAM and CPU, the code is optimised to keep the entire dialogue stack on the microcontroller. To the best of my knowledge, this is the first open-source example on GitHub that connects an ESP32 directly to OpenAI’s Realtime endpoint.
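For readers unfamiliar with the wire format, here is a minimal sketch of how one captured audio chunk could be framed for the Realtime WebSocket. It is not taken from the code in this PR. It assumes the session uses PCM16 input and that audio is sent as `input_audio_buffer.append` events carrying Base64 text; the helper names `base64Encode` and `makeAudioAppendEvent` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Standard Base64 (RFC 4648) encoder with '=' padding.
// Hypothetical helper, written out so the sketch has no library dependency.
std::string base64Encode(const uint8_t* data, size_t len) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((len + 2) / 3) * 4);
    for (size_t i = 0; i < len; i += 3) {
        // Pack up to three input bytes into a 24-bit group.
        uint32_t n = static_cast<uint32_t>(data[i]) << 16;
        if (i + 1 < len) n |= static_cast<uint32_t>(data[i + 1]) << 8;
        if (i + 2 < len) n |= data[i + 2];
        // Emit four 6-bit symbols, padding where input bytes are missing.
        out.push_back(kAlphabet[(n >> 18) & 0x3F]);
        out.push_back(kAlphabet[(n >> 12) & 0x3F]);
        out.push_back(i + 1 < len ? kAlphabet[(n >> 6) & 0x3F] : '=');
        out.push_back(i + 2 < len ? kAlphabet[n & 0x3F] : '=');
    }
    return out;
}

// Wraps one raw audio chunk in an input_audio_buffer.append event.
// Base64 output contains no characters that need JSON escaping,
// so plain string concatenation is safe here.
std::string makeAudioAppendEvent(const uint8_t* pcm, size_t len) {
    return std::string("{\"type\":\"input_audio_buffer.append\",\"audio\":\"") +
           base64Encode(pcm, len) + "\"}";
}
```

The returned string would be sent as a single WebSocket text frame. Base64 grows the payload by about a third, so on a device with limited RAM the chunk size has to be chosen with the encoded size in mind.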
Motivation
As far as I know, there is currently no reference project showing how to run OpenAI Realtime over WebSockets on a bare ESP32. This example fills that gap by demonstrating:
- Push-to-talk conversation with round-trip latency under one second
- JSON-Schema function calling handled entirely on the microcontroller
- A minimal, reproducible hardware setup that others can copy or adapt
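The push-to-talk flow can be sketched as a small state machine. This is an illustration under stated assumptions, not the implementation in this PR: it assumes server-side turn detection is disabled for the session, so the client marks the end of speech itself by sending `input_audio_buffer.commit` followed by `response.create`. The class name `PushToTalk` and its `SendFn` callback are hypothetical; the callback stands in for whatever function writes a text frame to the WebSocket.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical push-to-talk controller. It only decides which Realtime
// client events to emit; sending them is delegated to the callback.
class PushToTalk {
public:
    using SendFn = std::function<void(const std::string&)>;

    explicit PushToTalk(SendFn send) : send_(std::move(send)) {}

    // Button pressed: drop any stale audio and start a new turn.
    void onButtonDown() {
        if (talking_) return;
        talking_ = true;
        sentAudio_ = false;
        send_("{\"type\":\"input_audio_buffer.clear\"}");
    }

    // Microphone chunk, already Base64-encoded by the caller.
    // Chunks that arrive while the button is up are ignored.
    void onAudioChunk(const std::string& base64Pcm) {
        if (!talking_ || base64Pcm.empty()) return;
        sentAudio_ = true;
        send_("{\"type\":\"input_audio_buffer.append\",\"audio\":\"" +
              base64Pcm + "\"}");
    }

    // Button released: commit the buffered audio and request a reply.
    // Nothing is committed if no audio was sent during this turn.
    void onButtonUp() {
        if (!talking_) return;
        talking_ = false;
        if (!sentAudio_) return;
        send_("{\"type\":\"input_audio_buffer.commit\"}");
        send_("{\"type\":\"response.create\"}");
    }

    bool talking() const { return talking_; }

private:
    SendFn send_;
    bool talking_ = false;
    bool sentAudio_ = false;
};
```

Keeping the event logic separate from the socket makes it testable on a PC: the callback can append to a list in a test and call the WebSocket client's send function on the device.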
By publishing this, I hope to unblock and inspire developers who want to bring real-time LLM interaction to low-cost, resource-constrained devices.
For new content
When contributing new content, read through our contribution guidelines, and mark the following action items as completed:
We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.