Commit 44e625a

JarbasAl, coderabbitai[bot], and dependabot[bot] authored
Release 0.5.0a2 (#195)
* fix: ensure minimum extra plugins version (#179)
* fix: ensure minimum extra plugins version
* Update onnx.txt
* Update requirements.txt
* Update requirements/requirements.txt

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update extras.txt

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Increment Version to 0.4.2a1
* Update Changelog
* Update requirements.txt
* Update release_workflow.yml
* Increment Version to 0.4.2a2
* Update pytest requirement from ~=7.1 to ~=8.1 in /requirements (#95)

Updates the requirements on [pytest](https://github.com/pytest-dev/pytest) to permit the latest version.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@7.1.0...8.1.1)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Increment Version to 0.4.2a3
* Update Changelog
* Add pre-wakeword VAD state for reduced false activations (#189)
* feat: pre-wake vad
* feat: pre-wake vad
* feat: pre-wake vad
* 📝 Add docstrings to `prewakevad` (#190)

Docstrings generation was requested by @JarbasAl.

* #189 (comment)

The following files were modified:

* `ovos_dinkum_listener/voice_loop/voice_loop.py`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* keep some pre-vad chunks for ww

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Increment Version to 0.4.2a4
* Update Changelog
* Default to onnx (#193)
* feat: default to onnx plugins
* feat: default to onnx plugins
* feat: default to onnx plugins
* Increment Version to 0.4.2a5
* Update Changelog
* Update version.py
* Increment Version to 0.5.0a2

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: JarbasAI <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: JarbasAl <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 parents 9ef3d3c + 52ecb42 commit 44e625a

10 files changed

+129
-53
lines changed

.github/workflows/release_workflow.yml

Lines changed: 1 addition & 1 deletion
@@ -1,13 +1,13 @@
 name: Release Alpha and Propose Stable

 on:
+  workflow_dispatch:
   pull_request:
     types: [closed]
     branches: [dev]

 jobs:
   publish_alpha:
-    if: github.event.pull_request.merged == true
     uses: TigreGotico/gh-automations/.github/workflows/publish-alpha.yml@master
     secrets: inherit
     with:

CHANGELOG.md

Lines changed: 31 additions & 3 deletions
@@ -1,12 +1,40 @@
 # Changelog

-## [0.4.1a1](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.1a1) (2025-06-08)
+## [0.4.2a5](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a5) (2025-11-04)

-[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.0...0.4.1a1)
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a4...0.4.2a5)

 **Merged pull requests:**

-- fix: opm 1.X.X compat [\#177](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/177) ([JarbasAl](https://github.com/JarbasAl))
+- Default to onnx [\#193](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/193) ([JarbasAl](https://github.com/JarbasAl))
+
+## [0.4.2a4](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a4) (2025-11-04)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a3...0.4.2a4)
+
+**Merged pull requests:**
+
+- Add pre-wakeword VAD state for reduced false activations [\#189](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/189) ([JarbasAl](https://github.com/JarbasAl))
+
+## [0.4.2a3](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a3) (2025-06-18)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a2...0.4.2a3)
+
+**Merged pull requests:**
+
+- Update pytest requirement from ~=7.1 to ~=8.1 in /requirements [\#95](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/95) ([dependabot[bot]](https://github.com/apps/dependabot))
+
+## [0.4.2a2](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a2) (2025-06-18)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a1...0.4.2a2)
+
+## [0.4.2a1](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a1) (2025-06-12)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.1...0.4.2a1)
+
+**Merged pull requests:**
+
+- fix: ensure minimum extra plugins version [\#179](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/179) ([JarbasAl](https://github.com/JarbasAl))

README.md

Lines changed: 6 additions & 12 deletions
@@ -4,20 +4,10 @@ Documentation can be found in [the technical manual](https://openvoiceos.github.

 ## Install

-`pip install ovos-dinkum-listener[extras]` to install this package and the default
-plugins. Note that by default, either `tensorflow` or `tflite_runtime` will need
-to be installed separately for wakeword detection.
+`pip install ovos-dinkum-listener[extras]` to install this package and the default plugins.

-> If unable to install tflite_runtime in your platform, you can find wheels
-> here https://whl.smartgic.io/. eg, for pyhon 3.11 in x86
-> `pip install https://whl.smartgic.io/tflite_runtime-2.13.0-cp311-cp311-linux_x86_64.whl`
+Without `extras` you will also need to manually install, and possibly configure STT, WW, and VAD modules as described below.

-Without `extras`, wakeword and STT audio upload will be disabled unless you install
-[`ovos-backend-client`](https://github.com/OpenVoiceOS/ovos-backend-client) separately. You will also need to manually install,
-and possibly configure STT, WW, and VAD modules as described below.
-
-Using [ovos-vad-plugin-silero](https://github.com/OpenVoiceOS/ovos-vad-plugin-silero)
-is strongly recommended

 ## Configuration

@@ -42,6 +32,10 @@ non exhaustive list of config options
     "microphone": {
       "module": "ovos-microphone-plugin-alsa"
     },
+    // If enabled will only check for wakeword if VAD also detected speech
+    // this should reduce false activations
+    "vad_pre_wake_enabled": true,
+    // Voice Activity Detection is used to determine when users are speaking
     VAD": {
       // recommended plugin: "ovos-vad-plugin-silero"
       "module": "ovos-vad-plugin-silero",
ovos_dinkum_listener/version.py

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 # START_VERSION_BLOCK
 VERSION_MAJOR = 0
-VERSION_MINOR = 4
-VERSION_BUILD = 1
-VERSION_ALPHA = 0
+VERSION_MINOR = 5
+VERSION_BUILD = 0
+VERSION_ALPHA = 2
 # END_VERSION_BLOCK
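
For orientation, the four constants in the version block line up with the release name in the commit title (0.5.0a2). A minimal sketch of how such constants could be joined into a version string follows; `format_version` is a hypothetical helper written for this illustration, and this diff does not show how the package itself assembles the string.

```python
# Values taken from the new side of the version.py diff.
VERSION_MAJOR = 0
VERSION_MINOR = 5
VERSION_BUILD = 0
VERSION_ALPHA = 2


def format_version(major: int, minor: int, build: int, alpha: int) -> str:
    """Join version parts; append an 'aN' suffix only for alpha builds.

    Hypothetical helper for illustration, not code from the repository.
    """
    version = f"{major}.{minor}.{build}"
    if alpha:
        version += f"a{alpha}"
    return version


if __name__ == "__main__":
    print(format_version(VERSION_MAJOR, VERSION_MINOR, VERSION_BUILD, VERSION_ALPHA))
```

Under this assumed scheme, an alpha value of 0 yields a plain release string, which is consistent with the old block (0, 4, 1, 0) and the `0.4.1` tag referenced in the changelog compare links.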

ovos_dinkum_listener/voice_loop/voice_loop.py

Lines changed: 79 additions & 15 deletions
@@ -10,24 +10,23 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-import audioop
+
 import time
 from collections import deque
 from dataclasses import dataclass, field
 from enum import Enum
-from threading import Event
 from typing import Callable, Deque, Optional

+import audioop
 from ovos_config import Configuration
 from ovos_plugin_manager.stt import StreamingSTT
+from ovos_plugin_manager.templates.microphone import Microphone
 from ovos_plugin_manager.vad import VADEngine
 from ovos_utils.log import LOG
-from ovos_bus_client.session import SessionManager
-from ovos_dinkum_listener.transformers import AudioTransformersService
-from ovos_dinkum_listener.voice_loop.hotwords import HotwordContainer, HotwordState, HotWordException
-from ovos_plugin_manager.templates.microphone import Microphone

 from ovos_dinkum_listener.plugins import FakeStreamingSTT
+from ovos_dinkum_listener.transformers import AudioTransformersService
+from ovos_dinkum_listener.voice_loop.hotwords import HotwordContainer, HotwordState, HotWordException


 class ListeningState(str, Enum):
@@ -44,6 +43,7 @@ class ListeningState(str, Enum):
     BEFORE_COMMAND = "before_cmd"
     IN_COMMAND = "in_cmd"
     AFTER_COMMAND = "after_cmd"
+    PRE_WAKE_VAD = "pre_wake_vad"


 class ListeningMode(str, Enum):
@@ -144,11 +144,18 @@ class DinkumVoiceLoop(VoiceLoop):
     is_muted: bool = False
     _is_running: bool = False
     _chunk_info: ChunkInfo = field(default_factory=ChunkInfo)
+    _vad_window_start: float = 0.0
+
+    # config flag for VAD-before-wakeword feature
+    vad_pre_wake_enabled: bool = field(default_factory=lambda: Configuration().get("listener", {}).get("vad_pre_wake_enabled", False))

     @property
     def running(self) -> bool:
         """
-        Return true while the loop is running
+        Indicates whether the voice loop is currently running.
+
+        Returns:
+            `true` if the loop is running, `false` otherwise.
         """
         return self._is_running is True

@@ -159,13 +166,12 @@ def reset_speech_timer(self):

     def start(self):
         """
-        Start the Voice Loop; sets the listening mode based on configuration and
-        prepares the loop to be run.
+        Initialize and start the voice loop using configured listening mode.
+
+        Sets the internal running flag, selects ListeningMode from configuration (continuous, hybrid, or wakeword), sets the initial ListeningState to PRE_WAKE_VAD when vad_pre_wake_enabled is true otherwise DETECT_WAKEWORD, and resets the last wake-word timestamp.
         """

         self._is_running = True
-        self.state = ListeningState.DETECT_WAKEWORD
-        self.last_ww = -1
         listener_config = Configuration().get("listener", {})
         if listener_config.get("continuous_listen", False):
             self.listen_mode = ListeningMode.CONTINUOUS
@@ -174,19 +180,65 @@ def start(self):
         else:
             self.listen_mode = ListeningMode.WAKEWORD

+        # choose initial state based on config
+        if self.vad_pre_wake_enabled:
+            self.state = ListeningState.PRE_WAKE_VAD
+        else:
+            self.state = ListeningState.DETECT_WAKEWORD
+
+        self.last_ww = -1
         LOG.info(f"Listening mode: {self.listen_mode}")
+        LOG.info(f"VAD pre-wake enabled: {self.vad_pre_wake_enabled}")
         LOG.debug(f"STATE: {self.state}")

+    def _pre_wake_vad(self, chunk: bytes):
+        """
+        Monitor an audio chunk with VAD and transition to wake-word detection when speech is detected.
+
+        Sets self._chunk_info.is_speech according to the VAD result. If speech is detected, sets the loop state to ListeningState.DETECT_WAKEWORD and records the current time in self._vad_window_start. If no speech is detected, forwards the chunk to the audio transformers. On VAD errors, logs the error and treats the chunk as non-speech.
+
+        Parameters:
+            chunk (bytes): Raw audio bytes for VAD analysis.
+        """
+        self.hotword_chunks.append(chunk)  # we still keep chunks for wake word detection
+        try:
+            self._chunk_info.is_speech = not self.vad.is_silence(chunk)
+        except Exception as e:
+            LOG.error(f"VAD error in pre-wake: {e}")
+            self._chunk_info.is_speech = False
+        if self._chunk_info.is_speech:
+            LOG.debug("Speech detected - switching to wake word detection")
+            self.state = ListeningState.DETECT_WAKEWORD
+            self._vad_window_start = time.time()
+
+            rewind_chunks = []
+            while self.hotword_chunks:
+                rewind_chunks.append(self.hotword_chunks.popleft())
+
+            n_to_rewind = 5  # TODO from config
+            for chunk in rewind_chunks[-n_to_rewind:]:
+                # feed some pre-VAD detection audio to hotwords
+                # VAD is usually a bit too late
+                self.hotwords.update(chunk)
+        else:
+            self.transformers.feed_audio(chunk)
+
     def run(self):
         """
-        Run the VoiceLoop so long as `self._is_running` is True
+        Run the voice loop state machine, processing incoming audio chunks until the loop is stopped.
+
+        This method reads audio chunks from the microphone and advances the listening finite-state machine (pre-wake VAD, wakeword/hotword detection, waiting/recording/command handling, confirmation and teardown). It feeds audio to transformers and STT as appropriate, updates timers and per-chunk metadata, and invokes configured callbacks (chunk_callback, wake_callback, stt/audio/text callbacks, etc.). The loop continues while self._is_running and exits when the loop is stopped or the microphone read returns no audio.
         """
         # Voice command state
         self.speech_seconds_left = self.speech_seconds
         self.silence_seconds_left = self.silence_seconds
         self.timeout_seconds_left = self.timeout_seconds
-        self.timeout_seconds_with_silence_left = self.timeout_seconds_with_silence
-        self.state = ListeningState.DETECT_WAKEWORD
+        self.timeout_seconds_with_silence_left = self.timeout_seconds_with_silence
+
+        if self.vad_pre_wake_enabled:
+            self.state = ListeningState.PRE_WAKE_VAD
+        else:
+            self.state = ListeningState.DETECT_WAKEWORD

         # Keep hotword/STT audio so they can (optionally) be saved to disk
         self.hotword_chunks = deque(maxlen=self.num_hotword_keep_chunks)
@@ -226,7 +278,12 @@ def run(self):
             # AFTER_COMMAND -> DETECT_HOTWORD
             #

-            if self.state == ListeningState.DETECT_WAKEWORD:
+            if self.state == ListeningState.PRE_WAKE_VAD:
+                self._pre_wake_vad(chunk)  # might change state to ListeningState.DETECT_WAKEWORD
+
+            elif self.state == ListeningState.DETECT_WAKEWORD:
+                if self.vad_pre_wake_enabled and not self._vad_window_start:
+                    self._vad_window_start = time.time()
                 try:
                     if self.listen_mode == ListeningMode.CONTINUOUS:
                         LOG.info(f"Continuous listening mode, updating state")
@@ -238,6 +295,11 @@ def run(self):
                         LOG.info("Hotword detected")
                     else:
                         self.transformers.feed_audio(chunk)
+                        # handle timeout to return to VAD stage
+                        if self.vad_pre_wake_enabled and time.time() - self._vad_window_start > 5:
+                            LOG.debug("Wakeword not found within 5s - returning to PRE_WAKE_VAD")
+                            self.state = ListeningState.PRE_WAKE_VAD
+                            self._vad_window_start = 0
                 except HotWordException as e:
                     if self.hotwords.reload_on_failure:
                         LOG.warning(e)
@@ -808,6 +870,8 @@ def _after_cmd(self, chunk: bytes):
             if self.listen_mode == ListeningMode.CONTINUOUS or \
                     self.listen_mode == ListeningMode.HYBRID:
                 self.state = ListeningState.WAITING_CMD
+            elif self.vad_pre_wake_enabled:
+                self.state = ListeningState.PRE_WAKE_VAD
             else:
                 self.state = ListeningState.DETECT_WAKEWORD
             LOG.debug(f"STATE: {self.state}")
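
The pre-wake gating added in this file can be hard to follow across hunks, so here is a small standalone sketch of the same idea: stay in a VAD-only state until speech is heard, replay the last few buffered chunks to the wake word detector, and fall back to the VAD state if no wake word arrives within a time window. It is a toy model under stated assumptions, not the `DinkumVoiceLoop` code. `PreWakeGate`, `State.WAKEWORD_FOUND`, the injected `clock`, and the two callables are all hypothetical names introduced for illustration. Unlike the diff, which only calls `hotwords.update()` on replayed chunks, the sketch also checks them for a detection.

```python
import time
from collections import deque
from enum import Enum
from typing import Callable, Deque, List


class State(str, Enum):
    PRE_WAKE_VAD = "pre_wake_vad"
    DETECT_WAKEWORD = "detect_ww"
    WAKEWORD_FOUND = "ww_found"  # hypothetical end state, only for this sketch


class PreWakeGate:
    """Toy model of VAD-gated wake word detection (illustrative only)."""

    def __init__(
        self,
        is_silence: Callable[[bytes], bool],
        detect_ww: Callable[[bytes], bool],
        rewind_chunks: int = 5,       # mirrors n_to_rewind = 5 in the diff
        window_seconds: float = 5.0,  # mirrors the hard-coded 5 s timeout
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.is_silence = is_silence
        self.detect_ww = detect_ww
        self.rewind_chunks = rewind_chunks
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = State.PRE_WAKE_VAD
        self.window_start = 0.0
        self.recent: Deque[bytes] = deque(maxlen=50)
        self.fed: List[bytes] = []  # chunks handed to the wake word detector

    def _feed(self, chunk: bytes) -> None:
        # Record the chunk and let the detector decide whether it matched.
        self.fed.append(chunk)
        if self.detect_ww(chunk):
            self.state = State.WAKEWORD_FOUND

    def process(self, chunk: bytes) -> State:
        if self.state == State.PRE_WAKE_VAD:
            self.recent.append(chunk)
            if not self.is_silence(chunk):
                # Speech heard: open the detection window and replay the
                # most recent chunks, since VAD tends to fire a bit late.
                self.state = State.DETECT_WAKEWORD
                self.window_start = self.clock()
                replay = list(self.recent)[-self.rewind_chunks:]
                self.recent.clear()
                for old in replay:
                    self._feed(old)
        elif self.state == State.DETECT_WAKEWORD:
            self._feed(chunk)
            timed_out = self.clock() - self.window_start > self.window_seconds
            if self.state == State.DETECT_WAKEWORD and timed_out:
                # No wake word inside the window: go back to VAD gating.
                self.state = State.PRE_WAKE_VAD
        return self.state
```

Injecting the clock keeps the timeout testable without sleeping; the diff itself calls `time.time()` directly and reads the feature flag from the `listener` section of the configuration.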

requirements/extras.txt

Lines changed: 4 additions & 6 deletions
@@ -1,14 +1,12 @@
 # STT plugins
 ovos-stt-plugin-server>=0.1.2,<1.0.0
-ovos-stt-plugin-chromium>=0.0.1,<1.0.0

 # VAD plugins
-ovos-vad-plugin-noise>=0.1.2,<1.0.0
+ovos-vad-plugin-silero>=0.0.5,<1.0.0

 # Microphone plugins
 ovos-microphone-plugin-sounddevice>=0.0.1,<1.0.0

-# Wake Word plugins
-# note: tflite_runtime also need to be installed
-ovos-ww-plugin-precise-lite>=0.1,<1.0.0
-ovos-ww-plugin-vosk>=0.1,<1.0.0
+# WakeWord plugins
+ovos-ww-plugin-vosk>=0.1.7,<1.0.0
+ovos_ww_plugin_precise_onnx>=0.0.1,<1.0.0

requirements/onnx.txt

Lines changed: 0 additions & 7 deletions
This file was deleted.

requirements/requirements.txt

Lines changed: 4 additions & 4 deletions
@@ -1,5 +1,5 @@
-ovos-plugin-manager>=1.0.2,<2.0.0
-ovos-utils>=0.0.38,<1.0.0
-ovos-config>=0.4.3,<2.0.0
-ovos_bus_client>=0.0.10,<2.0.0
+ovos-plugin-manager>=1.0.2,<3.0.0
+ovos-utils>=0.8.1,<1.0.0
+ovos-config>=1.2.2,<3.0.0
+ovos_bus_client>=1.3.4,<2.0.0
 SpeechRecognition~=3.9

requirements/tests.txt

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-pytest~=7.1
+pytest~=8.4
 ovos-vad-plugin-webrtcvad
 xdoctest~=1.1.5
 pytest-cov>=3.0.0

setup.py

Lines changed: 0 additions & 1 deletion
@@ -84,7 +84,6 @@ def get_description():
     extras_require={
         "extras": required("requirements/extras.txt"),
         "linux": required("requirements/linux.txt"),
-        "onnx": required("requirements/onnx.txt"),
         "tests": required("requirements/tests.txt"),
     },
     classifiers=[
