Release 0.5.0a2 (#195)

JarbasAl · coderabbitai[bot] · dependabot[bot] · web-flow · commit 44e625a8131a · 2025-11-05T18:09:37.000Z
* fix: ensure minimum extra plugins version (#179)
* fix: ensure minimum extra plugins version
* Update onnx.txt
* Update requirements.txt
* Update requirements/requirements.txt
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Update extras.txt
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Increment Version to 0.4.2a1
* Update Changelog
* Update requirements.txt
* Update release_workflow.yml
* Increment Version to 0.4.2a2
* Update pytest requirement from ~=7.1 to ~=8.1 in /requirements (#95)
Updates the requirements on [pytest](https://github.com/pytest-dev/pytest) to permit the latest version.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@7.1.0...8.1.1)
---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Increment Version to 0.4.2a3
* Update Changelog
* Add pre-wakeword VAD state for reduced false activations (#189)
* feat: pre-wake vad
* feat: pre-wake vad
* feat: pre-wake vad
* 📝 Add docstrings to `prewakevad` (#190)
Docstrings generation was requested by @JarbasAl.
* #189 (comment)
The following files were modified:
* `ovos_dinkum_listener/voice_loop/voice_loop.py`
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* keep some pre-vad chunks for ww
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Increment Version to 0.4.2a4
* Update Changelog
* Default to onnx (#193)
* feat: default to onnx plugins
* feat: default to onnx plugins
* feat: default to onnx plugins
* Increment Version to 0.4.2a5
* Update Changelog
* Update version.py
* Increment Version to 0.5.0a2
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: JarbasAI <33701864+JarbasAl@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: JarbasAl <JarbasAl@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
diff --git a/.github/workflows/release_workflow.yml b/.github/workflows/release_workflow.yml
@@ -1,13 +1,13 @@
 name: Release Alpha and Propose Stable
 
 on:
+  workflow_dispatch:
   pull_request:
     types: [closed]
     branches: [dev]
 
 jobs:
   publish_alpha:
-    if: github.event.pull_request.merged == true
     uses: TigreGotico/gh-automations/.github/workflows/publish-alpha.yml@master
     secrets: inherit
     with:
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,12 +1,40 @@
 # Changelog
 
-## [0.4.1a1](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.1a1) (2025-06-08)
+## [0.4.2a5](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a5) (2025-11-04)
 
-[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.0...0.4.1a1)
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a4...0.4.2a5)
 
 **Merged pull requests:**
 
-- fix: opm 1.X.X compat [\#177](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/177) ([JarbasAl](https://github.com/JarbasAl))
+- Default to onnx [\#193](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/193) ([JarbasAl](https://github.com/JarbasAl))
+
+## [0.4.2a4](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a4) (2025-11-04)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a3...0.4.2a4)
+
+**Merged pull requests:**
+
+- Add pre-wakeword VAD state for reduced false activations [\#189](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/189) ([JarbasAl](https://github.com/JarbasAl))
+
+## [0.4.2a3](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a3) (2025-06-18)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a2...0.4.2a3)
+
+**Merged pull requests:**
+
+- Update pytest requirement from ~=7.1 to ~=8.1 in /requirements [\#95](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/95) ([dependabot[bot]](https://github.com/apps/dependabot))
+
+## [0.4.2a2](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a2) (2025-06-18)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.2a1...0.4.2a2)
+
+## [0.4.2a1](https://github.com/OpenVoiceOS/ovos-dinkum-listener/tree/0.4.2a1) (2025-06-12)
+
+[Full Changelog](https://github.com/OpenVoiceOS/ovos-dinkum-listener/compare/0.4.1...0.4.2a1)
+
+**Merged pull requests:**
+
+- fix: ensure minimum extra plugins version [\#179](https://github.com/OpenVoiceOS/ovos-dinkum-listener/pull/179) ([JarbasAl](https://github.com/JarbasAl))
 
 
 
diff --git a/README.md b/README.md
@@ -4,20 +4,10 @@ Documentation can be found in [the technical manual](https://openvoiceos.github.
 
 ## Install
 
-`pip install ovos-dinkum-listener[extras]` to install this package and the default
-plugins. Note that by default, either `tensorflow` or `tflite_runtime` will need
-to be installed separately for wakeword detection.
+`pip install ovos-dinkum-listener[extras]` to install this package and the default plugins.
 
-> If unable to install tflite_runtime in your platform, you can find wheels
-> here https://whl.smartgic.io/. eg, for pyhon 3.11 in x86
-> `pip install https://whl.smartgic.io/tflite_runtime-2.13.0-cp311-cp311-linux_x86_64.whl`
+Without `extras` you will also need to manually install, and possibly configure, STT, WW, and VAD modules as described below.
 
-Without `extras`, wakeword and STT audio upload will be disabled unless you install 
-[`ovos-backend-client`](https://github.com/OpenVoiceOS/ovos-backend-client) separately. You will also need to manually install,
-and possibly configure STT, WW, and VAD modules as described below.
-
-Using [ovos-vad-plugin-silero](https://github.com/OpenVoiceOS/ovos-vad-plugin-silero) 
-is strongly recommended
 
 ## Configuration
 
@@ -42,6 +32,10 @@ non exhaustive list of config options
     "microphone": {
       "module": "ovos-microphone-plugin-alsa"
     },
+    // If enabled will only check for wakeword if VAD also detected speech
+    // this should reduce false activations
+    "vad_pre_wake_enabled": true,
+    // Voice Activity Detection is used to determine when users are speaking
     "VAD": {
      // recommended plugin: "ovos-vad-plugin-silero"
      "module": "ovos-vad-plugin-silero",
diff --git a/ovos_dinkum_listener/version.py b/ovos_dinkum_listener/version.py
@@ -1,6 +1,6 @@
 # START_VERSION_BLOCK
 VERSION_MAJOR = 0
-VERSION_MINOR = 4
-VERSION_BUILD = 1
-VERSION_ALPHA = 0
+VERSION_MINOR = 5
+VERSION_BUILD = 0
+VERSION_ALPHA = 2
 # END_VERSION_BLOCK
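
Note on the version block above: the four integers map onto the release string in the PR title. The sketch below assumes the common convention of appending an `aN` suffix only when the alpha counter is non-zero; `format_version` is a hypothetical helper written for illustration, and the project's actual `setup.py` logic is not shown in this diff.

```python
def format_version(major: int, minor: int, build: int, alpha: int) -> str:
    """Build a PEP 440 style version string from the four block values.

    Assumption: an alpha counter of 0 means a stable release (no suffix).
    """
    version = f"{major}.{minor}.{build}"
    if alpha > 0:
        version += f"a{alpha}"
    return version


# Values after this PR (VERSION_MINOR = 5, VERSION_BUILD = 0, VERSION_ALPHA = 2)
print(format_version(0, 5, 0, 2))
# Values before this PR (VERSION_MINOR = 4, VERSION_BUILD = 1, VERSION_ALPHA = 0)
print(format_version(0, 4, 1, 0))
```

Under that assumption the new values give `0.5.0a2` and the old ones give `0.4.1`, which lines up with the `0.4.1...0.4.2a1` compare link in the changelog.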
diff --git a/ovos_dinkum_listener/voice_loop/voice_loop.py b/ovos_dinkum_listener/voice_loop/voice_loop.py
@@ -10,24 +10,23 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-import audioop
+
 import time
 from collections import deque
 from dataclasses import dataclass, field
 from enum import Enum
-from threading import Event
 from typing import Callable, Deque, Optional
 
+import audioop
 from ovos_config import Configuration
 from ovos_plugin_manager.stt import StreamingSTT
+from ovos_plugin_manager.templates.microphone import Microphone
 from ovos_plugin_manager.vad import VADEngine
 from ovos_utils.log import LOG
-from ovos_bus_client.session import SessionManager
-from ovos_dinkum_listener.transformers import AudioTransformersService
-from ovos_dinkum_listener.voice_loop.hotwords import HotwordContainer, HotwordState, HotWordException
-from ovos_plugin_manager.templates.microphone import Microphone
 
 from ovos_dinkum_listener.plugins import FakeStreamingSTT
+from ovos_dinkum_listener.transformers import AudioTransformersService
+from ovos_dinkum_listener.voice_loop.hotwords import HotwordContainer, HotwordState, HotWordException
 
 
 class ListeningState(str, Enum):
@@ -44,6 +43,7 @@ class ListeningState(str, Enum):
     BEFORE_COMMAND = "before_cmd"
     IN_COMMAND = "in_cmd"
     AFTER_COMMAND = "after_cmd"
+    PRE_WAKE_VAD = "pre_wake_vad"
 
 
 class ListeningMode(str, Enum):
@@ -144,11 +144,18 @@ class DinkumVoiceLoop(VoiceLoop):
     is_muted: bool = False
     _is_running: bool = False
     _chunk_info: ChunkInfo = field(default_factory=ChunkInfo)
+    _vad_window_start: float = 0.0
+
+    # config flag for VAD-before-wakeword feature
+    vad_pre_wake_enabled: bool = field(default_factory=lambda: Configuration().get("listener", {}).get("vad_pre_wake_enabled", False))
 
     @property
     def running(self) -> bool:
         """
-        Return true while the loop is running
+        Indicates whether the voice loop is currently running.
+
+        Returns:
+            `True` if the loop is running, `False` otherwise.
         """
         return self._is_running is True
     
@@ -159,13 +166,12 @@ def reset_speech_timer(self):
 
     def start(self):
         """
-        Start the Voice Loop; sets the listening mode based on configuration and
-        prepares the loop to be run.
+        Initialize and start the voice loop using configured listening mode.
+
+        Sets the internal running flag, selects ListeningMode from configuration (continuous, hybrid, or wakeword), sets the initial ListeningState to PRE_WAKE_VAD when vad_pre_wake_enabled is true otherwise DETECT_WAKEWORD, and resets the last wake-word timestamp.
         """
 
         self._is_running = True
-        self.state = ListeningState.DETECT_WAKEWORD
-        self.last_ww = -1
         listener_config = Configuration().get("listener", {})
         if listener_config.get("continuous_listen", False):
             self.listen_mode = ListeningMode.CONTINUOUS
@@ -174,19 +180,65 @@ def start(self):
         else:
             self.listen_mode = ListeningMode.WAKEWORD
 
+        # choose initial state based on config
+        if self.vad_pre_wake_enabled:
+            self.state = ListeningState.PRE_WAKE_VAD
+        else:
+            self.state = ListeningState.DETECT_WAKEWORD
+
+        self.last_ww = -1
         LOG.info(f"Listening mode: {self.listen_mode}")
+        LOG.info(f"VAD pre-wake enabled: {self.vad_pre_wake_enabled}")
         LOG.debug(f"STATE: {self.state}")
 
+    def _pre_wake_vad(self, chunk: bytes):
+        """
+        Monitor an audio chunk with VAD and transition to wake-word detection when speech is detected.
+
+        Sets self._chunk_info.is_speech according to the VAD result. If speech is detected, sets the loop state to ListeningState.DETECT_WAKEWORD and records the current time in self._vad_window_start. If no speech is detected, forwards the chunk to the audio transformers. On VAD errors, logs the error and treats the chunk as non-speech.
+
+        Parameters:
+            chunk (bytes): Raw audio bytes for VAD analysis.
+        """
+        self.hotword_chunks.append(chunk)  # we still keep chunks for wake word detection
+        try:
+            self._chunk_info.is_speech = not self.vad.is_silence(chunk)
+        except Exception as e:
+            LOG.error(f"VAD error in pre-wake: {e}")
+            self._chunk_info.is_speech = False
+        if self._chunk_info.is_speech:
+            LOG.debug("Speech detected - switching to wake word detection")
+            self.state = ListeningState.DETECT_WAKEWORD
+            self._vad_window_start = time.time()
+
+            rewind_chunks = []
+            while self.hotword_chunks:
+                rewind_chunks.append(self.hotword_chunks.popleft())
+
+            n_to_rewind = 5 # TODO from config
+            for chunk in rewind_chunks[-n_to_rewind:]:
+                # feed some pre-VAD detection audio to hotwords
+                # VAD is usually a bit too late
+                self.hotwords.update(chunk)
+        else:
+            self.transformers.feed_audio(chunk)
+
     def run(self):
         """
-        Run the VoiceLoop so long as `self._is_running` is True
+        Run the voice loop state machine, processing incoming audio chunks until the loop is stopped.
+
+        This method reads audio chunks from the microphone and advances the listening finite-state machine (pre-wake VAD, wakeword/hotword detection, waiting/recording/command handling, confirmation and teardown). It feeds audio to transformers and STT as appropriate, updates timers and per-chunk metadata, and invokes configured callbacks (chunk_callback, wake_callback, stt/audio/text callbacks, etc.). The loop continues while self._is_running and exits when the loop is stopped or the microphone read returns no audio.
         """
         # Voice command state
         self.speech_seconds_left = self.speech_seconds
         self.silence_seconds_left = self.silence_seconds
         self.timeout_seconds_left = self.timeout_seconds
-        self.timeout_seconds_with_silence_left = self.timeout_seconds_with_silence        
-        self.state = ListeningState.DETECT_WAKEWORD
+        self.timeout_seconds_with_silence_left = self.timeout_seconds_with_silence
+
+        if self.vad_pre_wake_enabled:
+            self.state = ListeningState.PRE_WAKE_VAD
+        else:
+            self.state = ListeningState.DETECT_WAKEWORD
 
         # Keep hotword/STT audio so they can (optionally) be saved to disk
         self.hotword_chunks = deque(maxlen=self.num_hotword_keep_chunks)
@@ -226,7 +278,12 @@ def run(self):
             # AFTER_COMMAND -> DETECT_HOTWORD
             #
 
-            if self.state == ListeningState.DETECT_WAKEWORD:
+            if self.state == ListeningState.PRE_WAKE_VAD:
+                self._pre_wake_vad(chunk) # might change state to ListeningState.DETECT_WAKEWORD
+
+            elif self.state == ListeningState.DETECT_WAKEWORD:
+                if self.vad_pre_wake_enabled and not self._vad_window_start:
+                    self._vad_window_start = time.time()
                 try:
                     if self.listen_mode == ListeningMode.CONTINUOUS:
                         LOG.info(f"Continuous listening mode, updating state")
@@ -238,6 +295,11 @@ def run(self):
                         LOG.info("Hotword detected")
                     else:
                         self.transformers.feed_audio(chunk)
+                        # handle timeout to return to VAD stage
+                        if self.vad_pre_wake_enabled and time.time() - self._vad_window_start > 5:
+                            LOG.debug("Wakeword not found within 5s - returning to PRE_WAKE_VAD")
+                            self.state = ListeningState.PRE_WAKE_VAD
+                            self._vad_window_start = 0
                 except HotWordException as e:
                     if self.hotwords.reload_on_failure:
                         LOG.warning(e)
@@ -808,6 +870,8 @@ def _after_cmd(self, chunk: bytes):
         if self.listen_mode == ListeningMode.CONTINUOUS or \
                 self.listen_mode == ListeningMode.HYBRID:
             self.state = ListeningState.WAITING_CMD
+        elif self.vad_pre_wake_enabled:
+            self.state = ListeningState.PRE_WAKE_VAD
         else:
             self.state = ListeningState.DETECT_WAKEWORD
         LOG.debug(f"STATE: {self.state}")
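
Note on the voice_loop.py changes above: the new transitions can be hard to follow across hunks, so here is a minimal standalone sketch of them. It is a toy model, not the project's code. `PreWakeGate`, `initial_state`, and the `fed_to_hotword` list are hypothetical names, the `"detect_ww"` enum value is a placeholder, the wake word is never detected, and an injectable `time.monotonic` clock replaces the `time.time()` calls in the diff so the timeout can be exercised without sleeping.

```python
import time
from collections import deque
from enum import Enum
from typing import Any, Callable, Deque, List, Mapping


class State(str, Enum):
    PRE_WAKE_VAD = "pre_wake_vad"
    DETECT_WAKEWORD = "detect_ww"  # placeholder value, not taken from the diff


def initial_state(config: Mapping[str, Any]) -> State:
    """Pick the start state the way start() does; the flag defaults to False."""
    enabled = config.get("listener", {}).get("vad_pre_wake_enabled", False)
    return State.PRE_WAKE_VAD if enabled else State.DETECT_WAKEWORD


class PreWakeGate:
    """Hypothetical toy model of PRE_WAKE_VAD <-> DETECT_WAKEWORD."""

    def __init__(self, is_silence: Callable[[bytes], bool],
                 keep_chunks: int = 10, n_to_rewind: int = 5,
                 window_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self.is_silence = is_silence
        self.n_to_rewind = n_to_rewind
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = State.PRE_WAKE_VAD
        self.window_start = 0.0
        self.kept: Deque[bytes] = deque(maxlen=keep_chunks)
        # Stands in for hotwords.update(chunk) in the real loop.
        self.fed_to_hotword: List[bytes] = []

    def process(self, chunk: bytes) -> State:
        if self.state == State.PRE_WAKE_VAD:
            self.kept.append(chunk)
            if not self.is_silence(chunk):
                # Speech: open the detection window and replay recent audio,
                # since VAD tends to fire slightly after speech begins.
                self.state = State.DETECT_WAKEWORD
                self.window_start = self.clock()
                recent = list(self.kept)
                self.kept.clear()
                self.fed_to_hotword.extend(recent[-self.n_to_rewind:])
        else:
            self.fed_to_hotword.append(chunk)
            if self.clock() - self.window_start > self.window_seconds:
                # No wake word inside the window: fall back to VAD gating.
                self.state = State.PRE_WAKE_VAD
                self.window_start = 0.0
        return self.state
```

With a fake clock (for example `clock=lambda: now[0]`) and `is_silence=lambda c: c == b"\x00"`, feeding silent chunks keeps the gate in `PRE_WAKE_VAD`, the first non-silent chunk moves it to `DETECT_WAKEWORD` and replays the last `n_to_rewind` kept chunks, and advancing the clock past `window_seconds` sends it back to `PRE_WAKE_VAD` on the next chunk.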
diff --git a/requirements/extras.txt b/requirements/extras.txt
@@ -1,14 +1,12 @@
 # STT plugins
 ovos-stt-plugin-server>=0.1.2,<1.0.0
-ovos-stt-plugin-chromium>=0.0.1,<1.0.0
 
 # VAD plugins
-ovos-vad-plugin-noise>=0.1.2,<1.0.0
+ovos-vad-plugin-silero>=0.0.5,<1.0.0
 
 # Microphone plugins
 ovos-microphone-plugin-sounddevice>=0.0.1,<1.0.0
 
-# Wake Word plugins
-# note: tflite_runtime also need to be installed
-ovos-ww-plugin-precise-lite>=0.1,<1.0.0
-ovos-ww-plugin-vosk>=0.1,<1.0.0
+# WakeWord plugins
+ovos-ww-plugin-vosk>=0.1.7,<1.0.0
+ovos_ww_plugin_precise_onnx>=0.0.1,<1.0.0
diff --git a/requirements/onnx.txt b/requirements/onnx.txt
diff --git a/requirements/requirements.txt b/requirements/requirements.txt
@@ -1,5 +1,5 @@
-ovos-plugin-manager>=1.0.2,<2.0.0
-ovos-utils>=0.0.38,<1.0.0
-ovos-config>=0.4.3,<2.0.0
-ovos_bus_client>=0.0.10,<2.0.0
+ovos-plugin-manager>=1.0.2,<3.0.0
+ovos-utils>=0.8.1,<1.0.0
+ovos-config>=1.2.2,<3.0.0
+ovos_bus_client>=1.3.4,<2.0.0
 SpeechRecognition~=3.9
diff --git a/requirements/tests.txt b/requirements/tests.txt
@@ -1,4 +1,4 @@
-pytest~=7.1
+pytest~=8.4
 ovos-vad-plugin-webrtcvad
 xdoctest~=1.1.5
 pytest-cov>=3.0.0
diff --git a/setup.py b/setup.py
@@ -84,7 +84,6 @@ def get_description():
     extras_require={
         "extras": required("requirements/extras.txt"),
         "linux": required("requirements/linux.txt"),
-        "onnx": required("requirements/onnx.txt"),
         "tests": required("requirements/tests.txt"),
     },
     classifiers=[
