[WIP] Add initial GPU support #4
Open
edurenye wants to merge 22 commits into rhasspy:master from edurenye:gpu
+739
−9
Commits:
1789209 Add initial GPU support (edurenye)
0f1c13f Add support for both GPU and NONGPU Dockerfiles (edurenye)
4180e2f Add docker compose and instructions (edurenye)
5390a78 Make whisper use cuda (edurenye)
daa59aa Update Dockerfiles and add Vosk (edurenye)
0834ae7 Create __main__.py (baudneo)
3cde252 Create process.py (baudneo)
d4698b2 Update docker-compose.gpu.yml (baudneo)
4ac9e2b Remove volumes, they can be added by extended docker compose files (edurenye)
9416a1c Fixes from rebases (edurenye)
40c9484 Fixes from rebases (edurenye)
2a64f70 Fixes from rebases (edurenye)
535b63a Use YAML anchores, all commented services, and add whisper-cpp (edurenye)
1e0647f Simplify everything using BASE image args (edurenye)
7039b94 Simplify everything using BASE image args (edurenye)
373cb99 Add default values and use env for runtime (edurenye)
9403625 Go back to two files for piper and updated piper gpu (edurenye)
20787c3 Add microwakeword option (edurenye)
c7e78e4 chore: Add rhasspy-speech option (edurenye)
d4dc45e chore: Update images and whisper version (edurenye)
6045af2 Go back to openwakeword 1.8.2 (edurenye)
95b40c0 Fix errors and warnings (edurenye)

@@ -0,0 +1,72 @@
### YAML Anchors ###
x-common: &common
  restart: unless-stopped

####
services:
  wyoming-piper:
    build:
      context: ./piper/
    ports:
      - "10200:10200"
    command: [ "--voice", "en_US-lessac-medium" ]
    <<: [ *common ]

  wyoming-whisper:
    build:
      context: ./whisper/
    ports:
      - "10300:10300"
    command: [ "--model", "tiny-int8", "--language", "en" ]
    <<: [ *common ]

  # wyoming-whispercpp:
  #   build:
  #     context: ./whisper-cpp/
  #   ports:
  #     - "10300:10300"
  #   command: [ "--model", "tiny-int8", "--language", "en" ]
  #   <<: [ *common ]

  wyoming-openwakeword:
    build:
      context: ./openwakeword/
    ports:
      - "10400:10400"
    command: [ "--preload-model", "ok_nabu" ]
    <<: [ *common ]

  # wyoming-porcupine:
  #   build:
  #     context: ./porcupine1/
  #   ports:
  #     - "10400:10400"
  #   <<: [ *common ]

  # wyoming-snowboy:
  #   build:
  #     context: ./snowboy/
  #   ports:
  #     - "10400:10400"
  #   <<: [ *common ]

  # wyoming-vosk:
  #   build:
  #     context: ./vosk/
  #   ports:
  #     - "10400:10400"
  #   <<: [ *common ]

  # wyoming-microwakeword:
  #   build:
  #     context: ./microwakeword/
  #   ports:
  #     - "10400:10400"
  #   <<: [ *common ]

  # wyoming-rhasspy-speech:
  #   build:
  #     context: ./rhasspy-speech/
  #   ports:
  #     - "10300:10300"
  #   <<: [ *common ]

@@ -0,0 +1,81 @@
### YAML Anchors ###
x-gpu: &gpu
  build:
    args:
      - BASE=nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04
  runtime: nvidia
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: ["compute", "utility", "graphics"]

####
services:
  wyoming-piper:
    extends:
      file: docker-compose.base.yml
      service: wyoming-piper
    <<: [ *gpu ]
    build:
      dockerfile: GPU.Dockerfile
      args:
        - EXTRA_DEPENDENCIES=onnxruntime-gpu
        - RUN_SCRIPT=run-gpu.sh
    volumes:
      - ./piper/__main__.py:/opt/venv/lib/python3.11/site-packages/wyoming_piper/__main__.py
      - ./piper/process.py:/opt/venv/lib/python3.11/site-packages/wyoming_piper/process.py

  wyoming-whisper:
    extends:
      file: docker-compose.base.yml
      service: wyoming-whisper
    <<: [ *gpu ]
    command: [ "--model", "tiny-int8", "--language", "en", "--device", "cuda" ]

  # wyoming-whispercpp:
  #   extends:
  #     file: docker-compose.base.yml
  #     service: wyoming-whispercpp
  #   <<: [ *gpu ]
  #   command: [ "--model", "tiny-int8", "--language", "en", "--device", "cuda" ]

  wyoming-openwakeword:
    extends:
      file: docker-compose.base.yml
      service: wyoming-openwakeword
    build:
      dockerfile: GPU.Dockerfile
    <<: [ *gpu ]

  # wyoming-porcupine:
  #   extends:
  #     file: docker-compose.base.yml
  #     service: wyoming-porcupine
  #   <<: [ *gpu ]

  # wyoming-snowboy:
  #   extends:
  #     file: docker-compose.base.yml
  #     service: wyoming-snowboy
  #   <<: [ *gpu ]

  # wyoming-vosk:
  #   extends:
  #     file: docker-compose.base.yml
  #     service: wyoming-vosk
  #   <<: [ *gpu ]

  # wyoming-microwakeword:
  #   extends:
  #     file: docker-compose.base.yml
  #     service: wyoming-microwakeword
  #   <<: [ *gpu ]

  # wyoming-rhasspy-speech:
  #   extends:
  #     file: docker-compose.base.yml
  #     service: wyoming-rhasspy-speech
  #   <<: [ *gpu ]
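
One point in the GPU compose file above may be worth double-checking. YAML merge keys (`<<`) are shallow, so a service that merges `*gpu` and also declares its own `build:` mapping (as `wyoming-piper` and `wyoming-openwakeword` do) would, as far as the merge-key semantics go, keep only its own `build` and lose the anchor's `BASE=...` build arg. The sketch below models that with plain Python dicts. `shallow_merge` and `has_base_arg` are hypothetical helpers for illustration, not part of Compose, and the way Compose's `extends` later combines the result with the base file is not modeled here.

```python
from typing import Any, Dict


def shallow_merge(anchor: Dict[str, Any], explicit: Dict[str, Any]) -> Dict[str, Any]:
    """Model a YAML `<<` merge: explicit keys win, nested mappings are not combined."""
    merged = dict(anchor)
    merged.update(explicit)
    return merged


def has_base_arg(service: Dict[str, Any]) -> bool:
    """True if the service's build args still contain a BASE=... entry."""
    args = service.get("build", {}).get("args", [])
    return any(arg.startswith("BASE=") for arg in args)


# Values copied from the x-gpu anchor (only the keys relevant here).
gpu_anchor = {
    "build": {"args": ["BASE=nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04"]},
    "runtime": "nvidia",
}

# wyoming-piper declares its own `build`, wyoming-whisper does not.
piper_service = {
    "build": {
        "dockerfile": "GPU.Dockerfile",
        "args": ["EXTRA_DEPENDENCIES=onnxruntime-gpu", "RUN_SCRIPT=run-gpu.sh"],
    },
}
whisper_service = {
    "command": ["--model", "tiny-int8", "--language", "en", "--device", "cuda"],
}

piper = shallow_merge(gpu_anchor, piper_service)
whisper = shallow_merge(gpu_anchor, whisper_service)

print("piper keeps BASE arg:", has_base_arg(piper))
print("whisper keeps BASE arg:", has_base_arg(whisper))
```

If this turns out to apply, repeating the `BASE` arg in each service-level `build.args` list would be one way to avoid it.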

@@ -1,4 +1,6 @@
-FROM debian:bookworm-slim
+ARG BASE=debian:bookworm-slim
+FROM $BASE

 ARG TARGETARCH
 ARG TARGETVARIANT

@@ -1,4 +1,6 @@
-FROM debian:bookworm-slim
+ARG BASE=debian:bookworm-slim
+FROM $BASE

 ARG TARGETARCH
 ARG TARGETVARIANT

@@ -0,0 +1,35 @@
ARG BASE=debian:bookworm-slim
FROM $BASE

ARG TARGETARCH
ARG TARGETVARIANT

# Install openWakeWord
WORKDIR /usr/src
ARG WYOMING_OPENWAKEWORD_VERSION='1.8.2'

RUN \
    apt-get update \
    && apt-get install -y --no-install-recommends \
        python3 \
        python3-pip \
        python3-venv \
        libopenblas0 \
    \
    && python3 -m venv .venv \
    && .venv/bin/pip3 install --no-cache-dir uv \
    && .venv/bin/uv pip install --no-cache-dir -U \
        setuptools \
        wheel \
    && .venv/bin/uv pip install --no-cache-dir \
        --exclude-newer 2023-12-12 \
        "wyoming-openwakeword==${WYOMING_OPENWAKEWORD_VERSION}" \
    \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /
COPY run.sh ./

EXPOSE 10400

ENTRYPOINT ["bash", "/run.sh"]

@@ -0,0 +1,59 @@
ARG BASE=debian:bookworm-slim
FROM $BASE

ARG EXTRA_DEPENDENCIES
ARG RUN_SCRIPT='run.sh'
ARG TARGETARCH
ARG TARGETVARIANT

# Install Piper
WORKDIR /usr/src
ARG WYOMING_PIPER_VERSION='1.5.0'
ARG BINARY_PIPER_VERSION='1.2.0'

# Create and activate virtual environment
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

RUN \
    apt-get update \
    && apt-get install -y --no-install-recommends \
        wget \
        curl \
        python3 \
        python3-pip \
        python3-venv \
    \
    && rm -rf /var/lib/apt/lists/* \
    \
    # Create virtual environment
    && python3 -m venv $VIRTUAL_ENV

RUN \
    pip3 install --no-cache-dir -U \
        setuptools \
        wheel \
        $EXTRA_DEPENDENCIES \
    \
    && wget https://github.com/rhasspy/piper-phonemize/releases/download/v1.1.0/piper_phonemize-1.1.0-cp311-cp311-manylinux_2_28_x86_64.whl \
    \
    && mv piper_phonemize-1.1.0-cp311-cp311-manylinux_2_28_x86_64.whl piper_phonemize-1.1.0-py3-none-any.whl \
    \
    && pip3 install --no-cache-dir --force-reinstall --no-deps \
        "piper-tts==${BINARY_PIPER_VERSION}" \
    \
    && pip3 install --no-cache-dir --force-reinstall --no-deps \
        piper_phonemize-1.1.0-py3-none-any.whl \
    \
    && pip3 install --no-cache-dir \
        "wyoming-piper @ https://github.com/rhasspy/wyoming-piper/archive/refs/tags/v${WYOMING_PIPER_VERSION}.tar.gz" \
    \
    && rm -r piper_phonemize-1.1.0-py3-none-any.whl

WORKDIR /
COPY $RUN_SCRIPT ./
ENV RUN_SCRIPT_ENV="/${RUN_SCRIPT}"

EXPOSE 10200

ENTRYPOINT ["bash", "-c", "exec $RUN_SCRIPT_ENV \"${@}\"", "--"]
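
The Dockerfile above renames the downloaded `piper_phonemize` wheel before installing it. As a rough illustration of what that rename changes, the sketch below splits a wheel filename into its standard tag fields (distribution, version, Python tag, ABI tag, platform tag). `parse_wheel_name` is a hypothetical helper written for this note; it does not claim to reproduce pip's own compatibility checks, and the PR does not state why the rename is needed.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python: str
    abi: str
    platform: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split '{dist}-{version}[-{build}]-{python}-{abi}-{platform}.whl' into fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 parts when an optional build tag is present
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    # The last three fields are always python tag, ABI tag, platform tag.
    return WheelTags(parts[0], parts[1], parts[-3], parts[-2], parts[-1])


original = parse_wheel_name(
    "piper_phonemize-1.1.0-cp311-cp311-manylinux_2_28_x86_64.whl"
)
renamed = parse_wheel_name("piper_phonemize-1.1.0-py3-none-any.whl")

print(original)
print(renamed)
```

After the rename the filename advertises a generic `py3-none-any` wheel, although the file contents are still the compiled CPython 3.11 x86_64 build.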

@@ -0,0 +1,231 @@
#!/usr/bin/env python3
import argparse
import asyncio
import json
import logging
from functools import partial
from pathlib import Path
from typing import Any, Dict, Set

from wyoming.info import Attribution, Info, TtsProgram, TtsVoice, TtsVoiceSpeaker
from wyoming.server import AsyncServer

from . import __version__
from .download import find_voice, get_voices
from .handler import PiperEventHandler
from .process import PiperProcessManager

_LOGGER = logging.getLogger(__name__)


async def main() -> None:
    """Main entry point."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--piper",
        required=True,
        help="Path to piper executable",
    )
    parser.add_argument(
        "--voice",
        required=True,
        help="Default Piper voice to use (e.g., en_US-lessac-medium)",
    )
    parser.add_argument("--uri", default="stdio://", help="unix:// or tcp://")
    parser.add_argument(
        "--data-dir",
        required=True,
        action="append",
        help="Data directory to check for downloaded models",
    )
    parser.add_argument(
        "--download-dir",
        help="Directory to download voices into (default: first data dir)",
    )
    #
    parser.add_argument(
        "--speaker", type=str, help="Name or id of speaker for default voice"
    )
    parser.add_argument("--noise-scale", type=float, help="Generator noise")
    parser.add_argument("--length-scale", type=float, help="Phoneme length")
    parser.add_argument("--noise-w", type=float, help="Phoneme width noise")
    #
    parser.add_argument(
        "--auto-punctuation", default=".?!", help="Automatically add punctuation"
    )
    parser.add_argument("--samples-per-chunk", type=int, default=1024)
    parser.add_argument(
        "--max-piper-procs",
        type=int,
        default=1,
        help="Maximum number of piper process to run simultaneously (default: 1)",
    )
    #
    parser.add_argument(
        "--update-voices",
        action="store_true",
        help="Download latest voices.json during startup",
    )
    parser.add_argument(
        "--use-cuda",
        action="store_true",
        help="Use GPU"
    )
    #
    parser.add_argument("--debug", action="store_true", help="Log DEBUG messages")
    parser.add_argument(
        "--version",
        action="version",
        version=__version__,
        help="Print version and exit",
    )
    args = parser.parse_args()

    if not args.download_dir:
        # Default to first data directory
        args.download_dir = args.data_dir[0]

    logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)
    _LOGGER.debug(args)

    # Load voice info
    voices_info = get_voices(args.download_dir, update_voices=args.update_voices)

    # Resolve aliases for backwards compatibility with old voice names
    aliases_info: Dict[str, Any] = {}
    for voice_info in voices_info.values():
        for voice_alias in voice_info.get("aliases", []):
            aliases_info[voice_alias] = {"_is_alias": True, **voice_info}

    voices_info.update(aliases_info)
    voices = [
        TtsVoice(
            name=voice_name,
            description=get_description(voice_info),
            attribution=Attribution(
                name="rhasspy", url="https://github.com/rhasspy/piper"
            ),
            installed=True,
            version=None,
            languages=[
                voice_info.get("language", {}).get(
                    "code",
                    voice_info.get("espeak", {}).get("voice", voice_name.split("_")[0]),
                )
            ],
            speakers=[
                TtsVoiceSpeaker(name=speaker_name)
                for speaker_name in voice_info["speaker_id_map"]
            ]
            if voice_info.get("speaker_id_map")
            else None,
        )
        for voice_name, voice_info in voices_info.items()
        if not voice_info.get("_is_alias", False)
    ]

    custom_voice_names: Set[str] = set()
    if args.voice not in voices_info:
        custom_voice_names.add(args.voice)

    for data_dir in args.data_dir:
        data_dir = Path(data_dir)
        if not data_dir.is_dir():
            continue

        for onnx_path in data_dir.glob("*.onnx"):
            custom_voice_name = onnx_path.stem
            if custom_voice_name not in voices_info:
                custom_voice_names.add(custom_voice_name)

    for custom_voice_name in custom_voice_names:
        # Add custom voice info
        custom_voice_path, custom_config_path = find_voice(
            custom_voice_name, args.data_dir
        )
        with open(custom_config_path, "r", encoding="utf-8") as custom_config_file:
            custom_config = json.load(custom_config_file)
            custom_name = custom_config.get("dataset", custom_voice_path.stem)
            custom_quality = custom_config.get("audio", {}).get("quality")
            if custom_quality:
                description = f"{custom_name} ({custom_quality})"
            else:
                description = custom_name

            lang_code = custom_config.get("language", {}).get("code")
            if not lang_code:
                lang_code = custom_config.get("espeak", {}).get("voice")
                if not lang_code:
                    lang_code = custom_voice_path.stem.split("_")[0]

            voices.append(
                TtsVoice(
                    name=custom_name,
                    description=description,
                    version=None,
                    attribution=Attribution(name="", url=""),
                    installed=True,
                    languages=[lang_code],
                )
            )

    wyoming_info = Info(
        tts=[
            TtsProgram(
                name="piper",
                description="A fast, local, neural text to speech engine",
                attribution=Attribution(
                    name="rhasspy", url="https://github.com/rhasspy/piper"
                ),
                installed=True,
                voices=sorted(voices, key=lambda v: v.name),
                version=__version__,
            )
        ],
    )

    process_manager = PiperProcessManager(args, voices_info)

    # Make sure default voice is loaded.
    # Other voices will be loaded on-demand.
    await process_manager.get_process()

    # Start server
    server = AsyncServer.from_uri(args.uri)

    _LOGGER.info("Ready")
    await server.run(
        partial(
            PiperEventHandler,
            wyoming_info,
            args,
            process_manager,
        )
    )


# -----------------------------------------------------------------------------


def get_description(voice_info: Dict[str, Any]):
    """Get a human readable description for a voice."""
    name = voice_info["name"]
    name = " ".join(name.split("_"))
    quality = voice_info["quality"]

    return f"{name} ({quality})"


# -----------------------------------------------------------------------------


def run():
    asyncio.run(main())


if __name__ == "__main__":
    try:
        run()
    except KeyboardInterrupt:
        pass
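
The alias handling in `main()` above can be hard to follow in diff form, so here is a minimal, self-contained sketch of the same idea: aliases are added to the lookup table with an `_is_alias` marker so they can still be resolved, but they are filtered out of the list of advertised voices. The sample `voices_info` entry and its alias name are made up for illustration, and `add_aliases`, `advertised_names` and `resolve` are hypothetical helpers that only mirror the loops in `__main__.py` and the "key" lookup in `process.py`.

```python
from typing import Any, Dict, List

# Hypothetical voices.json excerpt; the alias name is invented for this example.
voices_info: Dict[str, Dict[str, Any]] = {
    "en_US-lessac-medium": {
        "key": "en_US-lessac-medium",
        "name": "lessac",
        "quality": "medium",
        "aliases": ["en-us-lessac-medium"],
    },
}


def add_aliases(voices: Dict[str, Dict[str, Any]]) -> None:
    """Register each alias as a copy of its voice entry, flagged with _is_alias."""
    aliases_info: Dict[str, Any] = {}
    for voice_info in voices.values():
        for voice_alias in voice_info.get("aliases", []):
            aliases_info[voice_alias] = {"_is_alias": True, **voice_info}
    voices.update(aliases_info)


def advertised_names(voices: Dict[str, Dict[str, Any]]) -> List[str]:
    """Names that would be reported to clients: everything except aliases."""
    return sorted(
        name for name, info in voices.items() if not info.get("_is_alias", False)
    )


def resolve(voices: Dict[str, Dict[str, Any]], voice_name: str) -> str:
    """Map an alias (or canonical name) to the canonical key; unknown names pass through."""
    return voices.get(voice_name, {}).get("key", voice_name)


add_aliases(voices_info)
print(advertised_names(voices_info))
print(resolve(voices_info, "en-us-lessac-medium"))
```

Unknown names fall through `resolve` unchanged, which is what lets custom voices found in the data directories keep working.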

@@ -0,0 +1,175 @@
#!/usr/bin/env python3
import argparse
import asyncio
import json
import logging
import tempfile
import time
from dataclasses import dataclass
from typing import Any, Dict, Optional

from .download import ensure_voice_exists, find_voice

_LOGGER = logging.getLogger(__name__)


@dataclass
class PiperProcess:
    """Info for a running Piper process (one voice)."""

    name: str
    proc: "asyncio.subprocess.Process"
    config: Dict[str, Any]
    wav_dir: tempfile.TemporaryDirectory
    last_used: int = 0

    def get_speaker_id(self, speaker: str) -> Optional[int]:
        """Get speaker by name or id."""
        return _get_speaker_id(self.config, speaker)

    @property
    def is_multispeaker(self) -> bool:
        """True if model has more than one speaker."""
        return _is_multispeaker(self.config)


def _get_speaker_id(config: Dict[str, Any], speaker: str) -> Optional[int]:
    """Get speaker by name or id."""
    speaker_id_map = config.get("speaker_id_map", {})
    speaker_id = speaker_id_map.get(speaker)
    if speaker_id is None:
        try:
            # Try to interpret as an id
            speaker_id = int(speaker)
        except ValueError:
            pass

    return speaker_id


def _is_multispeaker(config: Dict[str, Any]) -> bool:
    """True if model has more than one speaker."""
    return config.get("num_speakers", 1) > 1


# -----------------------------------------------------------------------------


class PiperProcessManager:
    """Manager of running Piper processes."""

    def __init__(self, args: argparse.Namespace, voices_info: Dict[str, Any]):
        self.voices_info = voices_info
        self.args = args
        self.processes: Dict[str, PiperProcess] = {}
        self.processes_lock = asyncio.Lock()

    async def get_process(self, voice_name: Optional[str] = None) -> PiperProcess:
        """Get a running Piper process or start a new one if necessary."""
        voice_speaker: Optional[str] = None
        if voice_name is None:
            # Default voice
            voice_name = self.args.voice

        if voice_name == self.args.voice:
            # Default speaker
            voice_speaker = self.args.speaker

        assert voice_name is not None

        # Resolve alias
        voice_info = self.voices_info.get(voice_name, {})
        voice_name = voice_info.get("key", voice_name)
        assert voice_name is not None

        piper_proc = self.processes.get(voice_name)
        if (piper_proc is None) or (piper_proc.proc.returncode is not None):
            # Remove if stopped
            self.processes.pop(voice_name, None)

            # Start new Piper process
            if self.args.max_piper_procs > 0:
                # Restrict number of running processes
                while len(self.processes) >= self.args.max_piper_procs:
                    # Stop least recently used process
                    lru_proc_name, lru_proc = sorted(
                        self.processes.items(), key=lambda kv: kv[1].last_used
                    )[0]
                    _LOGGER.debug("Stopping process for: %s", lru_proc_name)
                    self.processes.pop(lru_proc_name, None)
                    if lru_proc.proc.returncode is None:
                        try:
                            lru_proc.proc.terminate()
                            await lru_proc.proc.wait()
                        except Exception:
                            _LOGGER.exception("Unexpected error stopping piper process")

            _LOGGER.debug(
                "Starting process for: %s (%s/%s)",
                voice_name,
                len(self.processes) + 1,
                self.args.max_piper_procs,
            )

            ensure_voice_exists(
                voice_name,
                self.args.data_dir,
                self.args.download_dir,
                self.voices_info,
            )

            onnx_path, config_path = find_voice(voice_name, self.args.data_dir)
            with open(config_path, "r", encoding="utf-8") as config_file:
                config = json.load(config_file)

            wav_dir = tempfile.TemporaryDirectory()
            piper_args = [
                "--model",
                str(onnx_path),
                "--config",
                str(config_path),
                "--output_dir",
                str(wav_dir.name),
                "--json-input",  # piper 1.1+
            ]

            if voice_speaker is not None:
                if _is_multispeaker(config):
                    speaker_id = _get_speaker_id(config, voice_speaker)
                    if speaker_id is not None:
                        piper_args.extend(["--speaker", str(speaker_id)])

            if self.args.noise_scale:
                piper_args.extend(["--noise-scale", str(self.args.noise_scale)])

            if self.args.length_scale:
                piper_args.extend(["--length-scale", str(self.args.length_scale)])

            if self.args.noise_w:
                piper_args.extend(["--noise-w", str(self.args.noise_w)])

            if self.args.use_cuda:
                piper_args.extend(["--use-cuda"])

            _LOGGER.debug(
                "Starting piper process: %s args=%s", self.args.piper, piper_args
            )
            piper_proc = PiperProcess(
                name=voice_name,
                proc=await asyncio.create_subprocess_exec(
                    self.args.piper,
                    *piper_args,
                    stdin=asyncio.subprocess.PIPE,
                    stdout=asyncio.subprocess.PIPE,
                    stderr=asyncio.subprocess.DEVNULL,
                ),
                config=config,
                wav_dir=wav_dir,
            )
            self.processes[voice_name] = piper_proc

        # Update used
        piper_proc.last_used = time.monotonic_ns()

        return piper_proc
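
The eviction policy in `PiperProcessManager.get_process` is easier to see without the subprocess handling. The sketch below is a simplified model under stated assumptions: `FakeProcess` and `LruPool` are hypothetical stand-ins, no real Piper process is started, and a simple counter replaces `time.monotonic_ns()` so the ordering is deterministic. It only shows how the least recently used entry is dropped once `max_procs` is reached.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeProcess:
    """Stand-in for PiperProcess; only tracks when it was last requested."""

    name: str
    last_used: int = 0


class LruPool:
    """Keeps at most max_procs entries, evicting the least recently used one."""

    def __init__(self, max_procs: int) -> None:
        self.max_procs = max_procs
        self.processes: Dict[str, FakeProcess] = {}
        self.stopped: List[str] = []  # names evicted so far, in order
        self._clock = itertools.count(1)  # logical clock instead of monotonic_ns()

    def get(self, name: str) -> FakeProcess:
        proc = self.processes.get(name)
        if proc is None:
            if self.max_procs > 0:
                # Make room before starting a new entry.
                while len(self.processes) >= self.max_procs:
                    lru_name, _ = min(
                        self.processes.items(), key=lambda kv: kv[1].last_used
                    )
                    self.processes.pop(lru_name)
                    self.stopped.append(lru_name)
            proc = FakeProcess(name)
            self.processes[name] = proc

        # Every request refreshes the entry, whether it was new or reused.
        proc.last_used = next(self._clock)
        return proc


pool = LruPool(max_procs=2)
for voice in ["voice-a", "voice-b", "voice-a", "voice-c"]:
    pool.get(voice)

print("running:", sorted(pool.processes))
print("stopped:", pool.stopped)
```

With the PR's default of `--max-piper-procs 1`, every switch to a different voice stops the previous process, so the GPU model is reloaded on each switch.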

@@ -0,0 +1,7 @@
#!/usr/bin/env bash
python3 -m wyoming_piper \
    --piper 'piper' \
    --use-cuda \
    --uri 'tcp://0.0.0.0:10200' \
    --data-dir /data \
    --download-dir /data "$@"

@@ -1,4 +1,6 @@
-FROM debian:bookworm-slim
+ARG BASE=debian:bookworm-slim
+FROM $BASE

 ARG TARGETARCH
 ARG TARGETVARIANT

@@ -1,4 +1,6 @@
-FROM debian:bookworm-slim
+ARG BASE=debian:bookworm-slim
+FROM $BASE

 ARG TARGETARCH
 ARG TARGETVARIANT

@@ -1,4 +1,6 @@
-FROM debian:bookworm-slim
+ARG BASE=debian:bookworm-slim
+FROM $BASE

 ARG TARGETARCH
 ARG TARGETVARIANT

@@ -1,4 +1,6 @@
-FROM debian:bookworm-slim
+ARG BASE=debian:bookworm-slim
+FROM $BASE

 ARG TARGETARCH
 ARG TARGETVARIANT
Should we reference documentation on how to set up Docker for GPU? (I can of course add it in a separate PR.)
Yes, good idea!