Add Linux TTS support (espeak/spd-say) #1

Open
mikkel wants to merge 1 commit into difflabai:master from mikkel:linux-tts-support

@mikkel mikkel commented Mar 19, 2026

Summary

  • Add platform-aware TTS in `coach.py`: uses macOS `say` on Darwin, `espeak` or `spd-say` on Linux, and silently skips speech if none is available
  • Update `plank.sh` with a `speak()` wrapper that picks the right TTS command per platform
  • Update README with Linux TTS install instructions

The existing macOS `say` path is unchanged: Linux detection only activates on non-Darwin platforms.
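
For readers without the diff open, the selection order described above could be sketched roughly as follows. This is an illustration only, not the code in `coach.py`: the function name `pick_tts_command` and its `system` and `which` parameters are hypothetical, added so the logic can be exercised without a real TTS engine installed.

```python
import platform
import shutil
from typing import Callable, Optional


def pick_tts_command(
    text: str,
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[list[str]]:
    """Return an argv list for the first usable TTS backend, or None.

    Hypothetical helper: `system` and `which` are injectable for testing.
    """
    system = system or platform.system()
    if system == "Darwin":
        # macOS keeps using `say`, as before.
        return ["say", text]
    # Non-Darwin: try the Linux backends in the order the summary lists them.
    for cmd in ("espeak", "spd-say"):
        if which(cmd):
            return [cmd, text]
    # No backend found: the caller is expected to skip speech silently.
    return None
```

A caller would treat a `None` result as "no voice cue" and carry on with the timer.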

Test plan

  • Run on macOS: verify `say` still works as before
  • Run on Linux with `espeak` installed: verify voice cues play
  • Run on Linux with `spd-say` installed: verify fallback works
  • Run on Linux with no TTS: verify silent graceful degradation

🤖 Generated with Claude Code

Voice coaching was hardcoded to macOS `say`. This adds platform
detection so Linux users get TTS via `espeak` or `spd-say` while
preserving the existing macOS path unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the application's accessibility by introducing Text-to-Speech (TTS) capabilities for Linux users, mirroring the existing functionality for macOS. It refactors the voice coaching logic to intelligently detect the operating system and available TTS engines, providing a seamless experience across platforms while gracefully handling cases where no TTS is installed. This change broadens the user base who can benefit from audible coaching cues.

Highlights

  • Linux TTS Support: Added cross-platform Text-to-Speech (TTS) support, extending voice coaching cues to Linux using espeak or spd-say.
  • Platform-Aware TTS: Implemented a platform-aware speak() wrapper in plank.sh to dynamically select the correct TTS command.
  • Documentation Update: Updated the README.md with clear instructions for installing Linux TTS engines.
  • Graceful Degradation: Ensured graceful degradation by silently skipping voice cues if no TTS engine is available on the system.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds text-to-speech support for Linux using espeak or spd-say, complementing the existing macOS support. The changes in coach.py introduce platform detection for TTS commands, and plank.sh is updated with a wrapper for cross-platform speech. The README is also updated accordingly.

My review found a critical bug in coach.py that would cause a NameError at runtime, along with a suggestion to use a more robust standard library function. I also identified a minor logic issue in the new say_sync behavior and suggested a refactoring for the plank.sh script to improve efficiency and robustness. Overall, a great addition with a few areas for refinement.

Comment thread coach.py
Comment on lines +164 to +178
def _which(cmd: str) -> bool:
    """Check whether *cmd* is on PATH (cached)."""
    if cmd not in _which_cache:
        _which_cache[cmd] = (
            subprocess.call(
                ["which", cmd],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            == 0
        )
    return _which_cache[cmd]


_which_cache: dict[str, bool] = {}

critical

There are two issues in this block:

  1. `_which_cache` is used in `_which` before it is defined, which will cause a NameError at runtime.
  2. The implementation of `_which` using `subprocess.call(['which', ...])` is not robust. If the `which` command is not available on the system, it will raise an unhandled `FileNotFoundError`.

I suggest moving the definition of `_which_cache` before `_which` and using the more robust and standard `shutil.which` to find executables. This will fix the bug and make the code cleaner.

Note: You'll need to add `import shutil` at the top of the file for this suggestion to work.

_which_cache: dict[str, bool] = {}


def _which(cmd: str) -> bool:
    """Check whether *cmd* is on PATH (cached)."""
    if cmd not in _which_cache:
        _which_cache[cmd] = shutil.which(cmd) is not None
    return _which_cache[cmd]
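
As a possible variation on the suggestion above (not something proposed in the PR), the hand-rolled dict could be replaced with `functools.lru_cache`, which removes any question about definition order. The name `which_cached` is hypothetical.

```python
import shutil
from functools import lru_cache


@lru_cache(maxsize=None)
def which_cached(cmd: str) -> bool:
    """Check whether *cmd* is on PATH; results are memoized per command name."""
    # shutil.which returns None when the command is missing instead of raising.
    return shutil.which(cmd) is not None
```

One trade-off to keep in mind: like the dict version, the cached answer will not change if a TTS engine is installed while the program is running.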

Comment thread coach.py
Comment on lines +202 to +205
    if not cmd:
        if wait > 0:
            time.sleep(wait)
        return

medium

This logic makes `say_sync` sleep for `wait` seconds even when no TTS command is available and nothing is spoken. The docstring says 'optionally wait after it finishes', which implies that if speech doesn't happen, the wait should be skipped too. This could cause unexpected delays when `say_sync` is called with a `wait` value on a system without TTS. It is also inconsistent with the previous implementation, which did not wait if the `say` command was not found.

    if not cmd:
        return
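
To show the suggested early return in context, here is a minimal sketch of a blocking speak-then-pause helper. It assumes a signature that is not shown in this thread: `say_sync_sketch`, its `cmd` argv-list parameter, and its boolean return value are all hypothetical.

```python
import subprocess
import time
from typing import Optional, Sequence


def say_sync_sketch(cmd: Optional[Sequence[str]], wait: float = 0.0) -> bool:
    """Run the TTS command to completion, then optionally pause.

    Returns True if the command was launched, False if speech was skipped.
    """
    if not cmd:
        # Nothing will be spoken, so skip the post-speech pause as well.
        return False
    try:
        subprocess.run(list(cmd), check=False)
    except FileNotFoundError:
        # The backend disappeared or was never installed: degrade silently.
        return False
    if wait > 0:
        time.sleep(wait)
    return True
```

Whether skipping the pause is right depends on what callers use `wait` for: if it paces the workout rather than just padding the speech, keeping the sleep may be the intended behavior.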

Comment thread plank.sh
Comment on lines +4 to +12
speak() {
    if command -v say &>/dev/null; then
        say "$1"
    elif command -v espeak &>/dev/null; then
        espeak "$1"
    elif command -v spd-say &>/dev/null; then
        spd-say "$1"
    fi
}

medium

This implementation has two areas for improvement:

  1. It only passes the first argument (`$1`) to the TTS command. Using `"$@"` would make it more robust by handling all arguments.
  2. It checks for the existence of `say`, `espeak`, and `spd-say` on every call. It would be more efficient to perform this check only once when the script starts.

Here is a suggested refactoring that addresses both points by finding the appropriate TTS command once and defining a speak function that uses it.

Suggested change
-speak() {
-    if command -v say &>/dev/null; then
-        say "$1"
-    elif command -v espeak &>/dev/null; then
-        espeak "$1"
-    elif command -v spd-say &>/dev/null; then
-        spd-say "$1"
-    fi
-}
+_tts_command=""
+if command -v say &>/dev/null; then
+    _tts_command="say"
+elif command -v espeak &>/dev/null; then
+    _tts_command="espeak"
+elif command -v spd-say &>/dev/null; then
+    _tts_command="spd-say"
+fi
+speak() {
+    if [ -n "$_tts_command" ]; then
+        "$_tts_command" "$@"
+    fi
+}

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a531898256

Comment thread coach.py
Comment on lines +168 to +171
            subprocess.call(
                ["which", cmd],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,

P1: Avoid shelling out to `which` on unsupported platforms

This helper now invokes `which` before `say()`/`say_sync()` reach their `FileNotFoundError` guard. On Windows, `which` itself is not present, so `_tts_cmd()` raises on the first voice cue and the app crashes instead of silently skipping speech as it did before this change. Using `shutil.which()` or catching lookup failures inside `_which()` would preserve the advertised no-TTS fallback.
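
The second option mentioned here, catching lookup failures inside the helper, might look roughly like the sketch below. `which_guarded` is a hypothetical name, and the sketch leaves out the caching that the PR's `_which` performs.

```python
import subprocess


def which_guarded(cmd: str) -> bool:
    """Return True if the external `which` utility finds *cmd* on PATH.

    If `which` itself cannot be launched, report the command as missing
    rather than letting the exception reach the caller.
    """
    try:
        return (
            subprocess.call(
                ["which", cmd],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            == 0
        )
    except OSError:
        # FileNotFoundError is a subclass of OSError; this covers a missing
        # `which` binary as well as other launch failures.
        return False
```

`shutil.which()` avoids the subprocess entirely, so it is probably the simpler of the two fixes; the guard is shown only because the comment mentions it.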

Comment thread coach.py
Comment on lines +158 to +160
    for cmd in ("espeak", "spd-say"):
        if _which(cmd):
            return [cmd, text]

P2: Pass `-w` to `spd-say` in the synchronous path

When `spd-say` is the only Linux backend, `say_sync()` no longer blocks until the prompt finishes. The spd-say(1) docs say `-w, --wait` will "Wait till the message is spoken or discarded" (https://manpages.debian.org/testing/speech-dispatcher/spd-say.1.en.html), but `_tts_cmd()` currently returns only `['spd-say', text]`. In `timed_hold()`, that makes the 3-2-1 countdown start while "Get in position" is still being spoken, so timed sets begin early on those machines.
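
One way to address this could be to build the argv differently for blocking calls. The sketch below is an assumption about how that might be wired: `build_tts_argv` and its `blocking` parameter are hypothetical, and only the `-w` flag comes from the man page quoted above.

```python
def build_tts_argv(backend: str, text: str, blocking: bool = False) -> list[str]:
    """Build the argv for a TTS backend, adding spd-say's wait flag if needed."""
    if backend == "spd-say" and blocking:
        # -w makes spd-say wait until the message is spoken or discarded.
        return ["spd-say", "-w", text]
    # The other backend needs no extra flag in this sketch.
    return [backend, text]
```

Under this sketch, a synchronous caller would pass `blocking=True` and a fire-and-forget caller would leave the default.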

Comment thread plank.sh
Comment on lines +9 to +10
    elif command -v spd-say &>/dev/null; then
        spd-say "$1"

P2: Make `plank.sh` wait for `spd-say` before sleeping

The new `spd-say` branch changes this script's timing because `spd-say` returns as soon as it submits the utterance unless `-w`/`--wait` is passed; spd-say(1) documents that flag as "Wait till the message is spoken or discarded" (https://manpages.debian.org/testing/speech-dispatcher/spd-say.1.en.html). On Linux systems that have `spd-say` but not `espeak`, the following `sleep 3`, `sleep 40`, and `sleep 60` start before the cue finishes, so spoken prompts overlap the countdown and the hold/rest intervals become shorter than advertised.

@mikkel mikkel left a comment

@/ml2/nanobot/.pr-review-state/review-body.tmp
