mofa-fm: pre-validate voice against ominix-api /v1/voices #48

Open
ymote wants to merge 1 commit into main from fix/validate-voice-against-ominix
ymote commented Apr 24, 2026

Collaborator

Summary

  • fm_tts now calls GET /v1/voices before hitting /v1/audio/tts/qwen3 and rejects unknown voices up-front (with the list of available voices and a hint to use fm_voice_save).
  • fm_voice_list intersects the local catalog with /v1/voices, annotates each entry as registered / orphaned_in_catalog / ominix_only, and appends a summary like "N registered, M orphaned".
  • Both paths degrade gracefully when /v1/voices is unreachable — fm_tts falls through to the TTS call, fm_voice_list emits a warning banner but still returns the catalog.

Symptom this fixes

On octos mini2 a user ran fm_voice_list and saw yangmi + douwentao (saved in mofa-fm's local voices.json). fm_tts({voice: "yangmi"}) succeeded, but the audio was clearly not yangmi. Root cause: ominix-api's ~/.OminiX/models/voices.json on mini2 was empty (/v1/voices returned {"voices": []}). /v1/audio/tts/qwen3 silently substituted a default preset, so the tool reported success but delivered the wrong voice. After this change, the mismatch is surfaced before any synthesis happens, and fm_voice_list flags the offending entry as orphaned_in_catalog.

Test plan

  • cargo test -p mofa-fm — 21 tests pass (7 pre-existing + 14 new).
  • cargo build --release -p mofa-fm — clean.
  • cargo fmt --all -- --check — clean.
  • New tests cover:
    • should_pre_validate_voice_and_reject_unknown
    • should_pass_through_when_voice_registered
    • should_pass_through_when_voice_registered_case_insensitive
    • should_allow_empty_voice_as_server_default
    • should_list_available_when_registry_is_empty
    • should_fall_through_when_voices_endpoint_unreachable (uses a closed TCP port so no HTTP server is needed in CI)
    • should_intersect_catalog_with_registered_in_fm_voice_list
    • should_classify_empty_registered_as_all_orphaned
    • should_match_voices_case_insensitively_in_classifier
    • should_parse_voices_response_and_include_aliases
    • should_handle_empty_voices_response
    • should_error_on_missing_voices_key
    • should_error_on_invalid_json
    • voice_in_registry_is_case_insensitive
  • Manual mini2 verification after redeploying the skill binary: fm_voice_list should now annotate yangmi as orphaned_in_catalog, and fm_tts({voice: "yangmi"}) should return the "not registered" error instead of silently producing the wrong voice.

Constraints respected

  • External binary protocol unchanged (./skill_binary <tool_name> with JSON stdin/stdout).
  • Input schema unchanged — no renamed or removed fields.
  • ominix-api stays an optional dependency of fm_voice_list; warnings surface but the call still succeeds.

fm_tts and fm_voice_list now cross-check mofa-fm's local catalog
against what ominix-api will actually synthesise.

fm_tts: before POSTing to /v1/audio/tts/qwen3, fetch /v1/voices. If
the requested voice isn't registered (and isn't the empty
server-default), return an error up-front listing the available
voices. This plugs the mini2 yangmi symptom: mofa-fm had yangmi
locally, ominix-api had no voices.json entries, and
/v1/audio/tts/qwen3 silently substituted a different preset so the
user heard the wrong voice. When /v1/voices is unreachable, fall
through to the TTS call (better to try than block on a transient
outage).
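
The pre-validation rule for fm_tts can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names validate_requested_voice and voice_in_registry come from the commit message, but their signatures, the Registry enum, and the error wording are assumptions made for the example.

```rust
/// Result of asking ominix-api for its voices (hypothetical type).
/// `Unreachable` models a failed GET /v1/voices.
#[derive(Debug, Clone, PartialEq)]
pub enum Registry {
    Reachable(Vec<String>),
    Unreachable,
}

/// Case-insensitive membership test (ASCII case folding is an assumption).
pub fn voice_in_registry(voice: &str, registered: &[String]) -> bool {
    registered.iter().any(|r| r.eq_ignore_ascii_case(voice))
}

/// Ok(()) means "go ahead and call the TTS endpoint";
/// Err(msg) means "reject before any synthesis happens".
pub fn validate_requested_voice(voice: &str, registry: &Registry) -> Result<(), String> {
    // An empty voice is treated as "use the server default" and always passes.
    if voice.trim().is_empty() {
        return Ok(());
    }
    let registered = match registry {
        // Registry unreachable: fall through rather than block on an outage.
        Registry::Unreachable => return Ok(()),
        Registry::Reachable(voices) => voices,
    };
    if voice_in_registry(voice, registered) {
        return Ok(());
    }
    let available = if registered.is_empty() {
        "(none)".to_string()
    } else {
        registered.join(", ")
    };
    Err(format!(
        "voice '{}' is not registered with ominix-api; available voices: {}. \
         Use fm_voice_save to register it.",
        voice, available
    ))
}
```

Under these assumptions, the mini2 case corresponds to `validate_requested_voice("yangmi", &Registry::Reachable(vec![]))`, which returns an error naming yangmi instead of letting the request reach /v1/audio/tts/qwen3.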

fm_voice_list: intersect catalog with /v1/voices and annotate each
entry as registered, orphaned_in_catalog, or ominix_only. Append a
summary like "N registered, M orphaned (not synth-capable on this
ominix-api)". If /v1/voices is unreachable, emit a warning banner but
still render the catalog.
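
One way to picture the classification is the sketch below. The three status names mirror the annotations described above; the function signatures, the VoiceStatus enum, and the summary_line helper are hypothetical and chosen only for illustration.

```rust
use std::collections::BTreeSet;

/// Annotation attached to each listed voice (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VoiceStatus {
    /// In the local catalog and known to ominix-api.
    Registered,
    /// In the local catalog but missing from /v1/voices.
    OrphanedInCatalog,
    /// Known to ominix-api but absent from the local catalog.
    OminixOnly,
}

/// Compare catalog names with registered names, ignoring case.
/// Catalog entries come first, followed by ominix-only voices.
pub fn classify_voice_entries(
    catalog: &[&str],
    registered: &[&str],
) -> Vec<(String, VoiceStatus)> {
    let reg_lower: BTreeSet<String> = registered.iter().map(|v| v.to_lowercase()).collect();
    let cat_lower: BTreeSet<String> = catalog.iter().map(|v| v.to_lowercase()).collect();

    let mut out = Vec::new();
    for name in catalog {
        let status = if reg_lower.contains(&name.to_lowercase()) {
            VoiceStatus::Registered
        } else {
            VoiceStatus::OrphanedInCatalog
        };
        out.push(((*name).to_string(), status));
    }
    for name in registered {
        if !cat_lower.contains(&name.to_lowercase()) {
            out.push(((*name).to_string(), VoiceStatus::OminixOnly));
        }
    }
    out
}

/// Build the "N registered, M orphaned" part of the summary.
pub fn summary_line(entries: &[(String, VoiceStatus)]) -> String {
    let n = entries.iter().filter(|(_, s)| *s == VoiceStatus::Registered).count();
    let m = entries.iter().filter(|(_, s)| *s == VoiceStatus::OrphanedInCatalog).count();
    format!("{} registered, {} orphaned", n, m)
}
```

With a catalog of yangmi and douwentao and an empty registry, as on mini2, this sketch marks both entries OrphanedInCatalog and the summary reads "0 registered, 2 orphaned".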

Pure helpers (parse_registered_voices, validate_requested_voice,
classify_voice_entries, voice_in_registry) are unit tested; the
unreachable path uses a closed TCP port to exercise graceful
degradation without standing up an HTTP server.
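
The closed-port technique can be approximated with the standard library alone. The sketch below is an assumption about how such a test might obtain a closed port (bind to an ephemeral port, then drop the listener); the PR does not spell out its exact approach, and both function names are hypothetical.

```rust
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::Duration;

/// Reserve an ephemeral loopback port, then release it so that nothing
/// is listening there. Another process could in principle grab the port
/// in between, so this is a best-effort test helper.
pub fn closed_local_addr() -> SocketAddr {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind ephemeral port");
    let addr = listener.local_addr().expect("read local address");
    drop(listener); // closing the listener makes later connects fail
    addr
}

/// Stand-in for the /v1/voices fetch: it only checks whether a TCP
/// connection can be opened, which is enough to model "unreachable".
pub fn registry_reachable(addr: SocketAddr) -> bool {
    TcpStream::connect_timeout(&addr, Duration::from_millis(500)).is_ok()
}
```

A test built this way would point the voices URL at `closed_local_addr()` and assert that the tool still proceeds to the TTS call, or still renders the catalog with a warning, rather than returning an error.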