feat(subtitles): add official bilingual streaming captions #1749
semantic-craft wants to merge 2 commits into
🦋 Changeset detected. Latest commit: cf89f0a. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Contributor trust score: 19/100 (New contributor). This score estimates contributor familiarity with Outcome
Updated automatically when the PR changes or when a maintainer reruns the workflow.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cf89f0ae03
      return

    const normalizedUrl = normalizeUrl(url)
    if (!normalizedUrl || !looksLikeSubtitleUrl(normalizedUrl))
Capture known subtitle track URLs without extensions
On Netflix, the manifest URLs collected from ttDownloadables are often CDN paths such as /nflxvideo/... with no .vtt or caption token; the collector accepts them because the key is downloadUrls. This guard then drops the response even though trackByUrl already identifies the URL as a subtitle. The fetch hook applies the same predicate before calling captureSubtitle, so if backgroundFetch cannot replay the signed or authenticated request, waitForStreamingSubtitleCapture times out and the official subtitles are lost. Treat any normalized URL present in trackByUrl as a capture.
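One way to read this suggestion is as a widened guard, sketched below. The `KnownTrack` shape, the body of `looksLikeSubtitleUrl`, and the helper name `shouldCaptureSubtitle` are assumptions made for illustration; the PR's actual `trackByUrl` value type and URL heuristic are not shown in this thread.

```typescript
// Hypothetical value type; the real trackByUrl entries are not shown in the review.
interface KnownTrack {
  language: string
}

// Stand-in for the PR's looksLikeSubtitleUrl (assumed heuristic, not the real one).
function looksLikeSubtitleUrl(url: string): boolean {
  return /\.(vtt|srt|ttml|dfxp)(\?|$)/i.test(url) || /caption|subtitle|timedtext/i.test(url)
}

// Hypothetical guard: accept a response when the URL is already registered as a
// subtitle track from the manifest, or when it looks like a subtitle URL.
function shouldCaptureSubtitle(
  normalizedUrl: string | null,
  trackByUrl: ReadonlyMap<string, KnownTrack>,
): boolean {
  if (!normalizedUrl)
    return false
  return trackByUrl.has(normalizedUrl) || looksLikeSubtitleUrl(normalizedUrl)
}
```

Using the same helper in both the response guard and the fetch hook would keep the two predicates from drifting apart.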
      playerContainer: "body",
      nativeSubtitles: STREAMING_NATIVE_SUBTITLES_SELECTOR,
    },
    events: {},
Reset the streaming adapter on SPA route changes
For streaming sites, this config supplies no navigateStart/navigateFinish events, and the only notifyNavigation() call in the repo is on the YouTube path. As a result, UniversalVideoAdapter never runs resetForNavigation() when Netflix, Max, or Disney pushes a new /watch/... URL without a reload. After such an SPA navigation, the scheduler and session caches stay bound to the previous video, so active overlays can keep showing stale subtitles until a page reload or manual reset. Add history/webNavigation hooks or platform events for this adapter.
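The detection half of this suggestion can be sketched without any browser APIs, as below. `extractWatchId` and `WatchNavigationTracker` are hypothetical names, and the `onReset` callback merely stands in for whatever would end up calling resetForNavigation(); how URL changes are observed (a history hook, a popstate listener, or a webNavigation event) is left to the caller.

```typescript
// Hypothetical helper: pull the video id out of a /watch/<id> path.
function extractWatchId(url: string): string | null {
  const match = /\/watch\/([^/?#]+)/.exec(url)
  return match?.[1] ?? null
}

// Hypothetical tracker: remembers the current video id and invokes onReset
// only when the id actually changes (query-string changes alone are ignored).
class WatchNavigationTracker {
  private currentId: string | null
  private readonly onReset: (id: string | null) => void

  constructor(initialUrl: string, onReset: (id: string | null) => void) {
    this.currentId = extractWatchId(initialUrl)
    this.onReset = onReset
  }

  // Call this from whichever navigation signal the adapter ends up using.
  // Returns true when a reset was triggered.
  handleUrlChange(url: string): boolean {
    const nextId = extractWatchId(url)
    if (nextId === this.currentId)
      return false
    this.currentId = nextId
    this.onReset(nextId)
    return true
  }
}
```

Comparing video ids rather than whole URLs avoids resetting the scheduler when a player only rewrites tracking parameters on the same title.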
      targetCode: LangCodeISO6393,
    ): StreamingSubtitleTrack | null {
      return getOfficialTrackCandidates(tracks).find(track =>
        resolveTrackLanguage(track) === targetCode || (isChineseCode(targetCode) && isChineseTrack(track))) ?? null
Prefer an exact Chinese subtitle variant first
When a title exposes both Simplified and Traditional Chinese, this .find() returns the first Chinese track for any Chinese target: isChineseCode(targetCode) && isChineseTrack(track) is already true for that entry, so later entries are never checked for an exact resolveTrackLanguage(track) === targetCode match. If the manifest order is zh-Hant then zh-Hans and the user's target is cmn, the official overlay shows Traditional text. Do an exact-match pass before the generic Chinese fallback.
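A two-pass selection along these lines might look like the sketch below. The `Track` shape, the `CHINESE_CODES` set, and the idea that a Traditional track resolves to a distinct code such as "cmn-Hant" are all assumptions for illustration; the PR's resolveTrackLanguage, isChineseCode, and isChineseTrack are not reproduced here.

```typescript
// Hypothetical minimal track shape; `language` is the already-resolved code.
interface Track {
  id: string
  language: string
}

// Assumed set of codes treated as Chinese, for illustration only.
const CHINESE_CODES: ReadonlySet<string> = new Set(["zho", "cmn", "cmn-Hant", "yue"])

function selectTrack(tracks: readonly Track[], targetCode: string): Track | null {
  // Pass 1: an exact language match wins regardless of manifest order.
  const exact = tracks.find(track => track.language === targetCode)
  if (exact)
    return exact
  // Pass 2: only then fall back to the first Chinese track for a Chinese target.
  if (CHINESE_CODES.has(targetCode))
    return tracks.find(track => CHINESE_CODES.has(track.language)) ?? null
  return null
}
```

With a manifest ordered Traditional then Simplified, a "cmn" target now selects the Simplified entry, and the generic fallback applies only when no exact match exists.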
Withdrawing this official-track approach in favor of #1758. The goal of this PR was a comparison of the official Chinese subtitle with the official English subtitle. I still want to support that later, but the full official-track alignment path adds enough parsing, matching, and provider-specific code that squeezing it under the current small-PR / sub-1000-line review target would leave the implementation incomplete. The replacement PR first lands the smaller streaming-caption path, which has been tested locally multiple times: capture the player-rendered English cue and translate it through Read Frog's existing pipeline, similar in user-facing behavior to Immersive Translate. That generic adapter shape has been tested locally on Netflix and HBO Max; #1758 only enables Netflix so the diff stays small and reviewable.
Summary
This PR adds a streaming bilingual subtitle path for Netflix-style players. When both official source and target subtitle tracks are available, Read Frog now renders the official original subtitle together with the official localized subtitle instead of always falling back to machine translation.
Concretely, this adds support for:
Product advantage
The main product difference is that this is not just another machine-translated subtitle overlay.
A well-known extension such as Immersive Translate usually solves video bilingual subtitles by taking the selected English subtitle and generating a machine translation. That is useful when no official translation exists, but it cannot provide a perfect official-to-official comparison: the localized line is generated by the translation provider, not taken from the streaming platform's official subtitle asset.
Read Frog can now do better on Netflix when both tracks are present. It keeps the official Traditional Chinese subtitle selected in Netflix, captures the official English caption track separately, and renders them together as a matched bilingual pair. In other words, the viewer gets official English above official Chinese, rather than English plus a machine-generated Chinese translation. This is a stronger experience for language learning because users can compare the original wording with the platform-approved localization.
Why this was hard
Official subtitle tracks are not segmented the same way across languages. Netflix may split one English sentence across multiple cues while the Chinese subtitle uses a single cue, or the reverse. A simple timestamp overlap would either truncate the English line, duplicate translations, or attach a source cue to the wrong localized line.
The implementation addresses this by using the target-language subtitle timing as the display baseline and assigning each source cue to the target cue with the largest time overlap. That lets Read Frog display complete official English lines above the corresponding official Chinese subtitle, even when the underlying tracks use different cue boundaries.
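The largest-overlap assignment described above can be sketched as follows. This is a minimal illustration under assumed names (`Cue`, `BilingualCue`, `alignByLargestOverlap`) with times in seconds; it is not the PR's actual implementation, and it simply drops source cues that overlap no target cue.

```typescript
interface Cue {
  start: number // seconds
  end: number // seconds
  text: string
}

interface BilingualCue {
  start: number
  end: number
  target: string
  source: string
}

// Length of the time interval shared by two cues (0 when they do not overlap).
function overlap(a: Cue, b: Cue): number {
  return Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start))
}

// Target cues define the display timing; each source cue is attached to the
// single target cue it overlaps the most, so no source text is duplicated.
function alignByLargestOverlap(sourceCues: readonly Cue[], targetCues: readonly Cue[]): BilingualCue[] {
  const buckets: string[][] = targetCues.map(() => [])
  for (const source of sourceCues) {
    let bestIndex = -1
    let bestOverlap = 0
    targetCues.forEach((target, index) => {
      const amount = overlap(source, target)
      if (amount > bestOverlap) {
        bestOverlap = amount
        bestIndex = index
      }
    })
    // Source cues with no overlapping target cue are dropped in this sketch.
    if (bestIndex >= 0)
      buckets[bestIndex]?.push(source.text)
  }
  return targetCues.map((target, index) => ({
    start: target.start,
    end: target.end,
    target: target.text,
    source: (buckets[index] ?? []).join(" "),
  }))
}
```

For example, two English cues spanning 0 to 2 s and 2 to 4 s would both attach to a single Chinese cue spanning 0 to 4 s, so the complete English sentence is shown for the whole duration of the Chinese line.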
Implementation notes
Validation
- pnpm type-check
- pnpm exec eslint
- SKIP_FREE_API=true pnpm vitest run src/utils/subtitles/__tests__/streaming-fetcher.test.ts src/entrypoints/interceptor.content/__tests__/streaming-subtitles-interceptor.test.ts src/entrypoints/subtitles.content/__tests__/universal-adapter.test.ts src/entrypoints/subtitles.content/__tests__/subtitles-scheduler.test.ts
- SKIP_FREE_API=true pnpm test (166 files passed, 1417 tests passed)
- WXT_SKIP_ENV_VALIDATION=true pnpm exec wxt build