feat(subtitles): add Netflix official bilingual captions #1751
semantic-craft wants to merge 2 commits
🦋 Changeset detected. Latest commit: 6f0e010. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Contributor trust score: 16/100 (New contributor)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 87a9eec38d
  export default defineContentScript({
-   matches: ["*://*.youtube.com/*", "*://*.youtube-nocookie.com/*"],
+   matches: ["*://*.youtube.com/*", "*://*.youtube-nocookie.com/*", "*://*.netflix.com/*"],
Avoid initializing Netflix subtitles on browse pages
Because this content script now runs on every Netflix page, opening netflix.com/browse first sets __READ_FROG_SUBTITLES_INJECTED__ and calls the Netflix initializer before any video exists; initializeScheduler() then times out with no scheduler, and the Netflix config has no navigation events to retry when the SPA later moves into /watch/.... In that normal browse→play flow, the injected flag prevents reinitialization and the floating button/subtitle overlay never attaches to the actual video.
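One way to address this is to gate initialization on the current route rather than on a one-shot injected flag. The following is a minimal sketch, not the PR's code: `isWatchPath`, `createRouteGate`, and the teardown callback are hypothetical names, and the sketch assumes the caller has some way of noticing SPA navigations.

```typescript
// Hypothetical helpers for illustration; none of these names come from the PR.
type Teardown = () => void

/** True only for playback routes such as /watch/... (not /browse). */
function isWatchPath(pathname: string): boolean {
  return /^\/watch\//.test(pathname)
}

/**
 * Builds a route handler that initializes subtitles once per watch page and
 * tears the previous instance down on navigation, instead of latching a
 * global "already injected" flag on the first page load.
 */
function createRouteGate(init: (pathname: string) => Teardown) {
  let activePath: string | null = null
  let teardown: Teardown | null = null

  return function onRouteChange(pathname: string): void {
    if (pathname === activePath)
      return // Same page: nothing to do.

    // Leaving the previous page: release its scheduler/overlay first.
    teardown?.()
    teardown = null
    activePath = pathname

    // Only playback pages get an initializer; browse pages are skipped.
    if (isWatchPath(pathname))
      teardown = init(pathname)
  }
}
```

In the content script, the returned handler would be called with `location.pathname` on load and again after each SPA navigation; how navigation is detected (history patching, polling, or a mutation observer) is left open here.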
private async waitForTracks(): Promise<StreamingTrack[]> {
  window.postMessage({ type: STREAMING_ENSURE_NATIVE_SUBTITLES_TYPE }, window.location.origin)
  if (tracksByUrl.size > 0)
    return [...tracksByUrl.values()]
Scope discovered tracks to the current Netflix title
When a Netflix tab moves from one /watch/... title to another and subtitles are started again, this early return reuses whatever was discovered for the previous title because tracksByUrl/capturesByUrl are module-level and never cleared or scoped by the current pathname. That lets fetch() select and fetch old subtitle URLs before the new manifest has been intercepted, so the next episode/movie can display stale captions.
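A possible shape for the fix is to key the discovered tracks by the current title and clear them when the title changes. This is a sketch under assumptions: `TitleScopedTracks` and `titleKeyFromPath` are hypothetical, and the `StreamingTrack` fields shown are only those visible in the snippets on this page.

```typescript
// Only the fields used in this PR's snippets; the real type may have more.
interface StreamingTrack { url: string, language?: string, label?: string }

/** Hypothetical: derives a per-title key from a /watch/<id> path. */
function titleKeyFromPath(pathname: string): string | null {
  const match = /^\/watch\/([^/?#]+)/.exec(pathname)
  return match?.[1] ?? null
}

/** Track cache that never returns tracks discovered for another title. */
class TitleScopedTracks {
  private titleKey: string | null = null
  private readonly tracksByUrl = new Map<string, StreamingTrack>()

  /** Records a track for the title at `pathname`. */
  add(pathname: string, track: StreamingTrack): void {
    this.syncTitle(pathname)
    this.tracksByUrl.set(track.url, track)
  }

  /** Returns only tracks discovered for the title at `pathname`. */
  get(pathname: string): StreamingTrack[] {
    this.syncTitle(pathname)
    return [...this.tracksByUrl.values()]
  }

  /** Drops cached tracks whenever the title key changes. */
  private syncTitle(pathname: string): void {
    const key = titleKeyFromPath(pathname)
    if (key !== this.titleKey) {
      this.tracksByUrl.clear()
      this.titleKey = key
    }
  }
}
```

With this shape, an early return like the one in `waitForTracks()` would see an empty cache right after navigation and fall through to waiting for the new manifest. One caveat: if a manifest for the next title is intercepted before the URL changes, keying by pathname alone would attribute it to the old title, so the real fix may need the title id from the manifest itself.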
function selectTargetTrack(tracks: StreamingTrack[], targetCode: LangCodeISO6393): StreamingTrack | null {
  return tracks.find((track) => {
    const resolved = resolveLanguageCodeFromLocale(track.language ?? track.label ?? "")
    return resolved === targetCode || (isChineseCode(targetCode) && /zh|中文|Chinese/i.test(`${track.language ?? ""} ${track.label ?? ""}`))
Don't match every Chinese variant as the target
For Chinese target codes, this fallback treats any Chinese-looking track as a match even when resolveLanguageCodeFromLocale resolved a different variant. If Netflix lists zh-Hans before zh-Hant (or Mandarin before Cantonese), a user configured for cmn-Hant or yue will receive the first Chinese subtitle track instead of the configured target; the broad label fallback should only run when the variant cannot be resolved, or exact variants should be ranked first.
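The suggestion above could be sketched as a two-pass selection: exact variant first, broad label match only for tracks whose variant could not be resolved. The `resolve` parameter stands in for the PR's `resolveLanguageCodeFromLocale`; its return type (a code string or `undefined`) and the `isChineseTarget` flag are assumptions made to keep the example self-contained.

```typescript
// Only the fields used in this PR's snippets; the real type may have more.
interface StreamingTrack { url: string, language?: string, label?: string }

const CHINESE_HINT = /zh|中文|Chinese/i

/**
 * Two-pass track selection. `resolve` is a stand-in for the PR's
 * resolveLanguageCodeFromLocale (assumed to return a code or undefined).
 */
function selectTargetTrackRanked(
  tracks: StreamingTrack[],
  targetCode: string,
  isChineseTarget: boolean,
  resolve: (locale: string) => string | undefined,
): StreamingTrack | null {
  const resolved = tracks.map(track => ({
    track,
    code: resolve(track.language ?? track.label ?? ""),
  }))

  // Pass 1: an exact variant match wins regardless of list order.
  const exact = resolved.find(entry => entry.code === targetCode)
  if (exact)
    return exact.track

  // Pass 2: broad fallback, but only for tracks with an unresolved variant,
  // so a resolved zh-Hans track is never returned for a zh-Hant target.
  if (isChineseTarget) {
    const loose = resolved.find(entry =>
      entry.code === undefined
      && CHINESE_HINT.test(`${entry.track.language ?? ""} ${entry.track.label ?? ""}`))
    if (loose)
      return loose.track
  }
  return null
}
```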
I feel that if we want to support more streaming services like Netflix, we need a systematic adapter design. I didn't look at this PR carefully, but I think it may not be the systematic way to add a supported website. If so, @taiiiyang, you should close this PR and we should come up with a more systematic design in the future.
Thanks, I agree that streaming sites should not grow as one-off implementations. I pushed
On the design concern: this PR is intended as a narrow first streaming implementation, not a new parallel subtitle UI. It already goes through the existing seam:
I agree the next streaming service should probably come after a small streaming platform registry/extraction, rather than adding another host directly in this PR. Max/HBO looks like a useful follow-up validation target for the generic discovery approach, but I would keep it out of this PR unless maintainers prefer to see the abstraction first. Local validation:
One more product-context note from my side: I have personally been a paid Immersive Translate subscriber, and the main reason I paid for it was bilingual subtitles. So I think official bilingual streaming subtitles are a genuinely competitive direction for Read Frog, not just a niche implementation detail. My expectation is that heavy Immersive Translate users are also likely to have this need: watching streaming video with bilingual subtitles, especially when they are learning from official localized subtitle assets instead of only machine-translated captions.
I can help validate this systematically because I have subscriptions for Paramount and HBO/Max as well, and can also look at Disney+. I do not think those should be added into this PR one by one. A better next step may be to turn Netflix into the first verified implementation, then test Paramount, HBO/Max, and Disney+ to identify the common adapter boundary: track discovery, native subtitle hiding, current-title scoping, subtitle format parsing, and platform navigation behavior. That would let us decide whether the right follow-up is a small streaming-platform adapter registry/design, rather than guessing from Netflix alone.
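To make the adapter boundary listed above concrete, here is a rough sketch of what such an interface and registry might look like. Every name in it (`StreamingAdapter`, `AdapterRegistry`, and all members) is illustrative and not taken from this PR or any follow-up; it only maps the five concerns named above onto methods.

```typescript
interface SubtitleCue { start: number, end: number, text: string }
interface DiscoveredTrack { url: string, language?: string, label?: string }

/** Hypothetical adapter boundary; all member names are illustrative. */
interface StreamingAdapter {
  id: string
  /** Platform matching, e.g. by hostname suffix. */
  matchesHost: (hostname: string) => boolean
  /** Platform navigation behavior: which routes are playback pages. */
  isPlaybackPath: (pathname: string) => boolean
  /** Current-title scoping: a stable key for the title being played. */
  titleKey: (pathname: string) => string | null
  /** Track discovery for the current title. */
  discoverTracks: () => Promise<DiscoveredTrack[]>
  /** Subtitle format parsing into a shared cue shape. */
  parseCues: (raw: string) => SubtitleCue[]
  /** Native subtitle hiding while the overlay is shown. */
  setNativeSubtitlesHidden: (hidden: boolean) => void
}

/** Minimal registry: the first adapter that claims the host wins. */
class AdapterRegistry {
  private readonly adapters: StreamingAdapter[] = []

  register(adapter: StreamingAdapter): void {
    this.adapters.push(adapter)
  }

  resolve(hostname: string): StreamingAdapter | null {
    return this.adapters.find(adapter => adapter.matchesHost(hostname)) ?? null
  }
}
```

Testing a second and third platform against an interface like this would show quickly which members are truly shared and which are Netflix-specific.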
Closing in favor of #1757, which is the systematic adapter framework you asked for here. Netflix is the first adapter, and HBO Max already works as a second adapter on the same framework (a ~386-line delta), so the design is proven to generalize rather than being a one-off. Thanks for the steer toward a systematic design.
Withdrawing this official-track approach in favor of #1758. This PR's goal was official Chinese subtitle + official English subtitle comparison. I still want to support that later, but the full official-track alignment path adds enough parsing, matching, and provider-specific code that compressing it under the current small-PR / sub-1000-line review target would make the implementation incomplete. The replacement PR first lands the smaller, repeatedly locally tested streaming-caption path: capture the player-rendered English cue and translate it through Read Frog's existing pipeline, similar in user-facing behavior to Immersive Translate. That generic adapter shape has been tested locally on Netflix and HBO Max; #1758 only enables Netflix so the diff stays small and reviewable.
Summary
This PR adds a focused Netflix bilingual subtitle path that pairs Netflix's official English captions with Netflix's official target-language subtitles.
When both official tracks are available, Read Frog no longer has to rely on machine translation for the localized line. It can show the platform-provided original and platform-provided translation together as a bilingual subtitle pair.
This focused version is intentionally kept under the repository's new-contributor auto-close threshold. A broader streaming-platform implementation can be split out separately after this core Netflix path is reviewed.
Product advantage
This is the key product difference from machine-translation-only subtitle extensions.
A well-known extension such as Immersive Translate commonly handles video bilingual subtitles by selecting the English subtitle and generating a translated line from it. That is useful when no official translation exists, but it cannot provide a perfect official-to-official comparison: the localized subtitle is produced by a translation provider, not by the streaming platform's official subtitle asset.
Read Frog can now do better on Netflix when both tracks are present. It keeps the official localized subtitle track, captures the official English caption track separately, and displays them together. This gives learners an official original / official translation comparison instead of English plus a machine-generated translation.
Usage model
On Netflix, the user should select the official target-language subtitle track in Netflix's own subtitle menu first. Read Frog then discovers the matching official English caption track separately and renders the two official tracks together.
That is intentionally different from the machine-translation workflow used by extensions such as Immersive Translate, where the user typically selects the English original subtitle and the extension generates the translated line. Here, the localized line can remain Netflix's official subtitle.
Why this was hard
Netflix official subtitle tracks are not segmented the same way across languages. English may split one sentence across multiple cues while Chinese Traditional uses one cue, or the reverse. A naive timestamp join can truncate English, duplicate lines, or attach text to the wrong localized cue.
This PR uses the target-language subtitle timing as the display baseline and assigns each English cue to the localized cue with the largest time overlap. That preserves the official localized timing while keeping the English line complete.
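The overlap rule described above can be sketched as follows. This is an illustration rather than the PR's implementation: the `Cue` shape, millisecond timestamps, joining English fragments with a space, earliest-cue tie-breaking, and dropping English cues that overlap nothing are all assumptions.

```typescript
// Assumed cue shape; times are taken to be milliseconds.
interface Cue { start: number, end: number, text: string }
interface BilingualCue { start: number, end: number, target: string, english: string }

/** Length of the intersection of two time ranges, or 0 if they are disjoint. */
function overlapMs(a: Cue, b: Cue): number {
  return Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start))
}

/**
 * Uses the target-language cues as the display baseline and attaches each
 * English cue to the single target cue it overlaps the most.
 */
function alignByOverlap(targetCues: Cue[], englishCues: Cue[]): BilingualCue[] {
  const buckets: string[][] = targetCues.map(() => [])

  for (const english of englishCues) {
    let bestIndex = -1
    let bestOverlap = 0
    for (let index = 0; index < targetCues.length; index++) {
      const target = targetCues[index]
      if (!target)
        continue
      const shared = overlapMs(target, english)
      // Strict ">" keeps the earliest target cue on ties.
      if (shared > bestOverlap) {
        bestOverlap = shared
        bestIndex = index
      }
    }
    // English cues with no overlap at all are dropped in this sketch.
    if (bestIndex >= 0)
      buckets[bestIndex]?.push(english.text)
  }

  // Each English cue lands in exactly one bucket, so no line is duplicated.
  return targetCues.map((target, index) => ({
    start: target.start,
    end: target.end,
    target: target.text,
    english: (buckets[index] ?? []).join(" "),
  }))
}
```

Because every English cue is assigned whole to one localized cue, a sentence split across two English cues is rejoined under the single localized cue that spans both, which is the truncation and duplication case a naive timestamp join gets wrong.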
Implementation notes
@read-frog/extension.
Validation
- pnpm type-check
- pnpm exec eslint
- SKIP_FREE_API=true pnpm test: 162 files passed, 1386 tests passed
- WXT_SKIP_ENV_VALIDATION=true pnpm exec wxt build