
feat(subtitles): add Netflix official bilingual captions #1751

Closed
semantic-craft wants to merge 2 commits into mengxi-ream:main from semantic-craft:feat/netflix-official-subtitles-small

Conversation

@semantic-craft

@semantic-craft semantic-craft commented Jun 27, 2026


Summary

This PR adds a focused Netflix bilingual subtitle path that pairs Netflix's official English captions with Netflix's official target-language subtitles.

When both official tracks are available, Read Frog no longer has to rely on machine translation for the localized line. It can show the platform-provided original and platform-provided translation together as a bilingual subtitle pair.

This focused version is intentionally kept under the repository's new-contributor auto-close threshold. A broader streaming-platform implementation can be split out separately after this core Netflix path is reviewed.

Product advantage

This is the key product difference from machine-translation-only subtitle extensions.

A well-known extension such as Immersive Translate commonly handles video bilingual subtitles by selecting the English subtitle and generating a translated line from it. That is useful when no official translation exists, but it cannot provide a perfect official-to-official comparison: the localized subtitle is produced by a translation provider, not by the streaming platform's official subtitle asset.

Read Frog can now do better on Netflix when both tracks are present. It keeps the official localized subtitle track, captures the official English caption track separately, and displays them together. This gives learners an official original / official translation comparison instead of English plus a machine-generated translation.

Usage model

On Netflix, the user should select the official target-language subtitle track in Netflix's own subtitle menu first. Read Frog then discovers the matching official English caption track separately and renders the two official tracks together.

That is intentionally different from the machine-translation workflow used by extensions such as Immersive Translate, where the user typically selects the English original subtitle and the extension generates the translated line. Here, the localized line can remain Netflix's official subtitle.

Why this was hard

Netflix official subtitle tracks are not segmented the same way across languages. English may split one sentence across multiple cues while Chinese Traditional uses one cue, or the reverse. A naive timestamp join can truncate English, duplicate lines, or attach text to the wrong localized cue.

This PR uses the target-language subtitle timing as the display baseline and assigns each English cue to the localized cue with the largest time overlap. That preserves the official localized timing while keeping the English line complete.
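
The maximum-overlap assignment described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Cue` and `BilingualCue` shapes and the `overlapMs` and `alignByMaxOverlap` names are hypothetical, times are assumed to be in milliseconds, and English cues are assumed to arrive in playback order.

```typescript
// Hypothetical cue shape; times are in milliseconds.
interface Cue {
  start: number
  end: number
  text: string
}

interface BilingualCue extends Cue {
  // Official English text gathered for this localized cue (may be empty).
  original: string
}

// Length of the intersection of two time ranges, or 0 if they do not overlap.
function overlapMs(a: Cue, b: Cue): number {
  return Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start))
}

// Keep the localized cues as the display baseline and attach each English cue
// to the single localized cue it overlaps the most.
function alignByMaxOverlap(localized: Cue[], english: Cue[]): BilingualCue[] {
  const buckets: string[][] = localized.map(() => [])

  for (const en of english) {
    let bestIndex = -1
    let bestOverlap = 0
    localized.forEach((loc, index) => {
      const overlap = overlapMs(en, loc)
      if (overlap > bestOverlap) {
        bestOverlap = overlap
        bestIndex = index
      }
    })
    // English cues that overlap no localized cue are dropped in this sketch.
    if (bestIndex >= 0)
      buckets[bestIndex]?.push(en.text)
  }

  return localized.map((loc, index) => ({
    ...loc,
    original: (buckets[index] ?? []).join(" "),
  }))
}
```

Because every English cue lands in exactly one bucket, a sentence split across several English cues is joined under one localized cue instead of being truncated or duplicated, and the localized timing is never altered.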

Implementation notes

  • Adds a Netflix-only content-script path for the existing video subtitles UI.
  • Adds a main-world interceptor that discovers Netflix official subtitle tracks from manifest-like JSON and captures matching subtitle responses.
  • Adds a Netflix subtitles fetcher that selects official English captions plus the configured target-language track, parses WebVTT/TTML timing, and aligns by maximum overlap.
  • Uses the existing subtitle display settings, so the current user setting for translation position is preserved.
  • Adds a patch changeset for @read-frog/extension.
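
As an illustration of the timing-parsing step mentioned in the notes above, here is a minimal sketch for the WebVTT case only (TTML is not covered). The `TimedCue` shape and the `parseVttTimestamp` and `parseVttCues` names are hypothetical and are not taken from this PR; the sketch assumes well-formed input and strips inline tags naively.

```typescript
// Hypothetical cue shape; times are in milliseconds.
interface TimedCue {
  start: number
  end: number
  text: string
}

// Convert "HH:MM:SS.mmm" or "MM:SS.mmm" into milliseconds.
function parseVttTimestamp(value: string): number {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(value.trim())
  if (!match)
    throw new Error(`Invalid WebVTT timestamp: ${value}`)
  const [, hours = "0", minutes, seconds, millis] = match
  return ((Number(hours) * 60 + Number(minutes)) * 60 + Number(seconds)) * 1000 + Number(millis)
}

// Parse cue blocks out of a WebVTT document, ignoring the header and
// any block without a timing line (for example NOTE or STYLE blocks).
function parseVttCues(vtt: string): TimedCue[] {
  const cues: TimedCue[] = []
  // Normalize line endings; cue blocks are separated by blank lines.
  for (const block of vtt.replace(/\r\n?/g, "\n").split(/\n{2,}/)) {
    const lines = block.split("\n")
    const timingIndex = lines.findIndex(line => line.includes("-->"))
    if (timingIndex < 0)
      continue
    const [startRaw = "", rest = ""] = (lines[timingIndex] ?? "").split("-->")
    // Cue settings such as "line:90%" may follow the end timestamp.
    const endRaw = rest.trim().split(/\s+/)[0] ?? ""
    const text = lines
      .slice(timingIndex + 1)
      .join("\n")
      .replace(/<[^>]+>/g, "") // drop inline tags like <c.white>
      .trim()
    cues.push({ start: parseVttTimestamp(startRaw), end: parseVttTimestamp(endRaw), text })
  }
  return cues
}
```

Normalizing both tracks to the same millisecond-based cue shape is what lets a later alignment step compare time ranges without caring which format each track came from.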

Validation

  • pnpm type-check
  • pnpm exec eslint
  • SKIP_FREE_API=true pnpm test — 162 files passed, 1386 tests passed
  • WXT_SKIP_ENV_VALIDATION=true pnpm exec wxt build

@changeset-bot

changeset-bot Bot commented Jun 27, 2026


🦋 Changeset detected

Latest commit: 6f0e010

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name | Type
@read-frog/extension | Patch


@github-actions github-actions Bot added the labels feat, contrib-trust:new (PR author trust score is 0-29), and needs-maintainer-review (Contributor trust automation recommends maintainer review) Jun 27, 2026
@github-actions

github-actions Bot commented Jun 27, 2026

Contributor

Contributor trust score

16/100 — New contributor

This score estimates contributor familiarity with mengxi-ream/read-frog using public GitHub signals. It is advisory only and does not block merges automatically.

Outcome

Score breakdown

Dimension | Score | Signals
Repo familiarity | 0/35 | commits in repo, merged PRs, reviews
Community standing | 6/25 | account age, followers, repo role
OSS influence | 8/20 | stars on owned non-fork repositories
PR track record | 2/20 | merge rate across resolved PRs in this repo

Signals used

  • Repo commits: 0 (author commits reachable from the repository default branch)
  • Repo PR history: merged 0, open 1, closed-unmerged 1
  • Repo reviews: 0
  • PR counted changed lines: 825 (+794 / -31)
  • Repo permission: read
  • Followers: 2
  • Account age: 129 months
  • Owned non-fork repos considered: max 102, total 208 (semantic-craft/mac-tmux-kit (102), semantic-craft/iOS-vibebuddy (100), semantic-craft/raycast-doubao-tts (3), semantic-craft/vscode-ai-voice-studio (1), semantic-craft/Raycast-Gemini-TTS (1), semantic-craft/snapocr-via-paddle (1), semantic-craft/famo-releases (0), semantic-craft/famoime-mac (0), semantic-craft/responsay-legal-skills (0), semantic-craft/responsay-releases (0), semantic-craft/tmux-cheatsheet (0), semantic-craft/raycast-responsay (0), semantic-craft/parlance (0), semantic-craft/vscode-english-coach (0), semantic-craft/raycast-skill-manager (0), semantic-craft/Raycast-Mimo-TTS (0), semantic-craft/raycast-tmux (0), semantic-craft/raycast-ai-voice-studio (0), semantic-craft/english-speaking-training-vscode (0), semantic-craft/raycast-ai-translate (0))

Policy

  • Low-score review threshold: < 30
  • Auto-close: score < 20 and counted changed lines > 1000
  • Migration-related files are excluded from the auto-close line count
  • Policy version: v1.2

Updated automatically when the PR changes or when a maintainer reruns the workflow.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87a9eec38d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


export default defineContentScript({
-  matches: ["*://*.youtube.com/*", "*://*.youtube-nocookie.com/*"],
+  matches: ["*://*.youtube.com/*", "*://*.youtube-nocookie.com/*", "*://*.netflix.com/*"],


P2 Badge Avoid initializing Netflix subtitles on browse pages

Because this content script now runs on every Netflix page, opening netflix.com/browse first sets __READ_FROG_SUBTITLES_INJECTED__ and calls the Netflix initializer before any video exists; initializeScheduler() then times out with no scheduler, and the Netflix config has no navigation events to retry when the SPA later moves into /watch/.... In that normal browse→play flow, the injected flag prevents reinitialization and the floating button/subtitle overlay never attaches to the actual video.
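
One way to avoid the early initialization described above is to gate it on the pathname. The following is only a sketch under assumptions: `isWatchPath` and `createWatchGate` are hypothetical names, the `/watch/<id>` pattern is inferred from the URLs mentioned in this review, and how the extension detects SPA navigation is left out.

```typescript
// True for Netflix playback URLs such as /watch/81234567.
function isWatchPath(pathname: string): boolean {
  return /^\/watch\/[^/]+/.test(pathname)
}

// Returns a callback to invoke with the current pathname on page load and
// on every SPA navigation. `initialize` runs at most once, and only the
// first time a watch path is seen, so a browse page never consumes the
// one-shot initialization.
function createWatchGate(initialize: () => void): (pathname: string) => void {
  let initialized = false
  return (pathname) => {
    if (initialized || !isWatchPath(pathname))
      return
    initialized = true
    initialize()
  }
}
```

In a content script the gate would be called once with `window.location.pathname` and again whenever a navigation is observed. The key point is that the "already injected" state is only set once a video page is actually reached.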


Comment thread src/utils/subtitles/fetchers/netflix.ts Outdated
private async waitForTracks(): Promise<StreamingTrack[]> {
  window.postMessage({ type: STREAMING_ENSURE_NATIVE_SUBTITLES_TYPE }, window.location.origin)
  if (tracksByUrl.size > 0)
    return [...tracksByUrl.values()]


P2 Badge Scope discovered tracks to the current Netflix title

When a Netflix tab moves from one /watch/... title to another and subtitles are started again, this early return reuses whatever was discovered for the previous title because tracksByUrl/capturesByUrl are module-level and never cleared or scoped by the current pathname. That lets fetch() select and fetch old subtitle URLs before the new manifest has been intercepted, so the next episode/movie can display stale captions.
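
A possible shape for the scoping suggested above is to key the track cache by pathname and clear it when the pathname changes. This is a sketch, not the PR's fix: the `TitleScopedTracks` class is hypothetical, and the `url` field on `StreamingTrack` is an assumption inferred from the `tracksByUrl` name.

```typescript
// Assumed track shape; only `url` is needed for this sketch.
interface StreamingTrack {
  url: string
  language?: string
  label?: string
}

// Holds discovered tracks for exactly one title (pathname) at a time.
class TitleScopedTracks {
  private scope = ""
  private readonly tracksByUrl = new Map<string, StreamingTrack>()

  // Record a discovered track; tracks from an earlier title are discarded first.
  add(pathname: string, track: StreamingTrack): void {
    this.resetIfChanged(pathname)
    this.tracksByUrl.set(track.url, track)
  }

  // Return tracks for the given title only; a different title yields an empty list.
  list(pathname: string): StreamingTrack[] {
    this.resetIfChanged(pathname)
    return Array.from(this.tracksByUrl.values())
  }

  private resetIfChanged(pathname: string): void {
    if (pathname !== this.scope) {
      this.scope = pathname
      this.tracksByUrl.clear()
    }
  }
}
```

With this shape, an early return on "tracks already known" can only fire for tracks discovered under the current `/watch/...` path, so a new title waits for its own manifest instead of reusing the previous one.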


Comment thread src/utils/subtitles/fetchers/netflix.ts Outdated
function selectTargetTrack(tracks: StreamingTrack[], targetCode: LangCodeISO6393): StreamingTrack | null {
  return tracks.find((track) => {
    const resolved = resolveLanguageCodeFromLocale(track.language ?? track.label ?? "")
    return resolved === targetCode || (isChineseCode(targetCode) && /zh|中文|Chinese/i.test(`${track.language ?? ""} ${track.label ?? ""}`))


P2 Badge Don't match every Chinese variant as the target

For Chinese target codes, this fallback treats any Chinese-looking track as a match even when resolveLanguageCodeFromLocale resolved a different variant. If Netflix lists zh-Hans before zh-Hant (or Mandarin before Cantonese), a user configured for cmn-Hant or yue will receive the first Chinese subtitle track instead of the configured target; the broad label fallback should only run when the variant cannot be resolved, or exact variants should be ranked first.
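
The two-pass ranking suggested above could look roughly like this. It is a sketch under stated assumptions: `resolveCode` is a hypothetical stand-in for the repository's `resolveLanguageCodeFromLocale`, its lookup table (including `cmn` for `zh-Hans`) is invented for illustration, and target codes are plain strings rather than the repository's `LangCodeISO6393` type.

```typescript
interface StreamingTrack {
  language?: string
  label?: string
}

// Hypothetical resolver: maps a few locale tags to target codes.
// Returns null when the variant cannot be determined.
function resolveCode(locale: string): string | null {
  const table: Record<string, string> = {
    "zh-hans": "cmn",
    "zh-hant": "cmn-Hant",
    "yue": "yue",
    "en": "eng",
  }
  return table[locale.toLowerCase()] ?? null
}

function selectTargetTrack(tracks: StreamingTrack[], targetCode: string): StreamingTrack | null {
  // Pass 1: an exactly resolved variant always wins, regardless of list order.
  const exact = tracks.find(track => resolveCode(track.language ?? track.label ?? "") === targetCode)
  if (exact)
    return exact

  // Pass 2: broad Chinese fallback, limited to tracks whose variant is unresolved.
  const wantsChinese = ["cmn", "cmn-Hant", "yue"].includes(targetCode)
  if (!wantsChinese)
    return null
  return tracks.find((track) => {
    const raw = `${track.language ?? ""} ${track.label ?? ""}`
    const unresolved = resolveCode(track.language ?? track.label ?? "") === null
    return unresolved && /zh|中文|Chinese/i.test(raw)
  }) ?? null
}
```

Restricting the fallback to unresolved tracks means a track already identified as a different Chinese variant is never substituted for the configured one; returning `null` in that case leaves the caller free to fall back to another path.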


@mengxi-ream

Owner

I feel like if we want to support more streaming services like Netflix, we need a systematic adapter design. I didn't look at this PR carefully, but I think it may not be the systematic way to add a supported website. If so, @taiiiyang you should close this PR and we should come up with a more systematic design in the future.

@semantic-craft

Author

Thanks, I agree that streaming sites should not grow as one-off implementations.

I pushed 6f0e010 to address the concrete review issues first:

  • Netflix browse pages no longer eagerly bootstrap the subtitles runtime. The content script now waits until the SPA reaches /watch/... before initializing the Netflix subtitle UI.
  • Discovered streaming tracks and captured subtitle responses are scoped to the current Netflix pathname, and the Netflix adapter now receives navigation events so a new title cannot reuse stale tracks.
  • Target subtitle selection now prefers exact resolved language variants before falling back to broad Chinese labels. I added tests for cmn-Hant vs zh-Hans ordering and previous-title track isolation.

On the design concern: this PR is intended as a narrow first streaming implementation, not a new parallel subtitle UI. It already goes through the existing seam:

  • PlatformConfig handles selectors, video identity, and navigation.
  • SubtitlesFetcher handles provider-specific official track discovery/parsing.
  • UniversalVideoAdapter keeps the shared UI, scheduler, translation, download, and display behavior.

I agree the next streaming service should probably come after a small streaming platform registry/extraction, rather than adding another host directly in this PR. Max/HBO looks like a useful follow-up validation target for the generic discovery approach, but I would keep it out of this PR unless maintainers prefer to see the abstraction first.

Local validation:

  • pnpm exec vitest run src/utils/subtitles/__tests__/netflix-fetcher.test.ts
  • pnpm type-check
  • pnpm exec eslint
  • SKIP_FREE_API=true pnpm test
  • WXT_SKIP_ENV_VALIDATION=true pnpm exec wxt build
  • pre-push hook passed lint/type-check/test

@semantic-craft

Author

One more product-context note from my side: I have personally been a paid Immersive Translate subscriber, and the main reason I paid for it was bilingual subtitles. So I think official bilingual streaming subtitles are a genuinely competitive direction for Read Frog, not just a niche implementation detail.

My expectation is that heavy Immersive Translate users are also likely to have this need: watching streaming video with bilingual subtitles, especially when they are learning from official localized subtitle assets instead of only machine-translated captions.

I can help validate this systematically because I have subscriptions for Paramount and HBO/Max as well, and can also look at Disney+. I do not think those should be added into this PR one by one. A better next step may be to turn Netflix into the first verified implementation, then test Paramount, HBO/Max, and Disney+ to identify the common adapter boundary: track discovery, native subtitle hiding, current-title scoping, subtitle format parsing, and platform navigation behavior.

That would let us decide whether the right follow-up is a small streaming-platform adapter registry/design, rather than guessing from Netflix alone.

@semantic-craft

Author

Closing in favor of #1757, which is the systematic adapter framework you asked for here — Netflix is the first adapter, and HBO Max already works as a second adapter on the same framework (a ~386-line delta), so the design is proven to generalize rather than being a one-off. Thanks for the steer toward a systematic design.

@semantic-craft

Author

Withdrawing this official-track approach in favor of #1758.

This PR's goal was official Chinese subtitle + official English subtitle comparison. I still want to support that later, but the full official-track alignment path adds enough parsing, matching, and provider-specific code that compressing it under the current small-PR / sub-1000-line review target would make the implementation incomplete.

The replacement PR first lands the smaller, repeatedly locally tested streaming-caption path: capture the player-rendered English cue and translate it through Read Frog's existing pipeline, similar in user-facing behavior to Immersive Translate. That generic adapter shape has been tested locally on Netflix and HBO Max; #1758 only enables Netflix so the diff stays small and reviewable.


Labels

contrib-trust:new (PR author trust score is 0-29), feat, needs-maintainer-review (Contributor trust automation recommends maintainer review)


2 participants