feat(subtitles): add official bilingual streaming captions #1749

Closed
semantic-craft wants to merge 2 commits into mengxi-ream:main from semantic-craft:feat/streaming-subtitles

@semantic-craft semantic-craft commented Jun 27, 2026

Summary

This PR adds a streaming bilingual subtitle path for Netflix-style players. When both official source and target subtitle tracks are available, Read Frog now renders the official original subtitle together with the official localized subtitle instead of always falling back to machine translation.

Concretely, this adds support for:

  • discovering official subtitle tracks from streaming manifests and late JSON/XHR/fetch responses
  • pairing official English captions with official target-language subtitles, including Chinese Traditional on Netflix
  • aligning English cues to the target-language cue timing with maximum-overlap matching, so cue splits do not truncate or duplicate lines
  • preserving the user's selected native Netflix subtitle track instead of forcing English when a usable target subtitle is already enabled
  • falling back to native/live subtitle capture when official tracks are unavailable
  • parsing Netflix TTML tick-based cue timing
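
As an illustration of the last bullet, here is a minimal sketch of tick-based TTML time conversion. It is not taken from this PR's diff: the helper name `ttmlTimeToMs` is hypothetical, and it assumes the caller has already read `ttp:tickRate` from the root `<tt>` element. It handles only tick offsets (`123t`) and plain `hh:mm:ss.fff` clock times.

```typescript
// Hypothetical helper (not from the PR): convert a TTML time expression to milliseconds.
// `tickRate` is assumed to come from the document's ttp:tickRate attribute.
function ttmlTimeToMs(value: string, tickRate: number): number | null {
  const text = value.trim()

  // Tick-based offset time, e.g. "50000000t" at tickRate 10000000 is 5 seconds.
  const tickMatch = /^(\d+(?:\.\d+)?)t$/.exec(text)
  if (tickMatch) {
    if (!(tickRate > 0))
      return null
    return (Number(tickMatch[1]) / tickRate) * 1000
  }

  // Clock time, e.g. "00:01:02.500".
  const clockMatch = /^(\d+):(\d{2}):(\d{2}(?:\.\d+)?)$/.exec(text)
  if (clockMatch) {
    const [, hours, minutes, seconds] = clockMatch
    return (Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds)) * 1000
  }

  // Anything else (frames, other metrics) is out of scope for this sketch.
  return null
}

const fiveSeconds = ttmlTimeToMs("50000000t", 10_000_000) // 5000
```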

Product advantage

The main product difference is that this is not just another machine-translated subtitle overlay.

A well-known extension such as Immersive Translate usually solves video bilingual subtitles by taking the selected English subtitle and generating a machine translation. That is useful when no official translation exists, but it cannot provide a perfect official-to-official comparison: the localized line is generated by the translation provider, not taken from the streaming platform's official subtitle asset.

Read Frog can now do better on Netflix when both tracks are present. It keeps the official Chinese Traditional subtitle selected in Netflix, captures the official English caption track separately, and renders them together as a matched bilingual pair. In other words, the viewer gets official English above official Chinese, rather than English plus a machine-generated Chinese translation. This is a stronger experience for language learning because users can compare the original wording with the platform-approved localization.

Why this was hard

Official subtitle tracks are not segmented the same way across languages. Netflix may split one English sentence across multiple cues while the Chinese subtitle uses a single cue, or the reverse. A naive timestamp-overlap check would truncate the English line, duplicate translations, or attach a source cue to the wrong localized line.

The implementation addresses this by using the target-language subtitle timing as the display baseline and assigning each source cue to the target cue with the largest time overlap. That lets Read Frog display complete official English lines above the corresponding official Chinese subtitle, even when the underlying tracks use different cue boundaries.
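
The matching rule described above can be sketched roughly as follows. This is an illustrative reduction, not the code in this PR: the `Cue` shape, the function names, and the sample cues are assumptions, and source cues are assumed to be sorted by start time.

```typescript
// Hypothetical cue shape for this sketch; times are in milliseconds.
interface Cue { start: number, end: number, text: string }
interface BilingualCue { start: number, end: number, source: string, target: string }

// Length of the time range shared by two cues (0 when they do not intersect).
function overlapMs(a: Cue, b: Cue): number {
  return Math.max(0, Math.min(a.end, b.end) - Math.max(a.start, b.start))
}

// Target cues define the display timing; each source cue is attached to the
// single target cue it overlaps the most, so no source text is split or duplicated.
function alignByMaxOverlap(sourceCues: Cue[], targetCues: Cue[]): BilingualCue[] {
  const buckets: string[][] = targetCues.map(() => [])

  for (const source of sourceCues) {
    let bestIndex = -1
    let bestOverlap = 0
    for (let index = 0; index < targetCues.length; index++) {
      const target = targetCues[index]
      if (!target)
        continue
      const shared = overlapMs(source, target)
      if (shared > bestOverlap) {
        bestOverlap = shared
        bestIndex = index
      }
    }
    // Source cues with no overlapping target cue are dropped in this sketch.
    if (bestIndex >= 0)
      buckets[bestIndex]?.push(source.text)
  }

  return targetCues.map((target, index) => ({
    start: target.start,
    end: target.end,
    source: (buckets[index] ?? []).join(" "),
    target: target.text,
  }))
}

// Made-up sample: one English sentence split across two cues, one Chinese cue.
const aligned = alignByMaxOverlap(
  [
    { start: 0, end: 1000, text: "I think" },
    { start: 1000, end: 2500, text: "we should go." },
    { start: 2600, end: 4000, text: "Now." },
  ],
  [
    { start: 0, end: 2500, text: "我想我們該走了。" },
    { start: 2600, end: 4000, text: "現在。" },
  ],
)
```

With this sample, the two English fragments land on the first Chinese cue as one complete line, and "Now." lands on the second.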

Implementation notes

  • The page-world interceptor now scans nested manifest-like JSON for timed text downloadables and replays cached tracks when the subtitles runtime starts later.
  • Official track selection filters out forced/image/watch-route noise and prefers caption-like English tracks over original-audio metadata tracks.
  • Official bilingual subtitles are pre-segmented and bypass machine translation when a valid official source/target pair is available.
  • Native subtitle fallback timing now updates live cue end times so fallback subtitles do not linger after the spoken line has ended.
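
To illustrate the first note, a recursive scan over manifest-like JSON might look like the sketch below. The key names `ttDownloadables` and `downloadUrls` appear in the review thread; everything else (the function name, the depth limit, the sample payload shape) is an assumption made for this example and not the PR's actual interceptor.

```typescript
// Hypothetical helper: walk arbitrary JSON and collect string URLs stored under
// any "downloadUrls" key. The depth cap guards against pathological payloads.
function collectDownloadUrls(node: unknown, found: string[] = [], depth = 0): string[] {
  if (depth > 12 || node === null || typeof node !== "object")
    return found

  if (Array.isArray(node)) {
    for (const item of node)
      collectDownloadUrls(item, found, depth + 1)
    return found
  }

  for (const [key, value] of Object.entries(node as Record<string, unknown>)) {
    if (key === "downloadUrls" && value !== null && typeof value === "object") {
      // Works for both { cdnId: url } maps and plain arrays of URLs.
      for (const url of Object.values(value as Record<string, unknown>)) {
        if (typeof url === "string" && !found.includes(url))
          found.push(url)
      }
    }
    else {
      collectDownloadUrls(value, found, depth + 1)
    }
  }
  return found
}

// Made-up payload; real manifests will differ.
const sampleManifest = {
  result: {
    tracks: [
      {
        language: "en",
        ttDownloadables: {
          someFormat: { downloadUrls: { a: "https://example.invalid/en-1", b: "https://example.invalid/en-2" } },
        },
      },
    ],
  },
}
const urls = collectDownloadUrls(sampleManifest)
```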

Validation

  • pnpm type-check
  • pnpm exec eslint
  • SKIP_FREE_API=true pnpm vitest run src/utils/subtitles/__tests__/streaming-fetcher.test.ts src/entrypoints/interceptor.content/__tests__/streaming-subtitles-interceptor.test.ts src/entrypoints/subtitles.content/__tests__/universal-adapter.test.ts src/entrypoints/subtitles.content/__tests__/subtitles-scheduler.test.ts
  • SKIP_FREE_API=true pnpm test — 166 files passed, 1417 tests passed
  • WXT_SKIP_ENV_VALIDATION=true pnpm exec wxt build
  • Local Chrome extension validation on Netflix with official Chinese Traditional subtitles selected; confirmed official English captions render above the official Chinese subtitle line without truncating the English cue.

changeset-bot Bot commented Jun 27, 2026

🦋 Changeset detected

Latest commit: cf89f0a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@read-frog/extension Patch

@github-actions github-actions Bot added the contrib-trust:new (PR author trust score is 0-29.) and needs-maintainer-review (Contributor trust automation recommends maintainer review.) labels Jun 27, 2026
@github-actions

Contributor

Contributor trust score

19/100 — New contributor

This score estimates contributor familiarity with mengxi-ream/read-frog using public GitHub signals. It is advisory only and does not block merges automatically.

Outcome

Score breakdown

Dimension | Score | Signals
Repo familiarity | 0/35 | commits in repo, merged PRs, reviews
Community standing | 6/25 | account age, followers, repo role
OSS influence | 8/20 | stars on owned non-fork repositories
PR track record | 5/20 | merge rate across resolved PRs in this repo

Signals used

  • Repo commits: 0 (author commits reachable from the repository default branch)
  • Repo PR history: merged 0, open 1, closed-unmerged 0
  • Repo reviews: 0
  • PR counted changed lines: 2776 (+2700 / -76)
  • Repo permission: read
  • Followers: 2
  • Account age: 129 months
  • Owned non-fork repos considered: max 102, total 208 (semantic-craft/mac-tmux-kit (102), semantic-craft/iOS-vibebuddy (100), semantic-craft/raycast-doubao-tts (3), semantic-craft/vscode-ai-voice-studio (1), semantic-craft/Raycast-Gemini-TTS (1), semantic-craft/snapocr-via-paddle (1), semantic-craft/famo-releases (0), semantic-craft/famoime-mac (0), semantic-craft/responsay-legal-skills (0), semantic-craft/responsay-releases (0), semantic-craft/tmux-cheatsheet (0), semantic-craft/raycast-responsay (0), semantic-craft/parlance (0), semantic-craft/vscode-english-coach (0), semantic-craft/raycast-skill-manager (0), semantic-craft/Raycast-Mimo-TTS (0), semantic-craft/raycast-tmux (0), semantic-craft/raycast-ai-voice-studio (0), semantic-craft/english-speaking-training-vscode (0), semantic-craft/raycast-ai-translate (0))

Policy

  • Low-score review threshold: < 30
  • Auto-close: score < 20 and counted changed lines > 1000
  • Migration-related files are excluded from the auto-close line count
  • Policy version: v1.2
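
Read literally, the two thresholds above combine as in the sketch below. This is only an interpretation of the stated policy, not the workflow's actual code; the function and field names are hypothetical, and the exclusion of migration-related files is assumed to have been applied before `countedChangedLines` is passed in.

```typescript
interface TrustDecision {
  needsMaintainerReview: boolean
  autoClose: boolean
}

// Hypothetical restatement of policy v1.2 as listed above.
function evaluateTrustPolicy(score: number, countedChangedLines: number): TrustDecision {
  return {
    // Low-score review threshold: < 30
    needsMaintainerReview: score < 30,
    // Auto-close: score < 20 and counted changed lines > 1000
    autoClose: score < 20 && countedChangedLines > 1000,
  }
}

// This PR: score 19, 2776 counted changed lines.
const decision = evaluateTrustPolicy(19, 2776)
// { needsMaintainerReview: true, autoClose: true }
```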

Updated automatically when the PR changes or when a maintainer reruns the workflow.

@github-actions github-actions Bot closed this Jun 27, 2026
@semantic-craft semantic-craft changed the title from "Add official bilingual subtitles for streaming video" to "feat(subtitles): add official bilingual streaming captions" Jun 27, 2026
@github-actions github-actions Bot added the feat label Jun 27, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf89f0ae03

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return

const normalizedUrl = normalizeUrl(url)
if (!normalizedUrl || !looksLikeSubtitleUrl(normalizedUrl))

P1: Capture known subtitle track URLs without extensions

On Netflix, the manifest URLs collected from ttDownloadables are often CDN paths such as /nflxvideo/... with no .vtt/caption token; the collector accepts them because the key is downloadUrls. This guard then drops the response even though trackByUrl already knows the URL is a subtitle. The fetch hook applies the same predicate before calling captureSubtitle, so if backgroundFetch cannot replay the signed/authenticated request, waitForStreamingSubtitleCapture times out and the official subtitles are lost. Treat normalized URLs present in trackByUrl as captures.
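
One possible shape for the relaxed guard is sketched below. The names `trackByUrl` and `looksLikeSubtitleUrl` come from this comment, but their types here are assumptions (a `Map` keyed by normalized URL and a string predicate), and `shouldCaptureSubtitle` is a hypothetical helper rather than existing code.

```typescript
// Hypothetical guard: a URL already registered from the manifest is accepted
// even when it carries no subtitle-looking extension or token.
function shouldCaptureSubtitle(
  normalizedUrl: string | null,
  trackByUrl: ReadonlyMap<string, unknown>,
  looksLikeSubtitleUrl: (url: string) => boolean,
): boolean {
  if (!normalizedUrl)
    return false
  return trackByUrl.has(normalizedUrl) || looksLikeSubtitleUrl(normalizedUrl)
}

// Stand-in predicate and made-up URLs, for illustration only.
const looksLikeVtt = (url: string) => url.endsWith(".vtt")
const knownTracks = new Map<string, unknown>([["https://cdn.example.invalid/opaque-path", { language: "en" }]])

const acceptsKnownOpaqueUrl = shouldCaptureSubtitle("https://cdn.example.invalid/opaque-path", knownTracks, looksLikeVtt)
```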

playerContainer: "body",
nativeSubtitles: STREAMING_NATIVE_SUBTITLES_SELECTOR,
},
events: {},

P2: Reset the streaming adapter on SPA route changes

For streaming sites this config supplies no navigateStart/navigateFinish events, and the only notifyNavigation() call in the repo is on the YouTube path. UniversalVideoAdapter therefore never runs resetForNavigation() when Netflix/Max/Disney push a new /watch/... URL without a reload. After such an SPA navigation the scheduler and session caches stay bound to the previous video, so active overlays can keep showing stale subtitles until a page reload or manual reset. Add history/webNavigation or platform events for this adapter.
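
A small, DOM-free sketch of the route-change detection this would need is shown below. `createRouteWatcher` is a hypothetical helper, not part of the repo; how its callback would be wired to the adapter (for example via `notifyNavigation()`) is left as an assumption.

```typescript
// Hypothetical helper: remembers the last seen path and reports changes.
// `getPath` is injected so the logic can run outside a browser.
function createRouteWatcher(
  getPath: () => string,
  onRouteChange: (next: string, previous: string) => void,
): () => boolean {
  let lastPath = getPath()
  return function check(): boolean {
    const next = getPath()
    if (next === lastPath)
      return false
    const previous = lastPath
    lastPath = next
    onRouteChange(next, previous)
    return true
  }
}

// Simulated navigation between two made-up watch routes.
let currentPath = "/watch/1"
const transitions: string[] = []
const checkRoute = createRouteWatcher(
  () => currentPath,
  (next, previous) => { transitions.push(`${previous} -> ${next}`) },
)
currentPath = "/watch/2"
const changed = checkRoute() // true
```

In a content script, `getPath` could return `location.pathname`, and `check` could be called from a `popstate` listener plus a timer or a wrapped `history.pushState`, since `pushState` itself does not fire `popstate`.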

targetCode: LangCodeISO6393,
): StreamingSubtitleTrack | null {
return getOfficialTrackCandidates(tracks).find(track =>
resolveTrackLanguage(track) === targetCode || (isChineseCode(targetCode) && isChineseTrack(track))) ?? null

P2: Prefer an exact Chinese subtitle variant first

When a title exposes both Simplified and Traditional Chinese, this .find() returns the first Chinese track for any Chinese target because isChineseCode(targetCode) && isChineseTrack(track) is true before later entries can be checked for an exact resolveTrackLanguage(track) === targetCode match. If the manifest order is zh-Hant then zh-Hans and the user target is cmn, the official overlay shows Traditional text. Do an exact-match pass before the generic Chinese fallback.
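
The two-pass selection suggested here could look like the sketch below. The helper is generic and takes the language resolver and Chinese checks as parameters, so it does not assume how `resolveTrackLanguage`, `isChineseCode`, or `isChineseTrack` are implemented; `pickTargetTrack`, the `DemoTrack` shape, and the sample codes are hypothetical.

```typescript
// Hypothetical helper: exact language match first, generic Chinese fallback second.
function pickTargetTrack<T>(
  tracks: readonly T[],
  targetCode: string,
  resolveLanguage: (track: T) => string | null,
  isChineseCode: (code: string) => boolean,
  isChineseTrack: (track: T) => boolean,
): T | null {
  // Pass 1: only an exact language match wins.
  const exact = tracks.find(track => resolveLanguage(track) === targetCode)
  if (exact !== undefined)
    return exact

  // Pass 2: any Chinese track, but only when the target itself is Chinese.
  if (!isChineseCode(targetCode))
    return null
  return tracks.find(track => isChineseTrack(track)) ?? null
}

// Made-up tracks in the manifest order from the scenario above.
interface DemoTrack { id: string, resolved: string }
const demoTracks: DemoTrack[] = [
  { id: "zh-Hant", resolved: "cmn-Hant" },
  { id: "zh-Hans", resolved: "cmn" },
]
const picked = pickTargetTrack(
  demoTracks,
  "cmn",
  track => track.resolved,
  code => code.startsWith("cmn"),
  track => track.id.startsWith("zh"),
)
```

With a single-pass `.find()`, the first entry (`zh-Hant`) would win; with the exact-match pass, `picked` is the `zh-Hans` track.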

@semantic-craft

Author

Withdrawing this official-track approach in favor of #1758.

This PR's goal was official Chinese subtitle + official English subtitle comparison. I still want to support that later, but the full official-track alignment path adds enough parsing, matching, and provider-specific code that compressing it under the current small-PR / sub-1000-line review target would make the implementation incomplete.

The replacement PR first lands the smaller, repeatedly locally tested streaming-caption path: capture the player-rendered English cue and translate it through Read Frog's existing pipeline, similar in user-facing behavior to Immersive Translate. That generic adapter shape has been tested locally on Netflix and HBO Max; #1758 only enables Netflix so the diff stays small and reviewable.
