Producer 0.6.8+: streaming-encode SIGTERM with audio:0kB for compositions with auto-injected video timing #899

@terencecho

Summary

After bumping @hyperframes/producer from 0.6.7 → 0.6.10 in a downstream repo, one of three regression fixtures (style-13) started failing deterministically with "Streaming encode failed: FFmpeg exited with code 255". The other two (style-7, style-16) pass with the same bump.

Reproduced across two consecutive CI runs with different runner conditions; not a flake. Looks like it was introduced in 0.6.8 via #832 (perf(engine): faster shader transitions via page-side WebGL compositing), which bundled three coupled changes:

  1. Unconditional data-hf-auto-start sentinel injection in packages/core/src/compiler/timingCompiler.ts (every <video>/<audio> without data-start)
  2. New discoverVideoVisibilityFromTimeline() in packages/producer/src/services/htmlCompiler.ts that overwrites video.start/video.end with opacity-derived windows
  3. New enablePageSideCompositing (default true) bypassing the layered shader-blend path

HF_PAGE_SIDE_COMPOSITING=false does not fix it — the auto-start injection in timingCompiler.ts is unconditional.

Failure mode

{"event":"test_error","suite":"style-13-prod","error":"Streaming encode failed: FFmpeg exited with code 255\nffmpeg stderr (tail):\nvideo:13530kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.020131%\n[libx264 @ ...] frame I:3     Avg QP:12.91  size: 73263\n[libx264 @ ...] frame P:431   Avg QP:14.72  size: 31633\n...\n[libx264 @ ...] kb/s:7661.05\nExiting normally, received signal 15."}

Key signals:

  • libx264 prints its full end-of-encode stats (frame I:3, P:431, kb/s:7661) → encoder finished encoding every frame
  • Exiting normally, received signal 15 → SIGTERM (not an internal crash)
  • audio:0kB is expected for this stage: streamingEncoder.ts is video-only, and audio is muxed in later via assembleStage → muxVideoWithAudio
  • Exit code 255 (not 143) — matches streamingEncoder.ts:421-425 safety timeout firing during the post-frame flush

So FFmpeg encoded all 16s / ~480 frames cleanly, then was SIGTERM'd in the flush/teardown window by the 600s ffmpegStreamingTimeout safety timer in streamingEncoder.ts. The wrapper sees non-zero exit and reports "Streaming encode failed".
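The key signals above can be checked mechanically against a captured stderr tail. The following is a minimal sketch that assumes only the log format quoted in this issue; diagnoseFfmpegTail and EncodeTailDiagnosis are hypothetical names for illustration and are not part of @hyperframes/producer.

```typescript
// Hypothetical diagnostic helper; not an API of @hyperframes/producer.
interface EncodeTailDiagnosis {
  encoderFinished: boolean; // libx264 printed its final kb/s summary line
  terminatedBySignal: number | null; // signal number FFmpeg reported, if any
  framesCounted: number; // sum of the I/P/B frame counts visible in the tail
}

function diagnoseFfmpegTail(stderrTail: string): EncodeTailDiagnosis {
  // libx264 end-of-encode stats look like "frame I:3     Avg QP:12.91 ...".
  const frameLine = /frame ([IPB]):\s*(\d+)/g;
  let framesCounted = 0;
  let m: RegExpExecArray | null;
  while ((m = frameLine.exec(stderrTail)) !== null) {
    framesCounted += Number(m[2]);
  }

  // The "kb/s:" line is the last line of the libx264 summary.
  const encoderFinished = /kb\/s:\s*[\d.]+/.test(stderrTail);

  // FFmpeg logs "received signal N" when it is asked to stop from outside.
  const sig = /received signal (\d+)/.exec(stderrTail);

  return {
    encoderFinished,
    terminatedBySignal: sig ? Number(sig[1]) : null,
    framesCounted,
  };
}
```

A result with encoderFinished set and terminatedBySignal equal to 15 is the combination described here: the encoder completed, then the process was terminated externally. Only frame lines present in the tail are counted, so a truncated tail undercounts.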

Why style-13 specifically

Comparing run telemetry:

  • style-13: staticDuration:16, width:1080, height:1920, audioCount:1, videoCount:1 — no Google Fonts fetch
  • style-16: staticDuration:13.88, width:1080, height:1920, audioCount:1, videoCount:1 — fetches Impact
  • style-7: staticDuration:16.7, width:1920, height:1080, audioCount:1, videoCount:1 — 4 font families

Frame-capture calibration p95Ms was actually higher on style-16 (6094ms, multiplier 8) than style-13 (1708-2399ms, multiplier 2.85-4) and style-16 passed — so the regression isn't simple slowness. Something in style-13's composition shape interacts with one of the three changes in #832.

Strongest suspect: <video> element with no explicit data-start → auto-tagged with data-hf-auto-start → discoverVideoVisibilityFromTimeline() overwrites video.start/video.end with an opacity-binary-searched window. If that window is large, or the binary search adds enough wall-clock time to the probe + render pipeline, the 600s ffmpegStreamingTimeout fires during flush.

Downstream impact

heygen-com/hyperframes-internal PR #328 wants to bump from 0.6.7 → 0.6.10 specifically to pick up the lottieReadiness + import.meta.env fixes from #861 (so it can drop two local patches). All three of 0.6.8 / 0.6.9 / 0.6.10 include the regression, and 0.6.7 is missing #861, so there's no version that gives us both.

What would unblock us

Any of:

  1. Make the auto-injected sentinel + discoverVideoVisibilityFromTimeline() opt-in (config flag or env var, default off). Existing fixtures with explicit data-start are unaffected.
  2. Make discoverVideoVisibilityFromTimeline() non-destructive: only override video.start/video.end when the original values came from auto-injection AND the discovered window is strictly larger than ~1 frame, AND falls inside [0, duration].
  3. Diagnose and fix the actual cause; ship as 0.6.11.
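To make option (2) concrete, here is a rough sketch of the guard as a pure function. Everything in it is an assumption for illustration: VideoWindow, resolveVideoWindow, and the autoInjected / duration / fps options are hypothetical and do not reflect how htmlCompiler.ts actually represents timing.

```typescript
// Hypothetical shapes; the real producer types may differ.
interface VideoWindow {
  start: number; // seconds
  end: number; // seconds
}

interface ResolveOptions {
  autoInjected: boolean; // true when start/end came from the auto-start sentinel
  duration: number; // composition duration in seconds
  fps: number; // target frame rate
}

function resolveVideoWindow(
  original: VideoWindow,
  discovered: VideoWindow | null,
  opts: ResolveOptions,
): VideoWindow {
  // Author-provided timing (explicit data-start) is never overridden.
  if (!opts.autoInjected || discovered === null) return original;

  const oneFrame = 1 / opts.fps;
  const length = discovered.end - discovered.start;
  const insideComposition =
    discovered.start >= 0 && discovered.end <= opts.duration;

  // Accept the discovered window only if it spans more than one frame
  // and lies entirely inside [0, duration]; otherwise keep the original.
  return length > oneFrame && insideComposition ? discovered : original;
}
```

With a guard of this shape, fixtures that set data-start explicitly keep their timing, and a degenerate or out-of-range discovered window falls back to the original values instead of replacing them.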

Happy to send a PR for (1) or (2) if useful — we have a downstream test that flips green/red on this. Just wanted to file the analysis first since the root cause inside the producer pipeline isn't fully pinned down from the outside.

Repro environment

  • Linux CI runner (Dockerfile.test in heygen-com/hyperframes-internal), in-process render mode
  • @hyperframes/producer@0.6.10 + downstream @app/producer-internal
  • Composition: 1080×1920, 16s, 1 video element, 1 audio element, 30fps target
  • Output: SDR mp4, streaming-encode path enabled (default)

cc anyone touching #832 / discoverVideoVisibilityFromTimeline.
