@anastasiuspernat
Contributor

Add FlashVSR context window support for long videos

Problem

Running FlashVSR video upscaling with context windows on long videos (500+ frames) causes two problems:

  1. OOM errors during decoding (115+ latent frames)
  2. Visible transitions at chunk boundaries, caused by latent blending when context windows are used directly

Solution

  1. New flashvsr fuse method (context_windows/context.py)
    No-blend mode for FlashVSR: the overlap provides temporal context, but its predictions are discarded rather than blended.
  • First chunk: weight=1.0 for all frames
  • Later chunks: weight=0 for overlap frames, weight=1.0 for new frames
  2. Chunked decoding with overlap trimming in WanVideoDecode (nodes.py)
    WanVideoDecode uses the context_window settings passed from the Wan Context Window options.
  • Decode chunks with overlap for temporal context
  • Discard overlap frames (no blending)
  • Key fix: the overlap is calculated proportionally to account for FlashVSR's frames_to_trim=3 behavior (with context_frames set to 44, 11 latent frames decode to 41 frames, not 44)
    Calculation: actual_overlap = decoder_output × (latent_overlap / latent_context_frames)
  3. Updated WanVideoContextOptions
    Added flashvsr to the fuse_method dropdown
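
To illustrate the no-blend weighting and overlap trimming described above, here is a minimal standalone sketch. It is not the code from context_windows/context.py or nodes.py; the function names `flashvsr_window_weights` and `stitch_windows` are hypothetical, and plain Python lists stand in for latent or decoded frames.

```python
from typing import List


def flashvsr_window_weights(window_len: int, overlap: int, is_first: bool) -> List[float]:
    """Per-frame weights for one window under a no-blend scheme (hypothetical helper)."""
    if not 0 <= overlap < window_len:
        raise ValueError("overlap must satisfy 0 <= overlap < window_len")
    if is_first:
        # First chunk: every frame is kept.
        return [1.0] * window_len
    # Later chunks: overlap frames only supply context, so they get weight 0.
    return [0.0] * overlap + [1.0] * (window_len - overlap)


def stitch_windows(windows: List[List[int]], overlap: int) -> List[int]:
    """Concatenate window outputs, discarding (not blending) the overlap frames."""
    stitched: List[int] = []
    for index, frames in enumerate(windows):
        weights = flashvsr_window_weights(len(frames), overlap, is_first=(index == 0))
        stitched.extend(f for f, w in zip(frames, weights) if w == 1.0)
    return stitched


# Three windows of 5 frames, each sharing 2 frames with its predecessor.
windows = [[0, 1, 2, 3, 4], [3, 4, 5, 6, 7], [6, 7, 8, 9, 10]]
result = stitch_windows(windows, overlap=2)
```

Because the weights are only ever 0 or 1, each output frame comes from exactly one window, which is what removes the blended transitions at chunk boundaries.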

Usage example (FlashVSR-specific)

  context_schedule: "static_standard"
  context_frames: 44  # Adjust for VRAM
  context_overlap: 16
  fuse_method: "flashvsr"  # NEW
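
As a worked example of the proportional overlap formula from the Solution section, the sketch below plugs in the settings above. The 11 latent frames and 41 decoded frames come from this description; the latent overlap of 4 is an assumption (16 pixel frames divided by a temporal compression factor of 4), and `proportional_overlap` is a hypothetical name. How the result is rounded to a whole number of frames is not stated here, so the sketch leaves it as a float.

```python
def proportional_overlap(decoder_output: int, latent_overlap: int, latent_context_frames: int) -> float:
    """actual_overlap = decoder_output * (latent_overlap / latent_context_frames)."""
    return decoder_output * (latent_overlap / latent_context_frames)


decoder_output = 41          # frames actually produced from 11 latent frames
latent_context_frames = 11   # latent frames per window for context_frames=44
latent_overlap = 4           # assumption: context_overlap=16 divided by 4

actual_overlap = proportional_overlap(decoder_output, latent_overlap, latent_context_frames)
print(f"overlap in decoded frames: {actual_overlap:.2f} (naive value would be 16)")
```

The point of the scaling is that trimming a fixed 16 frames per chunk would remove more than the true overlap, since each chunk decodes to 41 frames rather than 44.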
