Update inference_cli.py Long videos and low RAM refactor by Will-I4M · Pull Request #462 · numz/ComfyUI-SeedVR2_VideoUpscaler

Will-I4M · 2025-12-31T10:46:07Z

Multi-GPU streaming refactor: added --recycle_workers_every and changed dispatch to process per-GPU sub-segments in cycles, respawning workers each cycle to release allocator memory.
Spill-based stitching: introduced disk spill per chunk and incremental stitcher that blends overlaps while streaming to a persistent writer (or PNGs) without holding full segments in RAM.
Output controls: ffmpeg writer now supports configurable codec/pix_fmt/bitrate/CRF/preset and higher bit depths; spill files stored in uint8/uint16 based on --output_bitdepth.
Resilience: added OOM retry with cleanup/backoff for chunk processing; disk saves use atomic writes with retries.
Diagnostics: RAM telemetry per chunk/phase, per-PID monitors for workers, and detailed logging around chunk processing, saving, and stitching.
Input handling: explicit freeing of input tensors (new_frames) between streamed chunks to reduce transient RAM.
CLI additions: --spill_dir, --output_bitdepth, --video_codec, --video_pix_fmt, --video_crf, --video_bitrate, --video_preset, --recycle_workers_every.
Minor safety: pre-parse CUDA device; auto-enable streaming for long videos when --chunk_size is unset to avoid full loads.
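
The resilience item above (OOM retry with cleanup/backoff, atomic disk saves with retries) could look roughly like the sketch below. This is not the PR's code: `atomic_write_bytes`, `run_with_oom_retry`, and all of their parameters are hypothetical names chosen for illustration, and `MemoryError` stands in for whatever out-of-memory exception the real pipeline catches.

```python
import gc
import os
import tempfile
import time


def atomic_write_bytes(path, data, retries=3, delay=0.5):
    """Write `data` to `path` through a temp file and os.replace, retrying on OSError.

    Hypothetical helper: readers never observe a half-written spill file,
    because the final name only appears once the data is fully on disk.
    """
    directory = os.path.dirname(os.path.abspath(path))
    attempts = max(1, retries)
    for attempt in range(attempts):
        tmp_path = None
        try:
            fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)  # atomic when both paths share a filesystem
            return
        except OSError:
            # Remove the partial temp file before retrying or giving up.
            if tmp_path and os.path.exists(tmp_path):
                os.remove(tmp_path)
            if attempt == attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))


def run_with_oom_retry(fn, retries=3, base_delay=1.0,
                       oom_errors=(MemoryError,), cleanup=gc.collect):
    """Call `fn`, retrying after cleanup and exponential backoff on OOM errors.

    Assumption: with PyTorch one would likely pass torch.cuda.OutOfMemoryError
    in `oom_errors` and a cleanup that also empties the CUDA cache.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except oom_errors:
            if attempt == retries:
                raise
            cleanup()
            time.sleep(base_delay * (2 ** attempt))
```

Writing to a temporary file in the same directory keeps the final rename on one filesystem, which is what makes `os.replace` atomic.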

adrientoupet · 2026-01-01T09:14:19Z

Thank you for the PR @Will-I4M - there are quite a lot of changes. Did you test it with single and multi-GPU setups, including on long videos? Is there anything in the code update I need to consider when doing the review?

I'll try to get to it next week, I'm unavailable this week. Thanks again.

Will-I4M · 2026-01-01T18:18:53Z

Indeed, my apologies for pushing several feature changes all at once. Perhaps it would be wiser to split inference_cli into two parts, keeping the common sections shared and specializing one version for low RAM usage. In this PR, even without --debug, the code prints several messages about RAM usage. That isn't necessarily desirable for everyone, though I find it very useful given SeedVR2's high RAM and VRAM requirements.

I tested it in single-GPU and multi-GPU configurations. My main focus in this PR was RAM usage. There is generally better resilience to VRAM-related OOMs, but that's not the biggest change. In the initial version, all chunks are kept in RAM after processing until the final step. That's the biggest change: in this version, only what is necessary (stitching, etc.) is kept in RAM.

In the initial version, in multi-GPU mode, if a single process crashed during calculation (out of order...), the others would still continue unnecessarily until the end. The code is now more resilient and retries after somewhat aggressive memory cleanup attempts. I tested several strategies to free up RAM, all of which proved unsuccessful until this latest version: the system no longer swaps unnecessarily.

I was able to process several hours of video using the 7B model (fp16/4k/21 frames of context) on 4x3090 in a reasonable time, something I couldn't do before (even for 25-minute videos), so I think this work can be useful to the SeedVR2 user community.
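
To illustrate the idea of keeping only what stitching needs in RAM, here is a minimal sketch of an incremental stitcher that holds back just the overlapping tail of the previous chunk and crossfades it with the head of the next one. It is an assumption-laden illustration, not the PR's implementation: `stitch_stream`, the linear blend weights, and the chunk layout are all hypothetical. Frames can be any objects that support scalar multiplication and addition (for example NumPy arrays or tensors; plain floats are used in the usage note).

```python
def stitch_stream(chunks, overlap):
    """Yield stitched frames one by one, crossfading `overlap` frames between chunks.

    Hypothetical sketch: each chunk is a sequence of frames whose first
    `overlap` frames cover the same time span as the last `overlap` frames
    of the previous chunk. Only that tail is kept in memory between chunks.
    """
    tail = []  # last `overlap` frames of the stitched output so far
    for chunk in chunks:
        frames = list(chunk)
        if len(frames) < overlap:
            raise ValueError("each chunk must contain at least `overlap` frames")
        n = len(tail)
        blended = []
        for k in range(n):
            # Linear crossfade: weight shifts from the old chunk to the new one.
            w = (k + 1) / (n + 1)
            blended.append(tail[k] * (1.0 - w) + frames[k] * w)
        merged = blended + frames[n:]
        if overlap > 0:
            # Emit everything except the new tail, which may still be blended.
            for frame in merged[:-overlap]:
                yield frame
            tail = merged[-overlap:]
        else:
            for frame in merged:
                yield frame
    # After the last chunk nothing will overlap the tail, so flush it.
    for frame in tail:
        yield frame
```

For example, `list(stitch_stream([[0.0] * 4, [10.0] * 4], overlap=2))` produces six frames: two from the first chunk, two blended frames, and two from the second chunk. Because the function is a generator, a caller can pass each yielded frame straight to a video writer instead of accumulating a full segment.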

shodan5000 · 2026-01-02T02:26:50Z

Pardon my ignorance, is this something that would, or could, benefit the ComfyUI version as well?

Will-I4M · 2026-01-05T13:14:32Z

Not with this PR. But it could be useful to refactor the ComfyUI version as well.

cd1188 · 2026-01-10T21:10:15Z

Hi.
If I use chunk-streaming with "--video_backend ffmpeg", it always hangs on the last chunk without an error...
I guess there is a problem with the last write to the output file.
If I remove the "--video_backend" switch, it works, but OpenCV then uses h263?!
The standard inference_cli.py from seedvr2.5.24 works fine with ffmpeg and --chunk_size 44.
Don't ask why 44... it's my workflow :)

BTW, your file still has the bug where "--prepend_frames" does not work.
Link is issue-472

thehhmdb · 2026-01-17T18:02:53Z

I have the same problem as above with --video_backend ffmpeg hanging.

Will-I4M · 2026-01-18T17:31:49Z

OK, thank you. I'll have a look at this next week.

The FFMPEGVideoWriter.release() method called self.proc.wait() without a timeout, causing the entire process to hang forever if ffmpeg didn't terminate cleanly after stdin was closed. This was reported in PR numz#462 where chunk-streaming with --video_backend ffmpeg would hang on the last chunk without any error message.

Changes:
- Add 120s timeout to proc.wait() with SIGTERM/SIGKILL escalation
- Capture ffmpeg stderr via background thread for diagnostics (previously stderr=DEVNULL swallowed all error output)
- Add -loglevel warning to ffmpeg args to minimize stderr volume while still capturing meaningful errors
- Include ffmpeg stderr in error messages for write() and release()
- Add __del__ safety net to clean up ffmpeg process on GC

https://claude.ai/code/session_01Qqj52TTGPz6BPHFGMYqALr
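
The timeout-plus-escalation pattern described in this commit message can be sketched with the standard library alone. The code below is a hedged illustration rather than the actual FFMPEGVideoWriter code: `drain_stderr`, `release_with_timeout`, and the `grace` parameter are hypothetical names, and the 120-second default simply mirrors the value quoted above.

```python
import subprocess
import threading


def drain_stderr(proc, sink):
    """Collect the child's stderr lines into `sink` (run this in a background thread).

    Draining continuously keeps the pipe from filling up, which could
    otherwise block the child, and preserves the output for error messages.
    """
    for raw in proc.stderr:
        sink.append(raw.decode(errors="replace").rstrip())


def release_with_timeout(proc, timeout=120.0, grace=5.0):
    """Close stdin and wait for `proc`, escalating to terminate() and then kill().

    Returns the child's exit code. On POSIX, terminate() sends SIGTERM and
    kill() sends SIGKILL.
    """
    if proc.stdin is not None:
        try:
            proc.stdin.close()  # signals end of input to the child
        except OSError:
            pass  # the pipe may already be broken if the child died
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()
        try:
            return proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()
            return proc.wait()
```

A caller would typically start `threading.Thread(target=drain_stderr, args=(proc, lines), daemon=True)` right after creating the process with `stderr=subprocess.PIPE`, then include the collected `lines` in any error raised after `release_with_timeout` returns a non-zero code.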

ovo-Tim mentioned this pull request Feb 5, 2026

The memory usage still keeps increasing in streaming mode. #445

Open

Will-I4M wants to merge 1 commit into numz:main from Will-I4M:main