Amazing image quality, problematic for video (and other notes) #200
-
Hi there, I won't reply right now - it would take too long to cover all your points. I'm planning to release a new tutorial with the next release of SeedVR2 - let's talk again once that's out, as that should cover most of your issues. Cheers
-
Hi @mindplay-dk I've been using Topaz for years and thought it was the best, but after finding SeedVR2 months ago, and having tried over 10 video upscaling apps, I can honestly say that no app even comes close to SeedVR2's quality. My sincere thanks to everyone who contributed to this.

I initially struggled with the settings, but that's how I found the "golden setting." Because every hardware setup is different, there's a "golden setting" specific to each. I recommend using the full nightly version, the 3B Q8 model, and a `batch_size` of at least 45. I used to use 101, but now I use 121 (I have 16 GB of VRAM).

Most importantly, make sure your input video is small enough that you can upscale it by at least 2x-3x; I recommend 3x for the best quality. Reduce the video size by 2x and shorten the video length. I'm using a 3-minute video at 15 fps. My current videos are 320x240; I upscale them 3x, and the results are great.

Topaz SLM and SLS can't produce images of this quality. I've been testing them too, and I previously shared a comparison on the Topaz Forum, but the company couldn't stand the difference in quality and deleted them all from the forum. You can also find those comparisons here.

I'm sharing my workflow so you can follow it. I have a 5070 Ti 16GB, which is similar to your system.
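The "reduce the video size by 2x, cap it at 15 fps, then upscale 3x" pre-processing step above can be scripted. A minimal sketch - the helper function and filenames are my own, not part of SeedVR2; it only builds a standard ffmpeg command line:

```python
# Hypothetical pre-processing helper (not part of SeedVR2): builds an
# ffmpeg command for the "downscale first, then upscale" workflow above.
def downscale_cmd(src, dst, factor=2, fps=15):
    """Build an ffmpeg command that divides the resolution by `factor`
    and caps the frame rate, producing a smaller input for 2x-3x upscaling."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=iw/{factor}:ih/{factor}",  # halve width and height
        "-r", str(fps),                           # 15 fps, as in the comment
        dst,
    ]

cmd = downscale_cmd("input.mp4", "small.mp4")
print(" ".join(cmd))
# run it with e.g. subprocess.run(cmd, check=True)
```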
-
I've been tinkering with this model, and it is quite impressive!
I've never seen image, much less video upscaling quality like this - I honestly think this makes expensive products like Topaz look rather inferior.
That said, it seems to work better for images than for video, because of the batch size limitation.
I'm upscaling a video from 1080p to 1440p on a system with an RTX 4060 Ti 16GB (and an AMD 7900 12-core with 32GB RAM), and it took a long time to find a configuration that worked at all:
As you can see, I've disconnected the BlockSwap Config, because it didn't seem to help with memory issues at all - I'm not sure if it's currently working? In the tutorial video, I saw the console output saying when it swapped the blocks, but that does not seem to happen on my system.
I had to enable the `tiled_vae` feature and `preserve_ram`, and even then, I have not been able to push it beyond a `batch_size` of 9. With such a small batch size, the artifacts between batches are very frequent and very visible.
Now, for this kind of quality, and since it's just a one-time thing to restore my old videos, I wouldn't mind paying for a few hours of GPU cloud at all - but I don't see how this is going to work very well, even on a 40GB GPU.
I mean, ideally, you'd want to process in batches large enough to span the gap between key frames, but I can't see how that's going to happen.
Do you have anything else planned for this project?
Like, maybe there's a way to work in batches of 5 and somehow "transfer" the fine noise/details to the first frame of the next batch? (full disclosure: I have no idea what I'm talking about.)
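One common mitigation for seams like this (a sketch of the general idea, not something the project is confirmed to support) is to overlap consecutive batches by a few frames and crossfade the overlap, so the per-batch noise pattern fades in and out instead of switching abruptly:

```python
import numpy as np

def blend_overlap(prev_tail, next_head):
    """Crossfade the frames where two independently-upscaled batches overlap.

    Both arrays have shape (overlap, H, W, C) and cover the *same* source
    frames; the weight ramps from the previous batch to the next one, so
    the batch-to-batch noise "skip" is spread over several frames.
    """
    overlap = prev_tail.shape[0]
    # weight goes 1.0 (all previous batch) -> 0.0 (all next batch)
    w = np.linspace(1.0, 0.0, overlap)[:, None, None, None]
    return prev_tail * w + next_head * (1.0 - w)

# Toy example: previous batch renders the overlap as all-1s, next as all-0s.
a = np.ones((3, 2, 2, 1))
b = np.zeros((3, 2, 2, 1))
out = blend_overlap(a, b)  # frame means: 1.0, 0.5, 0.0
```

The cost is re-processing the overlapping frames twice, which hurts when the usable batch size is already only 9.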
Alternatively, maybe it makes sense to train a smaller model? Maybe 2B or 1.5B? I'm using the 3B model, and it is already way beyond the quality of anything else I've ever seen, so. :-)
I did try the Q4 version, and it actually doesn't look worse than the FP8 version - but it seems to use about the same amount of VRAM? For that matter, the FP16 version doesn't look better to me either - if anything, it looks slightly more noisy? It does use more VRAM, and it is considerably slower. In a side-by-side comparison between Q4 and FP16, I can't spot any significant difference. (Though I'd assume the material in question might make some difference...)
As for the `preserve_vram` option, is this really useful? My VRAM peaks at 68% instead of 70%, and the processing speed drops from 0.12 to 0.07 fps - that doesn't seem like a meaningful tradeoff?

With regards to the experimentation you need to do to find settings with acceptable memory usage, there's also this very time-consuming retry that it performs:
That first attempt is very fast - then the second attempt takes more than a full minute, just to fail. I wish it wouldn't retry at all, because retrying has never once worked for me - it's just a very long wait for a much longer error message that essentially says you're out of RAM.
One last thought: does it make sense to pick the tile size in pixels? Wouldn't it make more sense to say e.g. "6 tiles" or "8 tiles" etc. and just figure out the least wasteful tile size? I mean, I could probably figure out how to compute it myself, but...
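For what it's worth, the "pick a tile count instead of a pixel size" computation is simple. A minimal sketch - the function, the grid parameters, and the 16-pixel alignment are my own assumptions, not the node's actual logic:

```python
import math

def tile_size(width, height, tiles_x, tiles_y, align=16):
    """Smallest align-multiple tile size so a tiles_x x tiles_y grid
    covers the whole frame, i.e. the least wasteful size for that grid.

    NOTE: hypothetical helper for illustration; SeedVR2's tiling may
    align or overlap tiles differently.
    """
    tw = math.ceil(width / (tiles_x * align)) * align
    th = math.ceil(height / (tiles_y * align)) * align
    return tw, th

# e.g. a 1920x1080 frame split into a 3x2 grid of 16-aligned tiles
print(tile_size(1920, 1080, 3, 2))  # -> (640, 544)
```

The UI could then expose just "tiles across / tiles down" and derive the pixel size per input resolution.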
By the way, there is no visible down side to using tiles, as far as I can tell? Does it significantly increase the processing time? If not, it might be a good idea to keep this on by default.
Anyhow, this is all I could think of. It is a truly great model! I only wish there were some way to beat the batch size limitation. Right now it's almost better to use a smaller batch size - at least then the noise becomes more of a constant texture and less of a jarring distraction when it skips between batches. That's definitely my biggest issue with this model.