Amazing image quality, problematic for video (and other notes) #200
-
Hi there, I won't reply right now - it would take too long to cover all your points. I'm planning to release a new tutorial with the next release of SeedVR2 - let's talk again once that's out, as that should cover most of your issues. Cheers
-
Hi @mindplay-dk I've been using Topaz for years and thought it was the best, but after finding SeedVR2 months ago, and having tried over 10 video upscaling apps, I can honestly say that no app even comes close to SeedVR2's quality. My sincere thanks to everyone who contributed to this.

I initially struggled with the settings, but that's how I found the "golden setting." Because every hardware setup is different, there's a "golden setting" specific to each. I recommend using the full nightly version, the 3B Q8 model, and a `batch_size` of at least 45. I used to use 101, but now I use 121 (I have 16 GB of VRAM).

Most importantly, make sure your input video is small enough that you can upscale it by at least 2x-3x; I recommend 3x for the best quality. Reduce the video size by 2x and shorten the video length. I'm using a 3-minute video at 15 fps. My current videos are 320x240; I upscale them 3x, and the results are great.

Topaz SLM and SLS can't produce images of this quality. I've been testing them too, and I previously shared a comparison on the Topaz Forum, but the company couldn't stand the difference in quality and deleted them all from the forum. You can also find those comparisons here.

I'm sharing my workflow so you can follow it. I have a 5070 Ti 16GB, which is similar to your system.
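The "reduce the video size by 2x, cap it at 15 fps, then upscale 3x" pre-processing step above can be scripted. A minimal sketch - the helper function and filenames are my own, not part of SeedVR2; it only builds a standard ffmpeg command line:

```python
# Hypothetical pre-processing helper (not part of SeedVR2): builds an
# ffmpeg command for the "downscale first, then upscale" workflow above.
def downscale_cmd(src, dst, factor=2, fps=15):
    """Build an ffmpeg command that divides the resolution by `factor`
    and caps the frame rate, producing a smaller input for 2x-3x upscaling."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=iw/{factor}:ih/{factor}",  # halve width and height
        "-r", str(fps),                           # 15 fps, as in the comment
        dst,
    ]

cmd = downscale_cmd("input.mp4", "small.mp4")
print(" ".join(cmd))
# run it with e.g. subprocess.run(cmd, check=True)
```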
-
I've been tinkering with this model, and it is quite impressive!
I've never seen image, much less video upscaling quality like this - I honestly think this makes expensive products like Topaz look rather inferior.
That said, it seems to work better for images than for video, because of the batch size limitation.
I'm upscaling a video from 1080p to 1440p on a system with an RTX 4060 Ti 16GB (and an AMD 7900 12-core with 32GB RAM), and it took a long time to find a configuration that worked at all:
As you can see, I've disconnected the BlockSwap Config, because it didn't seem to help with memory issues at all - I'm not sure if it's currently working? In the tutorial video, I saw the console output saying when it swapped the blocks, but that does not seem to happen on my system.
I had to enable the `tiled_vae` feature and `preserve_ram`, and even then, I have not been able to push it beyond a `batch_size` of 9. With such a small batch size, the artifacts between batches are very frequent and very visible.
Now, for this kind of quality, and since it's just a one-time thing to restore my old videos, I wouldn't mind paying for a few hours of GPU cloud at all - but I don't see how this is going to work very well, even on a 40GB GPU.
I mean, ideally, you'd want to process in batches large enough to span the gap between key frames, but I can't see how that's going to happen.
Do you have anything else planned for this project?
Like, maybe there's a way to work in batches of 5 and somehow "transfer" the fine noise/details to the first frame of the next batch? (full disclosure: I have no idea what I'm talking about.)
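One common mitigation for seams like this (a sketch of the general idea, not something the project is confirmed to support) is to overlap consecutive batches by a few frames and crossfade the overlap, so the per-batch noise pattern fades in and out instead of switching abruptly:

```python
import numpy as np

def blend_overlap(prev_tail, next_head):
    """Crossfade the frames where two independently-upscaled batches overlap.

    Both arrays have shape (overlap, H, W, C) and cover the *same* source
    frames; the weight ramps from the previous batch to the next one, so
    the batch-to-batch noise "skip" is spread over several frames.
    """
    overlap = prev_tail.shape[0]
    # weight goes 1.0 (all previous batch) -> 0.0 (all next batch)
    w = np.linspace(1.0, 0.0, overlap)[:, None, None, None]
    return prev_tail * w + next_head * (1.0 - w)

# Toy example: previous batch renders the overlap as all-1s, next as all-0s.
a = np.ones((3, 2, 2, 1))
b = np.zeros((3, 2, 2, 1))
out = blend_overlap(a, b)  # frame means: 1.0, 0.5, 0.0
```

The cost is re-processing the overlapping frames twice, which hurts when the usable batch size is already only 9.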
Alternatively, maybe it makes sense to train a smaller model? Maybe 2B or 1.5B? I'm using the 3B model, and it is already way beyond the quality of anything else I've ever seen, so. :-)
I did try the Q4 version, and it actually doesn't look worse than the FP8 version - but it seems to use about the same amount of VRAM? For that matter, the FP16 version doesn't look better to me either - if anything, it looks slightly more noisy? It does use more VRAM, and it is considerably slower. In a side-by-side comparison between Q4 and FP16, I can't spot any significant difference. (Though I'd assume the material in question might make some difference...)
As for the `preserve_vram` option, is this really useful? My VRAM peaks at 68% instead of 70%, and the processing speed drops from 0.12 to 0.07 fps - that doesn't seem like a meaningful tradeoff?

With regards to the experimentation you need to do to find settings with acceptable memory usage, there's also this very time-consuming retry that it performs:
That first attempt is very fast - then the second attempt takes more than a full minute, just to fail. I wish it wouldn't retry at all, because retrying has never once worked for me - it's just a very long wait for a much longer error message that essentially says you're out of RAM.
One last thought: does it make sense to pick the tile size in pixels? Wouldn't it make more sense to say e.g. "6 tiles" or "8 tiles" etc. and just figure out the least wasteful tile size? I mean, I could probably figure out how to compute it myself, but...
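For what it's worth, the "pick a tile count instead of a pixel size" computation is simple. A minimal sketch - the function, the grid parameters, and the 16-pixel alignment are my own assumptions, not the node's actual logic:

```python
import math

def tile_size(width, height, tiles_x, tiles_y, align=16):
    """Smallest align-multiple tile size so a tiles_x x tiles_y grid
    covers the whole frame, i.e. the least wasteful size for that grid.

    NOTE: hypothetical helper for illustration; SeedVR2's tiling may
    align or overlap tiles differently.
    """
    tw = math.ceil(width / (tiles_x * align)) * align
    th = math.ceil(height / (tiles_y * align)) * align
    return tw, th

# e.g. a 1920x1080 frame split into a 3x2 grid of 16-aligned tiles
print(tile_size(1920, 1080, 3, 2))  # -> (640, 544)
```

The UI could then expose just "tiles across / tiles down" and derive the pixel size per input resolution.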
By the way, there is no visible down side to using tiles, as far as I can tell? Does it significantly increase the processing time? If not, it might be a good idea to keep this on by default.
Anyhow, this is all I could think of. It is a truly great model! I only wish there were some way to beat the batch size limitation. Right now it's almost better to use a smaller batch size - at least then the noise becomes more of a constant texture and less of a jarring distraction when it skips between batches. That's definitely my biggest issue with this model.