Overview: This work presents GenDR⚡, a one-step diffusion model for generative detail restoration, distilled from a tailored diffusion model with a larger latent space to resolve the dilemma arising from the misalignment between T2I and SR tasks.
1) SD2.1-VAE16: To expand to a high-dimensional latent space without enlarging the model, we train a new SD2.1-VAE16 (0.9B) via representation alignment.
2) CiD/CiDA: We propose consistent score identity distillation (CiD), which incorporates an SR task-specific loss into score distillation to exploit more SR priors and align the training target. We further extend CiD with adversarial learning and representation alignment (CiDA) to enhance perceptual quality and accelerate training.
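The training objective pairs a score-distillation term with an SR-specific fidelity term. The PyTorch sketch below illustrates that idea only and is not the released implementation: the actual CiD objective follows SiD's score-identity formulation, and all module names and signatures here (`generator`, `teacher_score`, `fake_score`, `vae_decode`) are hypothetical.

```python
import torch
import torch.nn.functional as F


def diffuse(z0, noise, t, alphas_cumprod):
    """Standard DDPM forward process q(z_t | z_0) for a batch of timesteps t."""
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a.sqrt() * z0 + (1.0 - a).sqrt() * noise


def cid_style_loss(generator, teacher_score, fake_score, vae_decode,
                   lr_img, hr_img, t, alphas_cumprod, lam_sr=1.0):
    """Schematic CiD-like training step (simplified; interfaces are assumptions)."""
    # One-step restoration: the distilled generator maps the LR condition
    # directly to clean SR latents, with no iterative sampling.
    z_sr = generator(lr_img)

    # Re-noise the generated latents so both score networks can be queried.
    noise = torch.randn_like(z_sr)
    z_t = diffuse(z_sr, noise, t, alphas_cumprod)

    # Score-distillation term: pull the generator's samples toward the
    # teacher's score field (simplified from the SiD/CiD identity form).
    with torch.no_grad():
        eps_teacher = teacher_score(z_t, t, lr_img)
    eps_fake = fake_score(z_t, t, lr_img)
    loss_distill = F.mse_loss(eps_fake, eps_teacher)

    # SR task-specific term that CiD adds on top of score distillation:
    # pixel-space fidelity between the decoded output and the HR target.
    sr_img = vae_decode(z_sr)
    loss_sr = F.mse_loss(sr_img, hr_img)

    return loss_distill + lam_sr * loss_sr
```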
More visual comparisons from RealLR200 (click to expand)
The repo is still under construction.
- [2025.03] Repo and project page created. The open-sourcing process is in the audit stage and depends on company policy. If all goes well, the code and weights will be released within the next two months.
We would like to thank Ostris, RealESRGAN, OSEDiff, SiD, et al. for their enlightening work!
```bibtex
@article{wang2025gendr,
  title={GenDR: Lightning Generative Detail Restorator},
  author={Wang, Yan and Zhao, Shijie and Chen, Kai and Zhang, Kexin and Li, Junlin and Zhang, Li},
  journal={arXiv preprint arXiv:2503.06790},
  year={2025}
}
```