[feat] Add UCGM Scheduler #12912
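Tests for Z-Image-Turbo:

from diffusers import QwenImagePipeline, ZImagePipeline
from diffusers.schedulers.scheduling_ucgm_anystep import UCGMScheduler
import torch

torch_dtype = torch.bfloat16
device = "cuda"  # not defined in the original snippet; "cuda" matches the generator used below

sampler = UCGMScheduler(
    stochast_ratio=0.0,
    extrapol_ratio=0.0,
    time_dist_ctrl=[1.1, 0.8, 1.1],
    rfba_gap_steps=[0.001, 0.0],
    sampling_style="mul",  # "few", "mul", "any"
)
pipe = ZImagePipeline.from_pretrained("Tongyi-MAI/Z-Image-Turbo", scheduler=sampler, torch_dtype=torch_dtype)
pipe = pipe.to(device)

# The prompt is in Chinese: a photorealistic portrait of a young East Asian woman in
# traditional red-and-gold dress, holding a glowing neon sign that reads "TwinFlow So Fast",
# with the Giant Wild Goose Pagoda in Xi'an blurred in the night-time background.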
prompt = "一张逼真的年轻东亚女性肖像,位于画面中心偏左的位置,带着浅浅的微笑直视观者。她身着以浓郁的红色和金色为主的传统中式服装。她的头发被精心盘起,饰有精致的红色和金色花卉和叶形发饰。她的眉心之间额头上绘有一个小巧、华丽的红色花卉图案。她左手持一把仿古扇子,扇面上绘有一位身着传统服饰的女性、一棵树和一只鸟的场景。她的右手向前伸出,手掌向上,托着一个悬浮的发光的霓虹黄色灯牌,上面写着“TwinFlow So Fast”,这是画面中最亮的元素。背景是模糊的夜景,带有暖色调的人工灯光,一场户外文化活动或庆典。在远处的背景中,她头部的左侧略偏,是一座高大、多层、被暖光照亮的西安大雁塔。中景可见其他模糊的建筑和灯光,暗示着一个繁华的城市或文化背景。光线是低调的,灯牌为她的脸部和手部提供了显著的照明。整体氛围神秘而迷人。人物的头部、手部和上半身完全可见,下半身被画面底部边缘截断。图像具有中等景深,主体清晰聚焦,背景柔和模糊。色彩方案温暖,以红色、金色和闪电的亮黄色为主。"
negative_prompt = " "  # use a blank string if there is no specific concept to remove
width, height = (768, 1024)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=8,
    guidance_scale=0.0,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]
image.save("example.png")

Tests for Qwen-Image:

from diffusers import QwenImagePipeline, ZImagePipeline
from diffusers.schedulers.scheduling_ucgm_anystep import UCGMScheduler
import torch

torch_dtype = torch.bfloat16
device = "cuda"  # not defined in the original snippet; "cuda" matches the generator used below

sampler = UCGMScheduler(
    stochast_ratio=0.0,
    extrapol_ratio=0.0,  # set to 0.5 to reduce sampling steps for free
    time_dist_ctrl=[1.1, 0.8, 1.1],
    rfba_gap_steps=[0.001, 0.0],
    sampling_style="mul",  # "few", "mul", "any"
)
pipe = QwenImagePipeline.from_pretrained("Qwen/Qwen-Image", scheduler=sampler, torch_dtype=torch_dtype)
pipe = pipe.to(device)

# Same Chinese prompt as in the Z-Image-Turbo test: a portrait of a young East Asian woman
# in traditional dress holding a neon sign that reads "TwinFlow So Fast".
prompt = "一张逼真的年轻东亚女性肖像,位于画面中心偏左的位置,带着浅浅的微笑直视观者。她身着以浓郁的红色和金色为主的传统中式服装。她的头发被精心盘起,饰有精致的红色和金色花卉和叶形发饰。她的眉心之间额头上绘有一个小巧、华丽的红色花卉图案。她左手持一把仿古扇子,扇面上绘有一位身着传统服饰的女性、一棵树和一只鸟的场景。她的右手向前伸出,手掌向上,托着一个悬浮的发光的霓虹黄色灯牌,上面写着“TwinFlow So Fast”,这是画面中最亮的元素。背景是模糊的夜景,带有暖色调的人工灯光,一场户外文化活动或庆典。在远处的背景中,她头部的左侧略偏,是一座高大、多层、被暖光照亮的西安大雁塔。中景可见其他模糊的建筑和灯光,暗示着一个繁华的城市或文化背景。光线是低调的,灯牌为她的脸部和手部提供了显著的照明。整体氛围神秘而迷人。人物的头部、手部和上半身完全可见,下半身被画面底部边缘截断。图像具有中等景深,主体清晰聚焦,背景柔和模糊。色彩方案温暖,以红色、金色和闪电的亮黄色为主。"
negative_prompt = " "  # use a blank string if there is no specific concept to remove
width, height = (768, 1024)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]
image.save("example.png")
Thanks @QAQdev, can you please post some example result images from this scheduler?

@asomoza Sure.
Standard ODE sampling
For Z-Image-Turbo, 8 steps, no CFG, using the UCGM sampler:
ODE sampling with extrapolation
For Qwen-Image, 50 steps, cfg=4.0

Thanks! It looks good, it's nice to have another scheduler.

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
yiyixuxu left a comment
Happy to add this scheduler! However, can you add a doc page first to explain the different use cases? At first glance I have a few questions:
How do you come up with time_dist_ctrl / rfba_gap_steps values?
Does extrapolation work on distilled checkpoints like Z-Image-Turbo? If not, what's the benefit of using UCGM there?
Hi, thanks for the timely reply. I follow the original implementation of UCGM: https://github.com/LINs-lab/UCGM/blob/main/methodes/unigen.py#L310-L451
Also, may I ask how to add a doc page here? This is my first PR.
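For readers wondering what a three-value setting such as time_dist_ctrl might control, here is a minimal sketch. It assumes, without confirmation from this thread, that the three values parameterise a Kumaraswamy-style warp of a uniform time grid. The function names `warp_time` and `make_timesteps` are hypothetical and are not part of the PR.

```python
def warp_time(t, a, b, c):
    """Kumaraswamy-style warp of t in [0, 1] (assumed form, not taken from the PR)."""
    return (1.0 - (1.0 - t ** a) ** b) ** c


def make_timesteps(num_steps, ctrl=(1.1, 0.8, 1.1)):
    """Warp a uniform grid of num_steps + 1 points with the three control values."""
    a, b, c = ctrl
    uniform = [i / num_steps for i in range(num_steps + 1)]
    return [warp_time(t, a, b, c) for t in uniform]


# With ctrl=(1, 1, 1) the warp is the identity; other values redistribute where
# the steps are concentrated while keeping the endpoints 0 and 1 fixed.
timesteps = make_timesteps(8)
```

Under this assumption the endpoints stay fixed and only the spacing between steps changes, so such values could be tuned per model by comparing sample quality at a fixed step count.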




What does this PR do?
This PR introduces the UCGM sampler (paper: https://arxiv.org/abs/2505.07447). The UCGM sampler adds extrapolation parameters that can reduce the number of sampling steps for free.
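To make the idea of extrapolation concrete, here is a toy, one-dimensional sketch. It is not the PR's implementation: it assumes a linear path x_t = (1 - t) * x0 + t * z, an exact Gaussian denoiser, and that extrapolation means pushing the current estimates further along the direction of change from the previous step, scaled by a ratio `kappa`. All names (`gaussian_denoiser`, `sample`, `kappa`) are hypothetical.

```python
def gaussian_denoiser(x_t, t, mu, sigma):
    """E[x0 | x_t] for x0 ~ N(mu, sigma^2) and x_t = (1 - t) * x0 + t * z, z ~ N(0, 1)."""
    var = (1.0 - t) ** 2 * sigma ** 2 + t ** 2
    return mu + (1.0 - t) * sigma ** 2 / var * (x_t - (1.0 - t) * mu)


def sample(x_start, steps, kappa, mu=2.0, sigma=0.5):
    """Deterministic sampling from t=1 (noise) to t=0 (data) with optional extrapolation."""
    ts = [1.0 - i / steps for i in range(steps + 1)]  # 1.0, ..., 0.0
    x = x_start
    prev = None  # raw (x0_hat, z_hat) from the previous step
    for t, t_next in zip(ts[:-1], ts[1:]):
        x0_hat = gaussian_denoiser(x, t, mu, sigma)
        z_hat = (x - (1.0 - t) * x0_hat) / t  # noise estimate implied by x0_hat
        x0_use, z_use = x0_hat, z_hat
        if prev is not None and kappa != 0.0:
            # Assumed extrapolation rule: move past the current estimate along
            # the change observed since the previous step.
            x0_use = x0_hat + kappa * (x0_hat - prev[0])
            z_use = z_hat + kappa * (z_hat - prev[1])
        prev = (x0_hat, z_hat)
        x = (1.0 - t_next) * x0_use + t_next * z_use
    return x


plain = sample(1.5, steps=8, kappa=0.0)
extrapolated = sample(1.5, steps=8, kappa=0.5)
```

With `kappa=0.0` the loop reduces to plain deterministic sampling. A non-zero `kappa` reuses the estimate stored from the previous step and adds no model evaluations, which is one way to read the "for free" claim.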
In addition, to accommodate any-step generative models, the scheduler adds sampling modes (the sampling_style options "few", "any", and "mul") so that it supports few-step, any-step, and classic multi-step sampling.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.