[Feature] TwinFlow: Qwen Image and Z-Image-Turbo

### Feature Summary

TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

### Detailed Description

It's quite impressive how they made Qwen Image and Z-Image-Turbo (ZIT) dramatically faster by cutting sampling down to one or a few steps (NFEs).

https://huggingface.co/inclusionAI/TwinFlow-Z-Image-Turbo
https://huggingface.co/inclusionAI/TwinFlow
https://github.com/inclusionAI/TwinFlow

<img width="2752" height="1536" alt="Image" src="https://github.com/user-attachments/assets/18d21359-c7d7-4a5b-9fcf-61965207fdf4" />

<img width="1837" height="717" alt="Image" src="https://github.com/user-attachments/assets/b3100f62-7f74-41f8-8947-da1ffef08710" />
https://zhenglin-cheng.com/twinflow/

"Recent advances in large multi-modal generative models have demonstrated impressive capabilities in multi-modal generation, including image and video generation. These models are typically built upon multi-step frameworks like diffusion and flow matching, which inherently limits their inference efficiency, requiring 40-100 Number of Function Evaluations (NFEs). While various few-step methods aim to accelerate the inference, existing solutions have clear limitations. Prominent distillation-based methods, such as progressive and consistency distillation, either require an iterative distillation procedure or show significant degradation at very few steps (< 4-NFE). Meanwhile, integrating adversarial training into distillation (e.g., DMD/DMD2 and SANA-Sprint) to enhance performance introduces training instability, added complexity, and high GPU memory overhead due to the auxiliary trained models. To this end, we propose TwinFlow, a simple yet effective framework for training 1-step generative models that bypasses the need for fixed pretrained teacher models and avoids standard adversarial networks during training, making it ideal for building large-scale, efficient models. On text-to-image tasks, our method achieves a GenEval score of 0.83 in 1-NFE, outperforming strong baselines like SANA-Sprint (a GAN loss-based framework) and RCGM (a consistency-based framework). Notably, we demonstrate the scalability of TwinFlow by full-parameter training on Qwen-Image-20B and transform it into an efficient few-step generator. With just 1-NFE, our approach matches the performance of the original 100-NFE model on both the GenEval and DPG-Bench benchmarks, reducing computational cost by 100× with minor quality degradation."
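For anyone unfamiliar with the NFE terminology in the abstract, here is a minimal sketch of how the count arises in a plain fixed-step Euler sampler for a flow model. It is not TwinFlow's code: `euler_sample` and `toy_velocity` are hypothetical names, and the constant velocity field is only a stand-in for the real network.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int,
) -> Tuple[Vector, int]:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler.

    Returns the final sample and the number of function evaluations (NFE),
    i.e. how many times the model (here: `velocity`) was called.
    """
    x = list(x0)
    nfe = 0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity(x, t)  # one network forward pass in a real sampler
        nfe += 1
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, nfe


def toy_velocity(x: Vector, t: float) -> Vector:
    # Hypothetical stand-in for the model: a constant velocity field.
    # Real models are not this simple, which is why naive 1-step
    # sampling normally degrades quality.
    return [1.0 for _ in x]


# Same sampler, different step budgets: cost scales linearly with NFE.
x_1, nfe_1 = euler_sample(toy_velocity, [0.0, 0.0], num_steps=1)
x_100, nfe_100 = euler_sample(toy_velocity, [0.0, 0.0], num_steps=100)
cost_ratio = nfe_100 // nfe_1  # ratio of model calls between the two runs
```

Each Euler step costs one model call, so going from 100 steps to 1 step is where the 100× figure quoted above comes from. In this toy case the two runs land on the same point only because the velocity is constant; the abstract's claim is that TwinFlow training lets the real model stay close to its multi-step quality at 1-NFE.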

### Alternatives you considered

_No response_

### Additional context

_No response_

[Feature] TwinFlow: Qwen Image and Z-Image-Turbo #1153