
Conversation

@Beinsezii
Contributor

What does this PR do?

Fixes a small performance regression for Z Image Turbo.

Basically, it just sets attn_mask to None when the mask would otherwise be all ones, which is always the case for Z Image Turbo in typical usage, where guidance_scale == 1.

On an H100 this improves performance by about 4% when using AttentionBackendName._NATIVE_CUDNN.
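The idea can be sketched roughly as follows. This is a hypothetical helper, not the actual diffusers code: an attention mask that is all ones masks nothing, so it carries no information, and passing None instead lets the attention backend take its faster unmasked path.

```python
import torch


def maybe_drop_attn_mask(attn_mask):
    """Return None when the mask is trivially all ones (nothing masked).

    Hypothetical illustration of the optimization described in this PR:
    an all-ones mask is a no-op, and many fused attention kernels (e.g.
    cuDNN's) are faster when no mask tensor is supplied at all.
    """
    if attn_mask is not None and bool(torch.all(attn_mask != 0)):
        return None
    return attn_mask
```

For example, `maybe_drop_attn_mask(torch.ones(2, 8))` returns None, while a mask containing any zero is passed through unchanged.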

Who can review?

@yiyixuxu or @sayakpaul probably

@sayakpaul
Member

Cc: @JerryWu-code who contributed the model.
