-
How can I display each generation step like in the InvokeAI UI? I've found one solution, but it seems to be old, because `callback` and `callback_steps` are deprecated. That code was working anyway, but it hung up Google Colab. Here is the code I'm talking about:

```python
def callback(iter, t, latents):
    ...
```

But I've found another snippet that doesn't hang Google Colab; the problem is that the generation preview is just ugly noise, not what I want. Can I make something like InvokeAI?
-
I haven't looked at how InvokeAI does it, but it's not that hard to do. First of all, you can't just convert the latents straight into an image; for starters, you'll need to use the `callback_on_step_end` argument:

```python
def decode_tensors(pipe, step, timestep, callback_kwargs):
    latents = callback_kwargs["latents"]
    # convert the latents to an image here (see the two methods below)
    return callback_kwargs

image = pipe(
    height=image_height,
    width=image_width,
    prompt=prompt,
    negative_prompt="",
    guidance_scale=7.5,
    num_inference_steps=20,
    generator=generator,
    callback_on_step_end=decode_tensors,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```

Then, to convert the latents to images: what you're doing is one method, but it's very resource intensive. I don't know exactly what you're doing in your code since you use a … So what you're doing is not practical, but I know two other methods to achieve what you're looking for:

1.- You can use the function in this blog post: https://huggingface.co/blog/TimothyAlexisVass/explaining-the-sdxl-latent-space. The author also explains the process and what each dimension and channel means in the latents, but if you just want the function, it's this one:

```python
import torch
from PIL import Image

def latents_to_rgb(latents):
    weights = (
        (60, -60, 25, -70),
        (60, -5, 15, -50),
        (60, 10, -5, -35),
    )

    weights_tensor = torch.t(torch.tensor(weights, dtype=latents.dtype).to(latents.device))
    biases_tensor = torch.tensor((150, 140, 130), dtype=latents.dtype).to(latents.device)
    rgb_tensor = torch.einsum("...lxy,lr -> ...rxy", latents, weights_tensor) + biases_tensor.unsqueeze(-1).unsqueeze(-1)
    image_array = rgb_tensor.clamp(0, 255)[0].byte().cpu().numpy()
    image_array = image_array.transpose(1, 2, 0)  # CHW -> HWC

    return Image.fromarray(image_array)
```

The only con of this method is that the latent space is a compressed space of 128x128, so these images are 128x128 as well: too small for decent previews in big windows, but good and fast for small previews.

2.- Use TAESD, which is a really small and fast autoencoder. The con of this method is that you'll need to download the weights and know how to load the model, but once you have that, it's as easy as this:

```python
with torch.no_grad():
    decoded = taesd(latents.float()).clamp(0, 1).mul_(255).round().byte()

image = Image.fromarray(decoded[0].permute(1, 2, 0).cpu().numpy())
```

The image would be the same as the final generation, just with lower quality in the details, but it's good enough for previews. I remember I saw somewhere they added TAESD to diffusers, but I really haven't used it that way. This would be a comparison:

RGB: (128x128 linear preview image)
TAESD: (decoded preview image)
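If you want to try the diffusers route mentioned above, loading TAESD there looks roughly like this. This is a minimal sketch, not from the original post: it assumes an SDXL pipeline, uses `madebyollin/taesdxl` as the Hub checkpoint (`madebyollin/taesd` would be the SD 1.x/2.x equivalent), and relies on `AutoencoderTiny.decode` returning images in diffusers' usual [-1, 1] range:

```python
import torch
from PIL import Image
from diffusers import AutoencoderTiny

# Sketch: load TAESD through diffusers' AutoencoderTiny.
# "madebyollin/taesdxl" is assumed here for SDXL latents.
taesd = AutoencoderTiny.from_pretrained(
    "madebyollin/taesdxl", torch_dtype=torch.float16
).to("cuda")

@torch.no_grad()
def taesd_preview(latents):
    # decode() follows the usual diffusers convention of [-1, 1] images
    decoded = taesd.decode(latents).sample
    decoded = (decoded / 2 + 0.5).clamp(0, 1).mul(255).round().byte()
    return Image.fromarray(decoded[0].permute(1, 2, 0).cpu().numpy())
```

You could also assign this model to `pipe.vae` to speed up the final decode as well, which is how the diffusers docs typically use it.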
-
@asomoza thanks so much. I am going to use this myself as well. @yiyixuxu @stevhliu what do you think about adding a guide to our docs crediting @asomoza and https://huggingface.co/blog/TimothyAlexisVass/explaining-the-sdxl-latent-space?
-
yeah I would use this too! I want to collect more of these kinds of examples from the community (about how to use our callback loops to implement cool features like this one!) - I think the rest of us can benefit greatly from a "gallery" like this. I wonder what would be the best platform for this? Should we create a folder in the community folder, or maybe a dedicated section in the docs?
-
I agree that …
-
@sayakpaul My comment is about finding a "platform" for the community to share more short code snippets like this one, using the callback argument to implement cool features.
-
Hey @asomoza, great work with all the tips and tricks you've been supplying in the discussions! We would love to collaborate more with you over Slack to add these to the docs so everyone benefits from them. Is there an email we can invite you with? 🙂
For the first method, you just need to put the code I gave you together:
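Something along these lines — a minimal sketch combining the snippets above, assuming SDXL (the RGB weights are derived from the SDXL latent space); the model id and prompt are placeholders, and saving each preview to disk stands in for however you actually want to display it:

```python
import torch
from PIL import Image
from diffusers import AutoPipelineForText2Image

def latents_to_rgb(latents):
    # linear approximation of the SDXL VAE, from the blog post above
    weights = (
        (60, -60, 25, -70),
        (60, -5, 15, -50),
        (60, 10, -5, -35),
    )
    weights_tensor = torch.t(torch.tensor(weights, dtype=latents.dtype).to(latents.device))
    biases_tensor = torch.tensor((150, 140, 130), dtype=latents.dtype).to(latents.device)
    rgb_tensor = torch.einsum("...lxy,lr -> ...rxy", latents, weights_tensor) + biases_tensor.unsqueeze(-1).unsqueeze(-1)
    image_array = rgb_tensor.clamp(0, 255)[0].byte().cpu().numpy()
    return Image.fromarray(image_array.transpose(1, 2, 0))  # CHW -> HWC

def decode_tensors(pipe, step, timestep, callback_kwargs):
    latents = callback_kwargs["latents"]
    # save a small preview every step (a UI would display it instead)
    latents_to_rgb(latents).save(f"step_{step}.png")
    return callback_kwargs

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a photo of an astronaut riding a horse",
    num_inference_steps=20,
    callback_on_step_end=decode_tensors,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```

The previews will be 128x128 as mentioned above; for bigger windows you could simply resize them, or switch the callback body to the TAESD variant.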