
Release v0.2.9


@johannaSommer johannaSommer released this 13 Aug 14:35
· 140 commits to main since this release
d37c2d3

The juiciest bits 🚀

feat: add flash_attention 3 kernel for diffusers pipelines by @johannaSommer in #287

We've added Flash Attention 3 to our new algorithm group "kernels". With the help of Hugging Face's Kernel Hub and Pruna, you can now use Flash Attention 3 with any diffusers pipeline. Speed-ups will vary based on the pipeline you are smashing, but we recommend it especially for video generation pipelines like Wan!
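A minimal sketch of what this could look like. The `"kernel"` config key, the `"flash_attn3"` value, and the model checkpoint are assumptions for illustration, not the confirmed API; check the Pruna docs for the exact names in the kernels algorithm group. Requires `pruna`, `diffusers`, and a CUDA GPU:

```python
# Hypothetical sketch: config key "kernel" and value "flash_attn3" are assumed.
import torch
from diffusers import WanPipeline
from pruna import SmashConfig, smash

# Load a Wan video generation pipeline (checkpoint name is illustrative).
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)

# Enable the Flash Attention 3 kernel via the new "kernels" algorithm group.
smash_config = SmashConfig()
smash_config["kernel"] = "flash_attn3"  # assumed key/value, see Pruna docs

# smash() returns an optimized pipeline that is used like the original one.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
video = smashed_pipe(prompt="a cat surfing a wave", num_frames=33).frames[0]
```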

feat: enhance model checks for transformers pipelines by @davidberenstein1957 in #281

We extended multiple algorithms to support smashing a transformers pipeline directly, without extracting the underlying model first: simply give the pipeline to smash() and we will do the rest.
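A sketch of the new flow, under assumptions: the quantizer name and model checkpoint are illustrative examples, not the only supported options. Requires `pruna` and `transformers`:

```python
# Hypothetical sketch: pass the transformers pipeline straight to smash().
from transformers import pipeline
from pruna import SmashConfig, smash

# Build a regular transformers pipeline (checkpoint name is illustrative).
pipe = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

# Pick any supported algorithm; "hqq" is just an example here.
smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"

# Previously you had to hand over pipe.model yourself;
# now the pipeline itself is accepted and unwrapped internally.
smashed = smash(model=pipe, smash_config=smash_config)
print(smashed("Hello!")[0]["generated_text"])
```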

replace os.path with pathlib.Path by @GreatBahram in #260

@GreatBahram finally helped us migrate to pathlib, and the code is looking cleaner than ever! 💅🏻
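For readers unfamiliar with the migration, a small before/after sketch of the kind of change this involves (paths here are made up for illustration):

```python
# Before: string manipulation via os.path
#   path = os.path.join(base, "models", "config.json")
#   ext = os.path.splitext(path)[1]
#   name = os.path.basename(path)
# After: object-oriented paths via pathlib
from pathlib import Path

base = Path("/tmp/pruna")
config = base / "models" / "config.json"  # "/" replaces os.path.join

ext = config.suffix    # ".json", replaces os.path.splitext(...)[1]
name = config.name     # "config.json", replaces os.path.basename(...)
```

Beyond brevity, `Path` objects carry their own methods (`exists()`, `read_text()`, `glob()`), which removes most of the string plumbing that `os.path` code accumulates.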

Pruning some bugs 🐞 and maintenance 🧑‍🌾

Full Changelog: v0.2.8...v0.2.9