Release v0.2.9
The juiciest bits 🚀
feat: add flash_attention 3 kernel for diffusers pipelines by @johannaSommer in #287
We've added flash attention 3 to our new algorithm group "kernels". With the help of Hugging Face's Kernel Hub and Pruna, you can now use flash attention 3 with any diffusers pipeline. Speedups will vary based on the pipeline you are smashing, but we especially recommend it for video generation pipelines like Wan!
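A minimal sketch of what this could look like, assuming the usual `SmashConfig`/`smash()` flow; the `"kernel"` config key and the `"flash_attn3"` value are illustrative placeholders for the new "kernels" algorithm group, not confirmed option names:

```python
import torch
from diffusers import DiffusionPipeline
from pruna import SmashConfig, smash

# Load any diffusers pipeline (video pipelines like Wan benefit the most).
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Enable the flash attention 3 kernel via the new "kernels" algorithm group.
smash_config = SmashConfig()
smash_config["kernel"] = "flash_attn3"  # hypothetical key/value, for illustration only

# The smashed pipeline is used like the original diffusers pipeline.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
images = smashed_pipe("a red panda riding a bicycle").images
```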
feat: enhance model checks for transformers pipelines by @davidberenstein1957 in #281
We extended multiple algorithms to support smashing a transformers pipeline directly, without first extracting the underlying model: simply give the pipeline to smash() and we will do the rest.
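For example, something along these lines should work; the `"quantizer"`/`"hqq"` choice is just an illustrative algorithm pick, not specific to this release:

```python
from transformers import pipeline
from pruna import SmashConfig, smash

# Build a regular transformers pipeline -- no need to pull out pipe.model yourself.
pipe = pipeline("text-generation", model="gpt2")

smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"  # illustrative choice; any supported algorithm works

# smash() now handles extracting and re-wrapping the underlying model for you.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
print(smashed_pipe("Pruna makes models", max_new_tokens=20))
```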
replace os.path with pathlib.Path by @GreatBahram in #260
@GreatBahram finally helped us move from os.path to pathlib, and the code is looking cleaner than ever! 💅🏻
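To give a feel for the kind of change, here is an illustrative before/after (not an excerpt from the pruna codebase; the path is a placeholder):

```python
from pathlib import Path

cache_dir = "/tmp/pruna_cache"  # placeholder path for the example

# Before: os.path.exists(os.path.join(cache_dir, "model", "config.json"))
config_path = Path(cache_dir) / "model" / "config.json"
if config_path.exists():
    config_text = config_path.read_text()
```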
Pruning some bugs 🐞 and maintenance 🧑🌾
- test: connect inference/eval tests to algorithms by @begumcig in #181
- tests: update fixtures for algorithm and evaluation tests by @johannaSommer in #288
- fix: device placement with indexed devices by @davidberenstein1957 in #205
- feat: 277 feature update modelcard to include a snippet and base model by @davidberenstein1957 in #282
- tests: update audio datasets, add sdxl as example model by @johannaSommer in #293
- fix: failures from device indexing and evaluation testing by @johannaSommer in #300
Full Changelog: v0.2.8...v0.2.9