Conversation
Review comment on the `scenarios` list under `if __name__ == "__main__":` in the Flux benchmarking script:
Covered the following scenarios (an illustrative sketch of such entries follows the list):
- Regular BF16 with compilation
- NF4
- Layerwise upcasting
- Group offloading
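
A hedged sketch of what such entries could look like in the script's `scenarios` list (the field names are illustrative, not the suite's actual schema):

```python
# Illustrative scenario entries only; the real suite's fields may differ.
scenarios = [
    {"name": "bf16_compile", "torch_dtype": "bfloat16", "compile": True},  # regular BF16 + torch.compile
    {"name": "nf4", "quantization": "bitsandbytes_nf4"},                   # 4-bit NF4 weights
    {"name": "layerwise_upcasting", "storage_dtype": "float8_e4m3fn"},     # low-precision storage, BF16 compute
    {"name": "group_offload", "offload_type": "block_level"},              # group offloading to CPU
]
```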
Added SDXL, Wan (14B), and LTX (13B) on top of Flux: Results
Cc: @a-r-r-o-w if you want to add some caching benchmarks (in a later PR), I think that would be really great!

@DN6 this is ready for a review. This is how the final CSV for this stage looks: I have confirmed in this run that it works as expected:

@DN6 a gentle ping.
Review comment on `.github/workflows/benchmark.yml` (schedule trigger):

```yaml
workflow_dispatch:
schedule:
  - cron: "30 1 1,15 * *" # every 2 weeks, on the 1st and the 15th of every month at 1:30 AM
  - cron: "0 17 * * 1" # every Monday at 5 PM
```
Not a blocker. But why run every week? Is a monthly benchmark not sufficient?
True. Changing to bi-weekly.
@anijain2305 just a ping to let you know that we're merging this PR, which will run the benchmarking suite bi-weekly and report the results here: https://huggingface.co/datasets/diffusers/benchmarks/blob/main/collated_results.csv
Thanks for setting this up. This will be really helpful for tracking progress and identifying regressions.
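
For anyone who wants to inspect the collated results locally, a minimal sketch (assuming the CSV stays at the dataset path linked above, and that `pandas` and `huggingface_hub` are installed):

```python
# Minimal sketch: download the collated benchmark CSV from the Hub and inspect it.
# Assumes the file stays at the dataset path linked above.
import pandas as pd
from huggingface_hub import hf_hub_download

csv_path = hf_hub_download(
    repo_id="diffusers/benchmarks",
    filename="collated_results.csv",
    repo_type="dataset",
)
df = pd.read_csv(csv_path)
print(df.head())
```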
Since everything is passing now, will merge this PR :)
Also cc @a-r-r-o-w for #11565 (comment) (not urgent, when you get time).
Squashed commits:

* start overhauling the benchmarking suite.
* fixes
* fixes
* checking.
* checking
* fixes.
* error handling and logging.
* add flops and params.
* add more models.
* utility to fire execution of all benchmarking scripts.
* utility to push to the hub.
* push utility improvement
* seems to be working.
* okay
* add torchprofile dep.
* remove total gpu memory
* fixes
* fix
* need a big gpu
* better
* what's happening.
* okay
* separate requirements and make it nightly.
* add db population script.
* update secret name
* update secret.
* population db update
* disable db population for now.
* change to every monday
* Update .github/workflows/benchmark.yml (Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>)
* quality improvements.
* reparate hub upload step.
* repository
* remove csv
* check
* update
* update
* threading.
* update
* update
* updaye
* update
* update
* update
* remove peft dep
* upgrade runner.
* fix
* fixes
* fix merging csvs.
* push dataset to the Space repo for analysis.
* warm up.
* add a readme
* Apply suggestions from code review (Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>)
* address feedback
* Apply suggestions from code review
* disable db workflow.
* update to bi weekly.
* enable population
* enable
* updaye
* update
* metadata
* fix

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
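
In the spirit of the "add flops and params" and "add torchprofile dep" commits above, a hedged sketch of how per-model parameter and FLOP counts can be collected (the helper is illustrative, not the suite's actual utility):

```python
# Illustrative helper, not the suite's actual utility: count parameters and
# estimate FLOPs for a model given a tuple of dummy positional inputs.
import torch
from torchprofile import profile_macs

def count_params_and_flops(model: torch.nn.Module, dummy_args: tuple):
    num_params = sum(p.numel() for p in model.parameters())
    with torch.no_grad():
        macs = profile_macs(model, dummy_args)  # multiply-accumulate operations
    return num_params, 2 * macs  # rough FLOPs: ~2 per MAC
```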
What does this PR do?
This PR considerably simplifies how we do benchmarks. Instead of running entire pipeline-level benchmarks across different tasks, we will now ONLY benchmark the diffusion network, which is the most compute-intensive part of a standard diffusion workflow.
To make the estimates more realistic, we will make use of pre-trained checkpoints and dummy inputs with reasonable dimensionalities.
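
As a rough illustration of the approach (not the suite's actual code), a minimal sketch that loads only the denoiser of a pretrained pipeline and times one forward pass; the dummy-input helper is a placeholder:

```python
# Rough sketch: benchmark only the diffusion network with dummy inputs.
# `get_dummy_inputs` is a placeholder for whatever builds batch-size-1 tensors
# with realistic shapes for this model.
import torch
from diffusers import FluxTransformer2DModel

model = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer", torch_dtype=torch.bfloat16
).to("cuda")

dummy_inputs = get_dummy_inputs(batch_size=1, device="cuda", dtype=torch.bfloat16)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
torch.cuda.synchronize()
start.record()
with torch.no_grad():
    model(**dummy_inputs)
end.record()
torch.cuda.synchronize()
print(f"forward latency: {start.elapsed_time(end):.2f} ms")
```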
I ran `benchmarking_flux.py` on an 80GB A100 with a batch size of 1 and got the following results:

Analyze the results in this Space: https://huggingface.co/spaces/diffusers/benchmark-analyzer
By default, all benchmarks use a batch size of 1 and eliminate classifier-free guidance (CFG).
How to add your benchmark?
Adding benchmarks for a new model class (`SanaTransformer2DModel`, for example) boils down to the same handful of steps that `benchmarking_flux.py` follows; an illustrative sketch is included below. More modularization can be shipped afterward.

The idea would be to merge this PR with pre-configured benchmarks for a few popular models and open the others to the community.
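
For illustration, a hypothetical `benchmarking_sana.py` following the same pattern could look like the sketch below; the checkpoint id and the `run_benchmark` helper are assumptions, not the suite's actual API:

```python
# Hypothetical benchmarking_sana.py mirroring benchmarking_flux.py.
# The checkpoint id and the run_benchmark helper are assumptions.
import torch
from diffusers import SanaTransformer2DModel

CKPT_ID = "Efficient-Large-Model/Sana_1600M_1024px_diffusers"  # assumed checkpoint

def load_model(dtype=torch.bfloat16):
    return SanaTransformer2DModel.from_pretrained(
        CKPT_ID, subfolder="transformer", torch_dtype=dtype
    ).to("cuda")

if __name__ == "__main__":
    scenarios = [
        {"name": "bf16", "compile": False},
        {"name": "bf16_compile", "compile": True},
    ]
    for scenario in scenarios:
        model = load_model()
        if scenario["compile"]:
            model = torch.compile(model)
        run_benchmark(model, scenario)  # illustrative: times forwards, records memory, appends a CSV row
```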
TODOs
Utilities:
@DN6 could you give the approach a quick look? I can then work on resolving the TODOs.