[benchmarks] overhaul benchmarks by sayakpaul · Pull Request #11565 · huggingface/diffusers
## What does this PR do?
This PR considerably simplifies how we do benchmarks. Instead of running entire pipeline-level benchmarks across different tasks, we will now ONLY benchmark the diffusion network, which is the most compute-intensive part of a standard diffusion workflow.
To make the estimates more realistic, we use pre-trained checkpoints and dummy inputs with reasonable dimensions.
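As a rough illustration of what "benchmarking just the denoiser" means, here is a minimal sketch: a generic CUDA-event timing helper applied to a stand-in module. The `nn.Linear` stand-in, shapes, and helper name are placeholders (not this PR's actual harness); a real benchmark would load the pretrained transformer via `from_pretrained` and feed it model-appropriate dummy inputs.

```python
# Minimal sketch (not the PR's actual harness): time a denoiser's forward
# pass in isolation with CUDA events. The nn.Linear below is a stand-in
# for a real pretrained diffusion transformer.
import torch


@torch.no_grad()
def benchmark_forward(model, inputs, warmup=3, iters=10):
    """Return the mean forward latency in milliseconds."""
    for _ in range(warmup):
        model(**inputs)
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        model(**inputs)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters


if __name__ == "__main__":
    device, dtype = "cuda", torch.bfloat16
    # Stand-in model; a real run would call something like
    # SomeTransformer2DModel.from_pretrained(...) instead.
    model = torch.nn.Linear(4096, 4096).to(device, dtype)
    dummy = {"input": torch.randn(1, 4096, device=device, dtype=dtype)}
    print(f"{benchmark_forward(model, dummy):.2f} ms/forward")
```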
I ran `benchmarking_flux.py` on an 80GB A100 with a batch size of 1. You can analyze the results in this Space: https://huggingface.co/spaces/diffusers/benchmark-analyzer
By default, all benchmarks use a batch size of 1 and skip classifier-free guidance (with CFG, each denoising step would run both a conditional and an unconditional forward pass, doubling the effective batch size).
## How to add your benchmark?
Adding benchmarks for a new model class (`SanaTransformer2DModel`, for example) boils down to the following:
- Define the dummy inputs of the model.
- Define the scenarios under which the benchmark should run.
This is what `benchmarking_flux.py` does (see the sketch below). More modularization can be shipped afterward.
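To make the shape of such a contribution concrete, here is a hypothetical sketch of those two steps. None of the names below (`BenchmarkScenario`, `get_dummy_inputs`, the input keys and shapes) come from this PR; they are illustrative assumptions.

```python
# Hypothetical sketch of the two steps above; names, shapes, and the
# scenario fields are illustrative assumptions, not the PR's actual API.
from dataclasses import dataclass

import torch


@dataclass
class BenchmarkScenario:
    name: str                       # e.g. "bf16" or "bf16+compile"
    dtype: torch.dtype = torch.bfloat16
    compile_model: bool = False     # whether to wrap with torch.compile


def get_dummy_inputs(device="cuda", dtype=torch.bfloat16):
    # Placeholder shapes; a real benchmark matches the model's expected
    # latent and conditioning dimensions.
    return {
        "hidden_states": torch.randn(1, 4096, 64, device=device, dtype=dtype),
        "timestep": torch.tensor([1.0], device=device, dtype=dtype),
    }


SCENARIOS = [
    BenchmarkScenario("bf16"),
    BenchmarkScenario("bf16+compile", compile_model=True),
]
```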
The idea is to merge this PR with pre-configured benchmarks for a few popular models and open the rest to the community.
## TODOs
Utilities:
- A runner that executes the individual model-level benchmarks sequentially.
- A utility to combine the CSVs produced for the different model classes (a sketch of both utilities follows this list).
- Updating the central results dataset and sending Slack notifications.
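For the first two utilities, something like the following could work; the script list, file-name pattern, and output path are assumptions, not the final design.

```python
# Hedged sketch of the sequential runner and CSV-combining utilities;
# script names, glob pattern, and output path are assumptions.
import glob
import subprocess

import pandas as pd


def run_benchmarks_sequentially(scripts):
    # Fire each model-level benchmark one after another, failing fast
    # if any script errors out.
    for script in scripts:
        subprocess.run(["python", script], check=True)


def combine_benchmark_csvs(pattern="benchmark_*.csv", out="all_benchmarks.csv"):
    # Concatenate the per-model CSVs into a single file for analysis.
    frames = [pd.read_csv(path) for path in glob.glob(pattern)]
    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv(out, index=False)
    return combined


if __name__ == "__main__":
    run_benchmarks_sequentially(["benchmarking_flux.py"])
    combine_benchmark_csvs()
```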
@DN6 could you give the approach a quick look? I can then work on resolving the TODOs.
