[benchmarks] overhaul benchmarks by sayakpaul · Pull Request #11565 · huggingface/diffusers

What does this PR do?

This PR considerably simplifies how we do benchmarks. Instead of running entire pipeline-level benchmarks across different tasks, we will now benchmark ONLY the diffusion network, which is the most compute-intensive part of a standard diffusion workflow.

To make the estimates more realistic, we will use pre-trained checkpoints and dummy inputs with reasonable dimensionalities.
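As a rough sketch of the idea, the snippet below times only a model's forward pass on dummy inputs with `torch.utils.benchmark`. A tiny `nn.Linear` stands in for a real pre-trained denoiser (e.g. `FluxTransformer2DModel`); the helper name and structure are illustrative, not this PR's actual code.

```python
# Minimal sketch: benchmark only the network's forward pass on dummy
# inputs, instead of running a full pipeline end to end.
import torch
import torch.utils.benchmark as benchmark


def benchmark_forward(model, inputs, num_runs=10):
    """Median forward-pass latency in seconds over `num_runs` runs."""
    timer = benchmark.Timer(
        stmt="model(**inputs)",
        globals={"model": model, "inputs": inputs},
        num_threads=torch.get_num_threads(),
    )
    return timer.timeit(num_runs).median


# Stand-in for a real pre-trained diffusion network.
model = torch.nn.Linear(64, 64)
# Dummy inputs with reasonable dimensionalities; batch size 1, no CFG.
dummy_inputs = {"input": torch.randn(1, 64)}

with torch.no_grad():
    latency = benchmark_forward(model, dummy_inputs)
print(f"median latency: {latency:.6f} s")
```

Swapping in a real checkpoint and realistic latent/text-embedding shapes gives the numbers reported below.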

I ran benchmarking_flux.py on an 80GB A100 with a batch size of 1 and got the following results:


Analyze the results in this Space: https://huggingface.co/spaces/diffusers/benchmark-analyzer

By default, all benchmarks will use a batch size of 1, eliminating classifier-free guidance (CFG).

How to add a benchmark

Adding benchmarks for a new model class (SanaTransformer2DModel, for example) boils down to the following:

  1. Define the dummy inputs of the model.
  2. Define the benchmarking scenarios to run the model under.

This is what benchmarking_flux.py does. More modularization can be shipped afterward.
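The two steps above could look something like the following. This is a hypothetical sketch of the scenario structure, not the actual API in benchmarking_flux.py; the `BenchmarkScenario` class, `run_scenarios` helper, and toy model are all invented here for illustration.

```python
# Hypothetical sketch: (1) define dummy inputs, (2) define scenarios.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class BenchmarkScenario:
    name: str
    model_cls: Callable[..., Any]                    # e.g. SanaTransformer2DModel
    get_dummy_inputs: Callable[[], Dict[str, Any]]   # step 1
    model_kwargs: Dict[str, Any] = field(default_factory=dict)


def run_scenarios(scenarios: List[BenchmarkScenario]) -> Dict[str, Any]:
    results = {}
    for s in scenarios:
        model = s.model_cls(**s.model_kwargs)
        inputs = s.get_dummy_inputs()
        # In a real benchmark this call would be wrapped in a timer.
        results[s.name] = model(**inputs)
    return results


# Toy stand-in so the sketch is self-contained and runnable:
class ToyModel:
    def __call__(self, x):
        return x * 2


scenarios = [
    BenchmarkScenario(                # step 2: one scenario per setting
        name="toy-default",
        model_cls=ToyModel,
        get_dummy_inputs=lambda: {"x": 3},
    ),
]
print(run_scenarios(scenarios))  # {'toy-default': 6}
```

Additional scenarios (compiled, quantized, offloaded, etc.) would just be more entries in the list, which is what keeps adding a new model class cheap.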

The idea is to merge this PR with pre-configured benchmarks for a few popular models and leave the rest open to the community.

TODOs

Utilities:

@DN6 could you give the approach a quick look? I can then work on resolving the TODOs.