Philosophy

🧨 Diffusers provides state-of-the-art pretrained diffusion models across multiple modalities. Its purpose is to serve as a modular toolbox for both inference and training.

We aim to build a library that stands the test of time and therefore take API design very seriously.

In a nutshell, Diffusers is built to be a natural extension of PyTorch. Therefore, most of our design choices are based on PyTorch’s Design Principles. Let’s go over the most important ones:

Usability over Performance

- While Diffusers has many built-in performance-enhancing features, models are always loaded with the highest precision and lowest optimization by default: diffusion pipelines are instantiated on CPU with float32 precision unless the user specifies otherwise. This ensures usability across different platforms and accelerators and means that no complex installations are required to run the library.
- Diffusers aims to be a lightweight package and therefore has very few required dependencies, but many soft dependencies that can improve performance (such as accelerate or safetensors).
- Diffusers prefers simple, self-explanatory code over condensed, magic code, so shorthand syntax such as lambda functions and advanced PyTorch operators is often avoided.
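A minimal sketch of the first point (the checkpoint name is only an example):

```python
import torch
from diffusers import DiffusionPipeline

# Default: the pipeline is loaded on CPU in float32, the most portable,
# highest-precision setting.
pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# Performance is an explicit opt-in, e.g. half precision on a GPU.
pipeline = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
```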

Simple over easy

As PyTorch states, explicit is better than implicit and simple is better than complex. This design philosophy is reflected in multiple parts of the library:

- We follow PyTorch's API with methods like DiffusionPipeline.to to let the user handle device management.
- Raising concise error messages is preferred to silently correcting erroneous input. Diffusers aims at teaching the user rather than making the library as easy to use as possible.
- Complex model vs. scheduler logic is exposed instead of magically handled inside. Schedulers/samplers are separated from diffusion models with minimal dependencies on each other. This forces the user to write the unrolled denoising loop (see the sketch after this list), but the separation also allows for easier debugging and gives the user more control over adapting the denoising process or switching out diffusion models or schedulers.
- Separately trained components of the diffusion pipeline, e.g. the text encoder, the UNet, and the variational autoencoder, each have their own model class. This forces the user to handle the interaction between the different model components, but it allows for easier debugging and customization. DreamBooth or Textual Inversion training is very simple thanks to Diffusers' ability to separate single components of the diffusion pipeline.
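For instance, here is what the exposed model/scheduler split looks like in a minimal unconditional denoising loop (the checkpoint name is only an example):

```python
import torch
from diffusers import DDPMScheduler, UNet2DModel

# Model and scheduler are loaded as separate, independent components.
model = UNet2DModel.from_pretrained("google/ddpm-cat-256")
scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")

scheduler.set_timesteps(50)
sample_size = model.config.sample_size
sample = torch.randn(1, model.config.in_channels, sample_size, sample_size)

# The user writes the denoising loop explicitly instead of having it hidden.
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = model(sample, t).sample
    # step() turns the current sample x_t into the less noisy x_t-1
    sample = scheduler.step(noise_pred, t, sample).prev_sample
```

Swapping in a different scheduler algorithm only changes the scheduler lines; the loop itself stays the same.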

Tweakable, contributor-friendly over abstraction

For large parts of the library, Diffusers adopts an important design principle of the Transformers library, which is to prefer copy-pasted code over hasty abstractions. This design principle is very opinionated and stands in stark contrast to popular design principles such as Don’t repeat yourself (DRY). In short, just like Transformers does for modeling files, Diffusers prefers to keep an extremely low level of abstraction and very self-contained code for pipelines and schedulers. Functions, long code blocks, and even classes can be copied across multiple files, which at first can look like a bad, sloppy design choice that makes the library unmaintainable. However, this design has proven to be extremely successful for Transformers and makes a lot of sense for community-driven, open-source machine learning libraries because:

- Machine learning is an extremely fast-moving field in which paradigms, model architectures, and algorithms change rapidly, which makes it very difficult to define long-lasting code abstractions.
- Machine learning practitioners like to be able to quickly tweak existing code for ideation and research, and therefore prefer self-contained code over code that contains many abstractions.
- Open-source libraries rely on community contributions, so one must build a library that is easy to contribute to. The more abstract the code, the more dependencies it has, the harder it is to read, and the harder it is to contribute to. Contributors simply stop contributing to very abstract libraries out of fear of breaking vital functionality. If contributing to a library cannot break other fundamental code, it is not only more inviting for potential new contributors but also easier to review and to contribute to multiple parts in parallel.

At Hugging Face, we call this design the single-file policy which means that almost all of the code of a certain class should be written in a single, self-contained file. To read more about the philosophy, you can have a look at this blog post.
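In practice, the duplication is kept in sync mechanically rather than abstracted away: copied functions carry a # Copied from marker pointing at the canonical implementation, and a repository check (run via make fix-copies) verifies that the copies have not drifted. A sketch of the convention (the target path and signature here are illustrative):

```python
# Inside a hypothetical pipeline file: the method is deliberately duplicated,
# and the marker below lets the repo check keep it identical to the original.

# Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline.prepare_latents
def prepare_latents(self, batch_size, num_channels_latents, height, width, dtype, device, generator, latents=None):
    ...
```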

In Diffusers, we follow this philosophy for both pipelines and schedulers, but only partly for diffusion models. The reason we don’t follow this design fully for diffusion models is that almost all diffusion pipelines, such as DDPM, Stable Diffusion, unCLIP (DALL·E 2), and Imagen, rely on the same diffusion model, the UNet.
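That sharing is visible in the API: a pipeline’s UNet can be loaded on its own and reused (the checkpoint name is only an example):

```python
from diffusers import UNet2DConditionModel

# The same UNet2DConditionModel class (and here, the same weights) backs
# text-to-image, image-to-image, and inpainting pipelines alike.
unet = UNet2DConditionModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet"
)
```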

Great, now you should have generally understood why 🧨 Diffusers is designed the way it is 🤗. We try to apply these design principles consistently across the library. Nevertheless, there are some minor exceptions to the philosophy or some unlucky design choices. If you have feedback regarding the design, we would ❤️ to hear it directly on GitHub.

Design Philosophy in Detail

Now, let’s look a bit into the nitty-gritty details of the design philosophy. Diffusers essentially consists of three major classes: pipelines, models, and schedulers. Let’s walk through the design decisions for each class in more detail.

Pipelines

Pipelines are designed to be easy to use (therefore do not follow Simple over easy 100%), are not feature complete, and should loosely be seen as examples of how to use models and schedulers for inference.

The following design principles are followed:

- Pipelines follow the single-file policy. All pipelines can be found in src/diffusers/pipelines under individual directories, where one pipeline folder corresponds to one diffusion paper/project/release. Multiple pipeline files can be gathered in one pipeline folder, as is done for Stable Diffusion. If pipelines share similar functionality, the #Copied from mechanism can be used.
- Pipelines all inherit from DiffusionPipeline.
- Every pipeline consists of different model and scheduler components that are documented in the model_index.json file, are accessible under the same name as attributes of the pipeline, and can be shared between pipelines via DiffusionPipeline.components.
- Every pipeline should be loadable via the DiffusionPipeline.from_pretrained function (see the sketch after this list).
- Pipelines should be used only for inference.
- Pipelines should be very readable, self-explanatory, and easy to tweak.
- Pipelines should be designed to build on top of each other and be easy to integrate into higher-level APIs.
- Pipelines are not intended to be feature-complete user interfaces.
- Every pipeline should have one and only one way to run it, via a __call__ method whose argument names are shared across all pipelines.
- Pipelines should be named after the task they are intended to solve.
- In almost all cases, novel diffusion pipelines shall be implemented in a new pipeline folder/file.
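Concretely, typical pipeline usage looks like this (the checkpoint name is only an example):

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
pipeline.to("cuda")

# Components are attributes named as documented in model_index.json.
print(pipeline.scheduler)

# One and only one way to run a pipeline: its __call__ method.
image = pipeline("An astronaut riding a horse on Mars").images[0]
```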

Models

Models are designed as configurable toolboxes that are natural extensions of PyTorch’s Module class. They only partly follow the single-file policy.

The following design principles are followed:

- Models correspond to a type of model architecture, e.g. the UNet2DConditionModel class is used for all UNet variations that expect 2D image inputs and are conditioned on some context.
- All models can be found in src/diffusers/models, and every model architecture shall be defined in its own file.
- Models do not follow the single-file policy and make use of smaller model building blocks, such as attention, ResNet, and embedding layers. Note that this is in stark contrast to Transformers’ modeling files.
- Models intend to expose complexity, just like PyTorch’s Module class, and give clear error messages.
- Models all inherit from ModelMixin and ConfigMixin (see the sketch after this list).
- Models can be optimized for performance when it doesn’t demand major code changes, keeps backward compatibility, and gives a significant memory or compute gain.
- Models should by default have the highest precision and lowest performance setting.
- To integrate new model checkpoints whose general architecture already exists in Diffusers, the existing architecture shall be adapted rather than a new file created; a new file is only warranted when the model architecture is fundamentally different.
- Models should be designed to be easily extendable to future changes, e.g. by limiting public function and configuration arguments and by preferring string "...type" arguments, which can be extended to new types, over boolean is_..._type arguments.
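A sketch of what "configurable toolbox" means in practice:

```python
import torch
from diffusers import UNet2DModel

# The architecture is chosen through explicit __init__ arguments,
# which ConfigMixin records under model.config.
model = UNet2DModel(
    sample_size=64,
    in_channels=3,
    out_channels=3,
    layers_per_block=2,
    block_out_channels=(64, 128, 256),
    down_block_types=("DownBlock2D", "AttnDownBlock2D", "AttnDownBlock2D"),
    up_block_types=("AttnUpBlock2D", "AttnUpBlock2D", "UpBlock2D"),
)
print(model.config.sample_size)  # 64

# Underneath it is a plain torch.nn.Module, so standard PyTorch usage applies.
noise_pred = model(torch.randn(1, 3, 64, 64), timestep=10).sample
```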

Schedulers

Schedulers are responsible for guiding the denoising process during inference as well as for defining the noise schedule used in training. They are designed as individual classes with loadable configuration files and strongly follow the single-file policy.

The following design principles are followed:

- All schedulers are found in src/diffusers/schedulers.
- Schedulers are not allowed to import from large utils files and shall be kept very self-contained.
- One scheduler Python file corresponds to one scheduler algorithm (as might be defined in a paper).
- If schedulers share similar functionalities, we can make use of the #Copied from mechanism.
- Schedulers all inherit from SchedulerMixin and ConfigMixin.
- Schedulers can easily be swapped out via the from_config method (see the sketch after this list).
- Every scheduler has to have a set_timesteps and a step function: set_timesteps(...) has to be called before every denoising process, i.e. before inference, and step(...) updates the sample of the previous denoising step x_t to the next, slightly less noisy sample x_t-1.
- Given the speed of the diffusion research field, new scheduler implementations usually require completely new code rather than adaptations of existing ones.
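For instance, swapping one scheduler algorithm for another only requires the old scheduler’s configuration (the checkpoint name is only an example):

```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipeline = DiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# Schedulers share a config interface, so one algorithm can replace
# another without touching the rest of the pipeline.
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
```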
