Pipelines

Pipelines provide a simple way to run state-of-the-art diffusion models in inference by bundling all of the necessary components (multiple independently-trained models, schedulers, and processors) into a single end-to-end class. Pipelines are flexible and they can be adapted to use different schedulers or even model components.

All pipelines are built from the base DiffusionPipeline class which provides basic functionality for loading, downloading, and saving all the components. Specific pipeline types (for example StableDiffusionPipeline) loaded with from_pretrained() are automatically detected and the pipeline components are loaded and passed to the __init__ function of the pipeline.

You shouldn’t use the DiffusionPipeline class for training. Individual components (for example, UNet2DModel and UNet2DConditionModel) of diffusion pipelines are usually trained individually, so we suggest directly working with them instead.

Pipelines do not offer any training functionality. You’ll notice PyTorch’s autograd is disabled by decorating the __call__() method with a torch.no_grad decorator because pipelines should not be used for training. If you’re interested in training, please take a look at the Training guides instead!
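For example, a minimal text-to-image sketch (the checkpoint and prompt below are only illustrative; any text-to-image checkpoint on the Hub works the same way):

```py
import torch
from diffusers import DiffusionPipeline

# from_pretrained() downloads the checkpoint and assembles all of its
# components (models, scheduler, tokenizer, ...) into one pipeline.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# __call__ runs under torch.no_grad(), so no gradients are tracked.
image = pipe("a photo of an astronaut riding a horse on mars").images[0]
image.save("astronaut.png")
```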

The table below lists all the pipelines currently available in 🤗 Diffusers and the tasks they support. Click on a pipeline to view its abstract and published paper.

| Pipeline | Tasks |
|---|---|
| aMUSEd | text2image |
| AnimateDiff | text2video |
| Attend-and-Excite | text2image |
| AudioLDM | text2audio |
| AudioLDM2 | text2audio |
| AuraFlow | text2image |
| BLIP Diffusion | text2image |
| CogVideoX | text2video |
| Consistency Models | unconditional image generation |
| ControlNet | text2image, image2image, inpainting |
| ControlNet with Flux.1 | text2image |
| ControlNet with Hunyuan-DiT | text2image |
| ControlNet with Stable Diffusion 3 | text2image |
| ControlNet with Stable Diffusion XL | text2image |
| ControlNet-XS | text2image |
| ControlNet-XS with Stable Diffusion XL | text2image |
| Dance Diffusion | unconditional audio generation |
| DDIM | unconditional image generation |
| DDPM | unconditional image generation |
| DeepFloyd IF | text2image, image2image, inpainting, super-resolution |
| DiffEdit | inpainting |
| DiT | text2image |
| Flux | text2image |
| Hunyuan-DiT | text2image |
| I2VGen-XL | text2video |
| InstructPix2Pix | image editing |
| Kandinsky 2.1 | text2image, image2image, inpainting, interpolation |
| Kandinsky 2.2 | text2image, image2image, inpainting |
| Kandinsky 3 | text2image, image2image |
| Kolors | text2image |
| Latent Consistency Models | text2image |
| Latent Diffusion | text2image, super-resolution |
| Latte | text2image |
| LEDITS++ | image editing |
| Lumina-T2X | text2image |
| Marigold | depth |
| MultiDiffusion | text2image |
| MusicLDM | text2audio |
| PAG | text2image |
| Paint by Example | inpainting |
| PIA | image2video |
| PixArt-α | text2image |
| PixArt-Σ | text2image |
| Self-Attention Guidance | text2image |
| Semantic Guidance | text2image |
| Shap-E | text-to-3D, image-to-3D |
| Stable Audio | text2audio |
| Stable Cascade | text2image |
| Stable Diffusion | text2image, image2image, depth2image, inpainting, image variation, latent upscaler, super-resolution |
| Stable Diffusion XL | text2image, image2image, inpainting |
| Stable Diffusion XL Turbo | text2image, image2image, inpainting |
| Stable unCLIP | text2image, image variation |
| T2I-Adapter | text2image |
| Text2Video | text2video, video2video |
| Text2Video-Zero | text2video |
| unCLIP | text2image, image variation |
| UniDiffuser | text2image, image2text, image variation, text variation, unconditional image generation, unconditional audio generation |
| Value-guided planning | value guided sampling |
| Wuerstchen | text2image |

DiffusionPipeline

Base class for all pipelines.

DiffusionPipeline stores all components (models, schedulers, and processors) for diffusion pipelines and provides methods for loading, downloading and saving models. It also includes methods to:

- move all PyTorch modules to the device of your choice
- enable/disable the progress bar for the denoising iteration

Class attributes:

- config_name (str) — The configuration filename that stores the class and module names of all the diffusion pipeline's components.
- _optional_components (List[str]) — List of all optional components that don't have to be passed to the pipeline to function (should be overridden by subclasses).

__call__

( *args **kwargs )

Call self as a function.

device

( ) → torch.device

The torch device on which the pipeline is located.

to

( *args **kwargs ) → DiffusionPipeline

Returns

DiffusionPipeline

The pipeline converted to the specified dtype and/or device.

Performs Pipeline dtype and/or device conversion. A torch.dtype and torch.device are inferred from the arguments of self.to(*args, **kwargs).

If the pipeline already has the correct torch.dtype and torch.device, then it is returned as is. Otherwise, the returned pipeline is a copy of self with the desired torch.dtype and torch.device.

Here are the ways to call to:

- to(dtype, silence_dtype_warnings=False) → DiffusionPipeline to return a pipeline with the specified dtype
- to(device, silence_dtype_warnings=False) → DiffusionPipeline to return a pipeline with the specified device
- to(device=None, dtype=None, silence_dtype_warnings=False) → DiffusionPipeline to return a pipeline with the specified device and dtype
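For instance, a minimal sketch (the checkpoint name is just the one used elsewhere on this page):

```py
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Equivalent ways to convert the pipeline: pass a dtype, a device, or both.
pipe = pipe.to(torch.float16)                        # dtype only
pipe = pipe.to("cuda")                               # device only
pipe = pipe.to(device="cuda", dtype=torch.float16)   # device and dtype at once
```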

components

The self.components property can be useful to run different pipelines with the same weights and configurations without reallocating additional memory.

Returns (dict): A dictionary containing all the modules needed to initialize the pipeline.

Examples:

```py
from diffusers import (
    StableDiffusionPipeline,
    StableDiffusionImg2ImgPipeline,
    StableDiffusionInpaintPipeline,
)

text2img = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
img2img = StableDiffusionImg2ImgPipeline(**text2img.components)
inpaint = StableDiffusionInpaintPipeline(**text2img.components)
```

disable_attention_slicing

( )

Disable sliced attention computation. If enable_attention_slicing was previously called, attention is computed in one step.

disable_xformers_memory_efficient_attention

( )

Disable memory efficient attention from xFormers.

download

( pretrained_model_name **kwargs ) → os.PathLike

Returns

os.PathLike

A path to the downloaded pipeline.

Download and cache a PyTorch diffusion pipeline from pretrained pipeline weights.

To use private or gated models, log in with huggingface-cli login.
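A minimal sketch of caching a pipeline ahead of time and loading it later from the local snapshot (the checkpoint name is illustrative; the returned path points into your Hugging Face cache):

```py
from diffusers import DiffusionPipeline

# Download and cache the pipeline files without instantiating any models.
local_path = DiffusionPipeline.download("runwayml/stable-diffusion-v1-5")

# Later (even offline), build the pipeline from the cached snapshot.
pipe = DiffusionPipeline.from_pretrained(local_path)
```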

enable_attention_slicing

( slice_size: Union = 'auto' )

Enable sliced attention computation. When this option is enabled, the attention module splits the input tensor in slices to compute attention in several steps. For more than one attention head, the computation is performed sequentially over each head. This is useful to save some memory in exchange for a small speed decrease.

⚠️ Don’t enable attention slicing if you’re already using scaled_dot_product_attention (SDPA) from PyTorch 2.0 or xFormers. These attention computations are already very memory efficient so you won’t need to enable this function. If you enable attention slicing with SDPA or xFormers, it can lead to serious slow downs!

Examples:

```py
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True,
)

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_attention_slicing()
image = pipe(prompt).images[0]
```

enable_model_cpu_offload

( gpu_id: Optional = None device: Union = 'cuda' )

Offloads all models to CPU using accelerate, reducing memory usage with a low impact on performance. Compared to enable_sequential_cpu_offload, this method moves one whole model at a time to the GPU when its forward method is called, and the model remains on the GPU until the next model runs. Memory savings are lower than with enable_sequential_cpu_offload, but performance is much better due to the iterative execution of the unet.

enable_sequential_cpu_offload

( gpu_id: Optional = None device: Union = 'cuda' )

Offloads all models to CPU using 🤗 Accelerate, significantly reducing memory usage. When called, the state dicts of all torch.nn.Module components (except those in self._exclude_from_cpu_offload) are saved to CPU and then moved to torch.device('meta') and loaded to GPU only when their specific submodule has its forward method called. Offloading happens on a submodule basis. Memory savings are higher than with enable_model_cpu_offload, but performance is lower.
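A minimal sketch contrasting the two offloading strategies (requires accelerate; call only one of them on a given pipeline, and don't move the pipeline to the GPU yourself beforehand):

```py
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Option 1: move one whole model at a time to the GPU as it is needed
# (moderate memory savings, small performance cost).
pipe.enable_model_cpu_offload()

# Option 2: offload at the submodule level instead
# (largest memory savings, slowest inference) -- use one or the other, not both.
# pipe.enable_sequential_cpu_offload()

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
```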

enable_xformers_memory_efficient_attention

( attention_op: Optional = None )

Enable memory efficient attention from xFormers. When this option is enabled, you should observe lower GPU memory usage and a potential speed up during inference. Speed up during training is not guaranteed.

⚠️ When memory efficient attention and sliced attention are both enabled, memory efficient attention takes precedence.

Examples:

```py
import torch
from diffusers import DiffusionPipeline
from xformers.ops import MemoryEfficientAttentionFlashAttentionOp

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp)

# Workaround: the VAE's attention shape is not supported by the Flash Attention op,
# so fall back to the default attention op for the VAE.
pipe.vae.enable_xformers_memory_efficient_attention(attention_op=None)
```

from_pipe

( pipeline **kwargs ) → DiffusionPipeline

Returns

DiffusionPipeline

A new pipeline with the same weights and configurations as pipeline.

Create a new pipeline from a given pipeline. This method is useful to create a new pipeline from the existing pipeline components without reallocating additional memory.

Examples:

```py
from diffusers import StableDiffusionPipeline, StableDiffusionSAGPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
new_pipe = StableDiffusionSAGPipeline.from_pipe(pipe)
```

from_pretrained

( pretrained_model_name_or_path: Union **kwargs )

Instantiate a PyTorch diffusion pipeline from pretrained pipeline weights.

The pipeline is set in evaluation mode (model.eval()) by default.

If you get the error message below, you need to finetune the weights for your downstream task:

Some weights of UNet2DConditionModel were not initialized from the model checkpoint at runwayml/stable-diffusion-v1-5 and are newly initialized because the shapes did not match:

To use private or gated models, log in with huggingface-cli login.

Examples:

```py
from diffusers import DiffusionPipeline

# Download a pipeline from the Hub and cache it.
pipeline = DiffusionPipeline.from_pretrained("CompVis/ldm-text2im-large-256")

# Load a different pipeline checkpoint.
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Use a different scheduler.
from diffusers import LMSDiscreteScheduler

scheduler = LMSDiscreteScheduler.from_config(pipeline.scheduler.config)
pipeline.scheduler = scheduler
```

maybe_free_model_hooks

( )

Function that offloads all components, removes all model hooks that were added when using enable_model_cpu_offload, and then applies them again. In case the model has not been offloaded, this function is a no-op. Make sure to add this function to the end of the __call__ function of your pipeline so that it functions correctly when applying enable_model_cpu_offload.

numpy_to_pil

( images )

Convert a NumPy image or a batch of images to a PIL image.

remove_all_hooks

( )

Removes all hooks that were added when using enable_sequential_cpu_offload or enable_model_cpu_offload.

reset_device_map

( )

Resets the device maps (if any) to None.

save_pretrained

( save_directory: Union safe_serialization: bool = True variant: Optional = None max_shard_size: Union = None push_to_hub: bool = False **kwargs )

Save all saveable variables of the pipeline to a directory. A pipeline variable can be saved and loaded if its class implements both a save and loading method. The pipeline is easily reloaded using the from_pretrained() class method.
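A minimal sketch of saving a pipeline locally and reloading it (the directory name is arbitrary):

```py
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Serialize every saveable component (models, scheduler, tokenizer, configs) to disk.
pipe.save_pretrained("./my-stable-diffusion", safe_serialization=True)

# The directory can then be loaded back with from_pretrained().
pipe = DiffusionPipeline.from_pretrained("./my-stable-diffusion")
```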

diffusers.StableDiffusionMixin.enable_freeu

( s1: float s2: float b1: float b2: float )

Enables the FreeU mechanism as in https://arxiv.org/abs/2309.11497.

The suffixes after the scaling factors represent the stages where they are being applied.

Please refer to the official repository for combinations of the values that are known to work well for different pipelines such as Stable Diffusion v1, v2, and Stable Diffusion XL.

diffusers.StableDiffusionMixin.disable_freeu

( )

Disables the FreeU mechanism if enabled.
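A minimal sketch of turning FreeU on and off. The scaling factors below are the values commonly suggested for Stable Diffusion v1.x checkpoints in the FreeU repository; treat them as an illustrative starting point rather than an official recommendation:

```py
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# b1/b2 scale the backbone features and s1/s2 scale the skip connections
# of the two decoder stages the suffixes refer to (assumed SD v1.x values).
pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.5, b2=1.6)
image = pipe("a photo of an astronaut riding a horse on mars").images[0]

# Restore the default behavior.
pipe.disable_freeu()
```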

FlaxDiffusionPipeline

class diffusers.FlaxDiffusionPipeline

( )

Base class for Flax-based pipelines.

FlaxDiffusionPipeline stores all components (models, schedulers, and processors) for diffusion pipelines and provides methods for loading, downloading and saving models. It also includes methods to:

- enable/disable the progress bar for the denoising iteration

Class attributes:

- config_name (str) — The configuration filename that stores the class and module names of all the diffusion pipeline's components.

from_pretrained

( pretrained_model_name_or_path: Union **kwargs )

Instantiate a Flax-based diffusion pipeline from pretrained pipeline weights.

The pipeline is set in evaluation mode (model.eval()) by default and dropout modules are deactivated.

If you get the error message below, you need to finetune the weights for your downstream task:

Some weights of FlaxUNet2DConditionModel were not initialized from the model checkpoint at runwayml/stable-diffusion-v1-5 and are newly initialized because the shapes did not match:

To use private or gated models, log in with huggingface-cli login.

Examples:

```py
import jax.numpy as jnp

from diffusers import FlaxDiffusionPipeline, FlaxStableDiffusionPipeline

# Load the bf16 variant of the weights as a Flax pipeline plus its parameters.
pipeline, params = FlaxDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    variant="bf16",
    dtype=jnp.bfloat16,
)

# Use a different scheduler.
from diffusers import FlaxDPMSolverMultistepScheduler

model_id = "runwayml/stable-diffusion-v1-5"
dpmpp, dpmpp_state = FlaxDPMSolverMultistepScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
)

dpm_pipe, dpm_params = FlaxStableDiffusionPipeline.from_pretrained(
    model_id, variant="bf16", dtype=jnp.bfloat16, scheduler=dpmpp
)
dpm_params["scheduler"] = dpmpp_state
```

numpy_to_pil

( images )

Convert a NumPy image or a batch of images to a PIL image.

save_pretrained

( save_directory: Union params: Union push_to_hub: bool = False **kwargs )

Save all saveable variables of the pipeline to a directory. A pipeline variable can be saved and loaded if its class implements both a save and loading method. The pipeline is easily reloaded using the from_pretrained() class method.
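A minimal sketch, assuming pipeline and params were loaded as in the from_pretrained() example above (the output directory name is arbitrary):

```py
# `pipeline` and `params` come from FlaxDiffusionPipeline.from_pretrained(...);
# the Flax parameters must be passed explicitly when saving.
pipeline.save_pretrained("./sd-flax-bf16", params=params)
```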

PushToHubMixin

class diffusers.utils.PushToHubMixin

( )

A Mixin to push a model, scheduler, or pipeline to the Hugging Face Hub.

push_to_hub

( repo_id: str commit_message: Optional = None private: Optional = None token: Optional = None create_pr: bool = False safe_serialization: bool = True variant: Optional = None )

Upload model, scheduler, or pipeline files to the 🤗 Hugging Face Hub.

Examples:

```py
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained("stabilityai/stable-diffusion-2", subfolder="unet")

# Push the `unet` to your personal namespace with the repo name "my-finetuned-unet".
unet.push_to_hub("my-finetuned-unet")

# Push the `unet` to an organization with the repo name "my-finetuned-unet".
unet.push_to_hub("your-org/my-finetuned-unet")
```
