Add circular padding for artifact-free StableDiffusionPanoramaPipeline by EvgenyKashin · Pull Request #4025 · huggingface/diffusers
Artifact-Free Panorama Generation with circular padding
This pull request addresses the issue described in #4024. It aims to produce artifact-free panoramas by making the content at the left and right edges of the generated image align seamlessly, so that the result is a proper “panorama” that can be viewed in 360-degree panorama viewers.
In the existing MultiDiffusion approach, the image is cropped into windows from left to right with a specified stride. To support the 360-degree scenario, this PR adds extra crops that extend past the rightmost point of the image and wrap around to its left side. This circular padding lets the model “see” the transition from the rightmost part of the image to the leftmost part, keeping the two edges consistent in the 360-degree sense.
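The wrap-around cropping can be illustrated with a minimal pure-Python sketch. It is not the code from this PR: the names `horizontal_views` and `take_columns` are hypothetical, and the window size of 64 and stride of 8 (in latent columns) are assumed values chosen for illustration.

```python
def horizontal_views(width, window=64, stride=8, circular=False):
    """Return (start, end) column ranges of the sliding windows.

    Hypothetical helper. With circular=True the windows keep sliding past
    the right edge, so `end` may exceed `width`.
    """
    if width <= window:
        return [(0, width)]
    if circular:
        count = width // stride  # one window per stride step, all the way around
    else:
        count = (width - window) // stride + 1  # stop at the right edge
    return [(i * stride, i * stride + window) for i in range(count)]


def take_columns(row, start, end):
    """Slice columns [start, end) from a row, wrapping past the right edge."""
    width = len(row)
    if end <= width:
        return row[start:end]
    # The crop crosses the seam: take the right tail, then the left head.
    return row[start:] + row[: end - width]


# Example: a 1024-pixel-wide panorama has 128 latent columns (1024 / 8).
linear = horizontal_views(128)
wrapped = horizontal_views(128, circular=True)
seam_crop = take_columns(list(range(128)), *wrapped[-1])
```

In this sketch the non-circular layout covers the outermost columns with a single window only, while the circular layout covers every column with the same number of windows, including the columns on both sides of the seam.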
Furthermore, the decoding of latents is modified in the case of Stable Diffusion. Without this change, the left and right edges of the decoded image would not match in RGB space, even when they align well in latent space. The decoding step therefore also uses circular padding to ensure consistent results.
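The effect on decoding can be sketched with a toy one-dimensional example. Everything here is an assumption made for illustration: `toy_decode` is a hypothetical stand-in for a VAE decoder (a 3-tap smoothing filter with zero padding at the borders followed by nearest-neighbour upsampling), not the real decoder, and the function names are not taken from the PR.

```python
def pad_circular(row, padding):
    """Wrap `padding` columns from each end around to the opposite side."""
    return row[-padding:] + row + row[:padding]


def toy_decode(row, scale=2):
    """Hypothetical decoder stand-in: each output depends on its neighbours.

    Zero padding at the borders means the first and last columns are decoded
    without any knowledge of what lies across the seam.
    """
    padded = [0.0] + list(row) + [0.0]
    smooth = [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(row))]
    # Nearest-neighbour upsampling: one latent column becomes `scale` pixels.
    return [value for value in smooth for _ in range(scale)]


def decode_with_circular_padding(row, padding=1, scale=2):
    """Pad circularly, decode, then crop the padded region in pixel space."""
    if padding == 0:
        return toy_decode(row, scale)
    decoded = toy_decode(pad_circular(row, padding), scale)
    pad_px = padding * scale  # padding measured in output pixels
    return decoded[pad_px:-pad_px]


latents = [3.0, 6.0, 9.0, 12.0]
plain = toy_decode(latents)
seamless = decode_with_circular_padding(latents)
```

With these toy values the plain decode ends in 7, whereas the circularly padded decode ends in 8, which is the value a true wrap-around filter would produce at the seam. The padding has to be at least as wide as the decoder's receptive field for the cropped result to be free of border effects; in this toy that is one column.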
A panorama image generated without circular padding:
A panorama image generated with circular padding:
To maintain backward compatibility and preserve the previous behaviour, the proposed circular padding functionality is implemented as an optional argument “circular_padding”, which is set to False by default.
Usage example:
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

model_ckpt = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_ckpt, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_ckpt, scheduler=scheduler, torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "an indoor equirectangular panorama of a living room"

image = pipe(prompt, width=1024, height=512, circular_padding=True).images[0]