Image Generation Using Diffusers | OpenVINO GenAI
Convert and Optimize Model
Download and convert a model (e.g. stabilityai/stable-diffusion-xl-base-1.0) to OpenVINO format from Hugging Face:
optimum-cli export openvino --model stabilityai/stable-diffusion-xl-base-1.0 --weight-format int4 --trust-remote-code stable_diffusion_xl_base_1_0_ov
See all supported Image Generation Models.
info: Refer to the Model Preparation guide for detailed instructions on how to download, convert, and optimize models for OpenVINO GenAI.
Run Model Using OpenVINO GenAI
OpenVINO GenAI supports the following diffusion model pipelines:
- Text2ImagePipeline for creating images from text prompts.
- Image2ImagePipeline for modifying existing images based on prompts.
- InpaintingPipeline for selectively replacing portions of images using masks.
Text2ImagePipeline
import openvino_genai as ov_genai
from PIL import Image

model_path = "stable_diffusion_xl_base_1_0_ov"  # directory created by the optimum-cli export above
prompt = "a photo of an astronaut riding a horse on Mars"

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt)  # returns an ov.Tensor of generated image(s)
image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
tip: To run on GPU, pass "GPU" instead of "CPU" when constructing the pipeline; no other code changes are required.
Image2ImagePipeline
import openvino_genai as ov_genai
import openvino as ov
from PIL import Image
import numpy as np

def read_image(path: str) -> ov.Tensor:
    pic = Image.open(path).convert("RGB")
    image_data = np.array(pic)[None]  # add a batch dimension
    return ov.Tensor(image_data)

input_image_data = read_image("input_image.jpg")

pipe = ov_genai.Image2ImagePipeline(model_path, "CPU")
# strength controls how much the input image is altered:
# values closer to 1.0 allow more deviation from the original
image_tensor = pipe.generate(prompt, image=input_image_data, strength=0.8)
image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
InpaintingPipeline
import openvino_genai as ov_genai
import openvino as ov
from PIL import Image
import numpy as np

def read_image(path: str) -> ov.Tensor:
    pic = Image.open(path).convert("RGB")
    image_data = np.array(pic)[None]  # add a batch dimension
    return ov.Tensor(image_data)

input_image_data = read_image("input_image.jpg")
mask_image = read_image("mask.jpg")

pipe = ov_genai.InpaintingPipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt, image=input_image_data, mask_image=mask_image)
image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
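If you do not have a mask file on disk, you can also construct one in memory. The following is a minimal sketch, assuming the diffusers-style convention that white mask pixels mark the region to be repainted; the rectangle coordinates are illustrative:

import numpy as np
import openvino as ov

# Build a 512x512 RGB mask: white square = area the pipeline repaints,
# black = area kept from the input image (assumed diffusers convention)
mask = np.zeros((512, 512, 3), dtype=np.uint8)
mask[128:384, 128:384, :] = 255
mask_tensor = ov.Tensor(mask[None])  # add the batch dimension, as in read_image()

image_tensor = pipe.generate(prompt, image=input_image_data, mask_image=mask_tensor)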
Additional Usage Options
tip: Check out the Python and C++ image generation samples.
Use Different Generation Parameters
Generation Configuration Workflow
1. Get the model's default config with get_generation_config().
2. Modify parameters.
3. Apply the updated config using one of the following methods (see the sketch below):
   - Use set_generation_config(config)
   - Pass the config directly to generate() (e.g. generate(prompt, config))
   - Specify options as inputs in the generate() method (e.g. generate(prompt, max_new_tokens=100))
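A minimal sketch of this workflow for the Text2ImagePipeline, assuming model_path and prompt are defined as in the earlier examples:

import openvino_genai as ov_genai

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")

# 1. Get the default generation config
config = pipe.get_generation_config()

# 2. Modify parameters
config.height = 512
config.width = 512
config.num_inference_steps = 30

# 3. Apply the updated config
pipe.set_generation_config(config)
image_tensor = pipe.generate(prompt)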
Image Generation Configuration
You can adjust several parameters to control the image generation process, including dimensions and the number of inference steps:
import openvino_genai as ov_genai
from PIL import Image

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(
    prompt,
    width=512,
    height=512,
    num_images_per_prompt=1,
    num_inference_steps=30,
    guidance_scale=7.5
)
image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
Understanding Image Generation Parameters
- width: The width of the resulting image(s).
- height: The height of the resulting image(s).
- num_images_per_prompt: Specifies how many image variations to generate in a single request for the same prompt.
- num_inference_steps: Defines the denoising iteration count. Higher values increase quality and generation time; lower values generate faster with less detail.
- guidance_scale: Balances prompt adherence vs. creativity. Higher values follow the prompt more strictly; lower values allow more creative freedom.
- rng_seed: Controls randomness for reproducible results. The same seed produces identical images across runs.
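For example, fixing rng_seed makes generation reproducible. A minimal sketch, passing the option directly to generate() as described in the workflow above:

import openvino_genai as ov_genai
from PIL import Image

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")

# Identical rng_seed and parameters should yield the identical image on each run
image_tensor = pipe.generate(prompt, rng_seed=42, num_inference_steps=30)
Image.fromarray(image_tensor.data[0]).save("seeded.bmp")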
For the full list of generation parameters, refer to the Image Generation Config API.
Working with LoRA Adapters
For image generation models like Stable Diffusion, LoRA adapters can modify the generation process to produce images with specific artistic styles, content types, or quality enhancements.
Refer to the LoRA Adapters guide for more details on working with LoRA adapters.
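A minimal sketch of attaching a LoRA adapter to a Text2ImagePipeline, using the OpenVINO GenAI Adapter/AdapterConfig API; the adapter file name and alpha value are illustrative:

import openvino_genai as ov_genai
from PIL import Image

# Hypothetical LoRA file in safetensors format; substitute your own adapter
adapter = ov_genai.Adapter("style_adapter.safetensors")
adapter_config = ov_genai.AdapterConfig()
adapter_config.add(adapter, 0.8)  # 0.8 is the blending coefficient (alpha)

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU", adapters=adapter_config)
image_tensor = pipe.generate(prompt)
Image.fromarray(image_tensor.data[0]).save("lora_image.bmp")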