OpenVINO GenAI
Run Generative AI with ease
OpenVINO™ GenAI provides optimized pipelines for running generative AI models with maximum performance and minimal dependencies
# Text generation
ov_pipe = ov_genai.LLMPipeline("TinyLlama")
print(ov_pipe.generate("The Sun is yellow because"))

# Image generation
ov_pipe = ov_genai.Text2ImagePipeline("Flux")
image = ov_pipe.generate("Create beautiful Sun")

# Visual language understanding
ov_pipe = ov_genai.VLMPipeline("LLaVA")
print(ov_pipe.generate("Describe images", images))

# Speech recognition
ov_pipe = ov_genai.WhisperPipeline("whisper-base")
print(ov_pipe.generate(read_wav("sample.wav")))

# Text-to-speech
ov_pipe = ov_genai.Text2SpeechPipeline("speecht5_tts")
result = ov_pipe.generate("Hello OpenVINO GenAI")
Features and Benefits
🚀
Optimized Performance
Built for speed with hardware-specific optimizations for Intel CPUs, GPUs, and NPUs. Advanced techniques such as speculative decoding and KV-cache optimization deliver maximum inference performance.
👨‍💻
Developer-Friendly APIs
Simple, intuitive APIs in both Python and C++ that hide complexity while providing full control. Get started with just a few lines of code, then customize with advanced features as needed.
📦
Production-Ready Pipelines
Pre-built pipelines for text generation, image creation, speech recognition, speech generation, and visual language processing. No need to build inference loops or handle tokenization; everything works out of the box.
🎨
Extensive Model Support
Compatible with popular models including Llama, Mistral, Phi, Qwen, Stable Diffusion, Flux, and Whisper, with easy model conversion from Hugging Face and ModelScope.
⚡
Lightweight & Efficient
Minimal dependencies and smaller disk footprint compared to heavyweight frameworks. Perfect for edge deployment, containers, and resource-constrained environments.
🖥️
Cross-Platform Compatibility
Run the same code on Linux, Windows, and macOS. Deploy across different hardware configurations without code changes, from laptops to data center servers.
Use Cases
Text Generation Using LLMs
Build chatbots and applications for text summarization, content generation, and question answering with state-of-the-art Large Language Models (LLMs).
Capabilities:
Control output with different generation parameters (sampling, temperature, etc.)
Optimize for conversational scenarios by using chat mode
Apply LoRA adapters and dynamically switch between them without recompilation
Accelerate generation using draft models via Speculative Decoding
Python:
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
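Chat mode and speculative decoding, both listed in the capabilities above, can be sketched as follows (model_path and draft_model_path are placeholders, and num_assistant_tokens is an illustrative setting):

import openvino_genai as ov_genai

# Chat mode keeps conversation state across turns, so history is not re-processed
pipe = ov_genai.LLMPipeline(model_path, "CPU")
pipe.start_chat()
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
print(pipe.generate("How does it help on edge devices?", max_new_tokens=100))
pipe.finish_chat()

# Speculative decoding: a small draft model proposes tokens that the main model verifies
draft = ov_genai.draft_model(draft_model_path, "CPU")  # draft_model_path is a placeholder
pipe = ov_genai.LLMPipeline(model_path, "CPU", draft_model=draft)
print(pipe.generate("What is OpenVINO?", max_new_tokens=100, num_assistant_tokens=5))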
Image Generation Using Diffusion Models
Create and modify images with diffusion models for art generation, product design, and creative applications using Stable Diffusion and similar architectures.
Capabilities:
Support for text-to-image, image-to-image, and inpainting pipelines
Control image generation by adjusting parameters (dimensions, iterations, etc.)
Apply LoRA adapters and dynamically switch between them for artistic styles and modifications
Generate multiple images in a single request
Python:
import openvino_genai as ov_genai
from PIL import Image
pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt)
image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
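Generation parameters from the capabilities list can be passed as keyword arguments; a minimal sketch with illustrative values:

import openvino_genai as ov_genai
from PIL import Image

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
# Illustrative settings: output size, diffusion steps, and images per prompt
image_tensor = pipe.generate(
    prompt,
    width=512,
    height=512,
    num_inference_steps=20,
    num_images_per_prompt=1,
)
Image.fromarray(image_tensor.data[0]).save("image.bmp")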
Image Processing Using VLMs
Analyze and describe images with Vision Language Models (VLMs) to build AI assistants and tools for legal document review, medical analysis, document processing, and visual content understanding applications.
Capabilities:
Process one or several images in a single prompt and get detailed text descriptions
Optimize for conversational scenarios by using chat mode
Control output with different generation parameters (sampling, temperature, etc.)
Python:
import openvino_genai as ov_genai
import openvino as ov
from PIL import Image
import numpy as np
from pathlib import Path
def read_image(path: str) -> ov.Tensor:
    pic = Image.open(path).convert("RGB")
    image_data = np.array(pic)[None]
    return ov.Tensor(image_data)

def read_images(path: str) -> list[ov.Tensor]:
    entry = Path(path)
    if entry.is_dir():
        return [read_image(str(file)) for file in sorted(entry.iterdir())]
    return [read_image(path)]
images = read_images("./images")
pipe = ov_genai.VLMPipeline(model_path, "CPU")
result = pipe.generate(prompt, images=images, max_new_tokens=100)
print(result.texts[0])
# To input video frames, use 'videos='; frames tensor layout is [Frame, H, W, C]
# result = pipe.generate(prompt, videos=[frames], max_new_tokens=100)
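Multi-turn conversation over the same images uses chat mode; a minimal sketch reusing pipe and images from the example above (prompts are illustrative):

pipe.start_chat()
# Pass images on the first turn; follow-up questions reuse the visual context
print(pipe.generate("Describe the images", images=images, max_new_tokens=100))
print(pipe.generate("What do they have in common?", max_new_tokens=100))
pipe.finish_chat()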
Speech Recognition Using Whisper
Convert speech to text using Whisper models for video transcription, meeting notes, multilingual audio content processing, and accessibility applications.
Capabilities:
Translate foreign language speech directly to English text
Transcribe audio in multiple languages with automatic language detection
Generate precise timestamps for synchronized subtitles and captions
Process long-form audio content (>30 seconds) efficiently
Python:
import openvino_genai as ov_genai
import librosa
def read_wav(filepath):
    # Whisper pipelines expect 16 kHz mono audio
    raw_speech, samplerate = librosa.load(filepath, sr=16000)
    return raw_speech.tolist()
raw_speech = read_wav('sample.wav')
pipe = ov_genai.WhisperPipeline(model_path, "CPU")
result = pipe.generate(raw_speech, max_new_tokens=100)
print(result)
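Timestamps for subtitles and captions, listed in the capabilities above, can be requested with return_timestamps; a short sketch reusing pipe and raw_speech from the example:

# Each chunk carries start/end times in seconds plus the transcribed text
result = pipe.generate(raw_speech, return_timestamps=True)
for chunk in result.chunks:
    print(f"{chunk.start_ts:.2f}s -> {chunk.end_ts:.2f}s: {chunk.text}")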
Speech Synthesis Using SpeechT5
Convert text to speech using SpeechT5 TTS models.
Capabilities:
Generate natural and expressive speech from text prompts
Use speaker embeddings for personalized voice synthesis
Python:
import openvino_genai
import soundfile as sf
pipeline = openvino_genai.Text2SpeechPipeline(model_path, "CPU")
# Generate audio using the default speaker
result = pipeline.generate("Hello OpenVINO GenAI")
# speech tensor contains the waveform of the spoken phrase
speech = result.speeches[0]
sf.write("output_audio.wav", speech.data[0], samplerate=16000)
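A speaker embedding personalizes the voice. A minimal sketch, assuming a precomputed 512-dimensional SpeechT5 x-vector saved as speaker_embedding.npy (a placeholder file):

import numpy as np
import openvino as ov

# Placeholder: a precomputed speaker x-vector with shape (1, 512)
speaker_embedding = ov.Tensor(np.load("speaker_embedding.npy"))
result = pipeline.generate("Hello OpenVINO GenAI", speaker_embedding)
sf.write("custom_voice.wav", result.speeches[0].data[0], samplerate=16000)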
Text Embeddings
Generate vector representations for text using embedding models, useful for semantic search and retrieval-augmented generation (RAG).
Capabilities:
Select pooling strategies (CLS, MEAN)
Improve retrieval performance with L2 normalization
Provide embed and query instructions
Batch embedding for multiple documents
Python:
import openvino_genai as ov_genai

documents = ["Paris is the capital of France.", "Berlin is the capital of Germany."]  # illustrative

pipeline = ov_genai.TextEmbeddingPipeline(
    models_path,
    "CPU",
    pooling_type=ov_genai.TextEmbeddingPipeline.PoolingType.MEAN,
    normalize=True,
)

documents_embeddings = pipeline.embed_documents(documents)
query_embeddings = pipeline.embed_query("What is the capital of France?")
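For instruction-tuned embedding models, the embed and query instructions mentioned in the capabilities can be supplied at construction; a sketch where the instruction strings are illustrative:

pipeline = ov_genai.TextEmbeddingPipeline(
    models_path,
    "CPU",
    query_instruction="Represent this query for retrieval: ",    # illustrative
    embed_instruction="Represent this document for retrieval: ",  # illustrative
)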
Text Reranking
Boost the relevance and accuracy of your Retrieval-Augmented Generation (RAG) workflows by reranking retrieved documents with the TextRerankPipeline.
Capabilities:
Reorder search results by semantic relevance to the query
Python:
import openvino_genai

query = "What is the capital of France?"  # illustrative inputs
documents = ["Paris is the capital of France.", "Berlin is the capital of Germany."]

pipeline = openvino_genai.TextRerankPipeline(model_path, "CPU", top_n=3)
rerank_result = pipeline.rerank(query, documents)

print("Reranked documents:")
for index, score in rerank_result:
    print(f"Document {index} (score: {score:.4f}): {documents[index]}")
Install OpenVINO™ GenAI
Unlock the power of OpenVINO™ GenAI for your projects.
Get started with seamless installation now!
Quick Installation from PyPI
python -m pip install openvino-genai