OpenVINO Release Notes

2025.4 - 1 December 2025

System Requirements | Release policy | Installation Guides

What’s new

OpenVINO™ Runtime

CPU Device Plugin

GPU Device Plugin

NPU Device Plugin

OpenVINO Python API

OpenVINO C API

OpenVINO Node.js API

PyTorch Framework Support

ONNX Framework Support

OpenVINO™ Model Server

    ovms -pull -task text_generation OpenVINO/Qwen3-8B-int4
    ovms -list_models
    ovms -add_to_models -model_name OpenVINO/Qwen3-8B-int4
    ovms -remove_from_models -model_name OpenVINO/Qwen3-8B-int4

Performance improvements:

Audio endpoints added:

Embeddings endpoints improvements:

Breaking changes:

Bug fixes:

Neural Network Compression Framework

OpenVINO Tokenizers

OpenVINO GenAI

Other Changes and Known Issues

Jupyter Notebooks

New models and use cases:

Known Issues

Component: OpenVINO Tokenizers

ID: 174531

Description:

Accuracy regression of Mistral-7b-instruct-v0.2 and Mistral-7b-instruct-v0.3 on all devices when executed with OpenVINO GenAI. As a workaround, use the IR converted with OpenVINO 2025.3. The accuracy will be improved with the next release.

Component: OpenVINO GenAI

ID: 176777

Description:

Using the callback parameter with the Python API call generate() in Text2ImagePipeline, Image2ImagePipeline, or InpaintingPipeline may cause the process to hang. As a workaround, do not use the callback parameter. The issue will be resolved in the next release. C++ implementations are not affected.
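One way to apply this workaround without touching every call site is to filter the argument out before it reaches generate(). The following is a minimal sketch: strip_callback and generate_without_callback are hypothetical helper names, not part of OpenVINO GenAI, and the pipeline object is assumed to be one of the three image generation pipelines named above.

```python
def strip_callback(generate_kwargs):
    """Return a copy of the keyword arguments without the `callback` entry."""
    return {key: value for key, value in generate_kwargs.items() if key != "callback"}


def generate_without_callback(pipeline, *args, **generate_kwargs):
    """Call pipeline.generate() but never forward a `callback` argument.

    `pipeline` is assumed to be a Text2ImagePipeline, Image2ImagePipeline,
    or InpaintingPipeline instance; any object with a generate() method works.
    Positional arguments (prompt, and for some pipelines an image or mask)
    are passed through unchanged.
    """
    return pipeline.generate(*args, **strip_callback(generate_kwargs))
```

With a pipeline created elsewhere, a call such as generate_without_callback(pipe, "a lighthouse at dusk", num_inference_steps=20, callback=my_callback) would run generation with the callback silently dropped. Progress reporting is lost, which is the cost of the workaround until the fix is released.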

Previous 2025 releases

OpenVINO™ Runtime

Common

CPU Device Plugin

OpenVINO™ Runtime

Common

AUTO Inference Mode

CPU Device Plugin

GPU Device Plugin

NPU Device Plugin

OpenVINO Python API

OpenVINO C API

OpenVINO Node.js API

PyTorch Framework Support

OpenVINO Model Server

Neural Network Compression Framework

OpenVINO Tokenizers

  • Regex-based normalization and split operations have been optimized, resulting in significant speed improvements, especially for long input strings.
  • Two-string inputs are now supported, enabling various tasks, including RAG reranking.
  • Sentencepiece char-level tokenizers are now supported to enhance the SpeechT5 TTS model.
  • The tokenization node factory has been exposed to enable OpenVINO GenAI GGUF support.

OpenVINO.GenAI

  • New preview pipelines with C++ and Python samples have been added:
      • Text2SpeechPipeline,
      • TextEmbeddingPipeline covering RAG scenario.
  • Visual language modeling (VLMPipeline):
      • VLM prompt can now refer to specific images. For example, "<ov_genai_image_0>What’s in the image?" will prepend the corresponding image to the prompt while ignoring other images. See VLMPipeline’s docstrings for more details.
      • VLM uses continuous batching by default, improving performance.
      • VLMPipeline can now be constructed from in-memory ov::Model.
      • Qwen2.5-VL support has been added.
  • JavaScript:
      • JavaScript samples have been added: beam_search_causal_lm and multinomial_causal_lm.
      • An interruption option for LLMPipeline streaming has been introduced.
  • The following has been added:
      • cache encryption samples demonstrating how to encode OpenVINO’s cached compiled model,
      • LLM ReAct Agent sample capable of calling external functions during text generation,
      • SD3 LoRA Adapter support for Text2ImagePipeline,
      • ov::genai::Tokenizer::get_vocab() method for C++ and Python,
      • ov::Property as arguments to the ov_genai_llm_pipeline_create function for the C API,
      • support for the SnapKV method for more accurate KV cache eviction, enabled by default when KV cache eviction is used,
      • preview support for GGUF models (GGML Unified Format). See the OpenVINO blog for details.
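The image reference tag mentioned for VLMPipeline prompts can be built programmatically. This is a small sketch that only constructs the prompt string; prompt_with_image is a hypothetical helper name, and the tag format is taken from the <ov_genai_image_0> example in the list above.

```python
def image_tag(index):
    """Build the tag that refers to the image at position `index`."""
    if index < 0:
        raise ValueError("image index must be non-negative")
    return f"<ov_genai_image_{index}>"


def prompt_with_image(question, index):
    """Prefix a question with the tag of the single image it refers to."""
    return image_tag(index) + question


prompt = prompt_with_image("What's in the image?", 0)
print(prompt)
```

The resulting string would then be passed as the prompt to VLMPipeline's generate() together with the list of images. How several tags combine in one prompt is not covered here; the VLMPipeline docstrings are the place to check.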

Other Changes and Known Issues

Jupyter Notebooks

OpenVINO™ Runtime

Common

CPU Device Plugin

GPU Device Plugin

NPU Device Plugin

OpenVINO Python API

OpenVINO Node.js API

PyTorch Framework Support

JAX Framework Support

Keras 3 Multi-backend Framework Support

TensorFlow Lite Framework Support

OpenVINO Model Server

Neural Network Compression Framework

OpenVINO Tokenizers

OpenVINO.GenAI

Other Changes and Known Issues

Windows PDB Archives:

Archives containing PDB files for Windows packages are now available.

You can find them right next to the regular archives, in the same folder.

Jupyter Notebooks

Known Issues

Component: NPU

ID: n/a

Description:

For LLM runs with prompts longer than the limit set through the MAX_PROMPT_LEN parameter, an exception is thrown with a note explaining the reason. In the current version of OpenVINO, the message is incorrect. The explanation will be fixed in future releases.
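Until the message is corrected, an application can check the prompt length itself and report a clear error before calling the pipeline. The sketch below assumes the token count is obtained elsewhere (for example, from the pipeline's tokenizer) and that max_prompt_len matches the MAX_PROMPT_LEN value passed when the pipeline was created; check_prompt_length is a hypothetical helper, not an OpenVINO API.

```python
def check_prompt_length(token_count, max_prompt_len):
    """Raise a descriptive error if the prompt exceeds the configured limit.

    token_count    -- number of tokens in the prompt (computed by the caller)
    max_prompt_len -- the MAX_PROMPT_LEN value used to configure the pipeline
    """
    if token_count > max_prompt_len:
        raise ValueError(
            f"Prompt has {token_count} tokens, which exceeds "
            f"MAX_PROMPT_LEN={max_prompt_len}. Shorten the prompt or "
            f"create the pipeline with a larger MAX_PROMPT_LEN."
        )
    return token_count
```

Running this check before generation gives the user an accurate reason for the failure, independent of the text carried by the runtime exception.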

Component: NPU

ID: 164469

Description:

With the NPU Linux driver release v1.13.0, a new behavior for NPU recovery in the kernel has been introduced. Corresponding changes in Ubuntu kernels are pending, targeting new kernel releases.

Workaround:

If inference on NPU crashes, manually reloading the driver is the recommended option: run sudo rmmod intel_vpu, then sudo modprobe intel_vpu. Rolling back to an earlier version of the Linux NPU driver will also work.
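The driver reload can be scripted for recovery tooling. This is a hedged sketch: reload_npu_driver is a hypothetical helper, it defaults to a dry run that only reports the commands, and actually executing them requires sudo rights on the machine.

```python
import subprocess

# The two commands from the workaround, in the order they must run.
RELOAD_COMMANDS = [
    ["sudo", "rmmod", "intel_vpu"],
    ["sudo", "modprobe", "intel_vpu"],
]


def reload_npu_driver(dry_run=True, runner=subprocess.run):
    """Unload and reload the intel_vpu kernel module.

    With dry_run=True (the default) nothing is executed; the function only
    returns the command lines it would run. With dry_run=False each command
    is executed in order and a failure raises CalledProcessError.
    """
    issued = []
    for command in RELOAD_COMMANDS:
        if not dry_run:
            runner(command, check=True)
        issued.append(" ".join(command))
    return issued


print(reload_npu_driver())
```

Keeping the dry run as the default avoids unloading a kernel module by accident; pass dry_run=False only from a context where the reload is intended.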

Component: GPU

ID: 164331

Description:

Qwen2-VL model crashes on some Intel platforms when large inputs are used.

Workaround:

Build OpenVINO GenAI from source.

Component: OpenVINO GenAI

ID: 165686

Description:

In the VLM ContinuousBatching pipeline, when multiple requests are processed using add_request() and step() API in multiple threads, the resulting text is not correct.

Workaround:

Build OpenVINO GenAI from source.

OpenVINO™ Runtime

Common

AUTO Inference Mode

CPU Device Plugin

GPU Device Plugin

NPU Device Plugin

OpenVINO Python API

OpenVINO Node.js API

TensorFlow Framework Support

PyTorch Framework Support

OpenVINO Python API

Keras 3 Multi-backend Framework Support

ONNX Framework Support

OpenVINO Model Server

Neural Network Compression Framework

OpenVINO Tokenizers

OpenVINO.GenAI

The following has been added:

Other Changes and Known Issues

Jupyter Notebooks

Known Issues

Component: OVC

ID: 160167

Description:

TensorFlow Object Detection models converted to the IR through the OVC tool give poor performance on CPU, GPU, and NPU devices. As a workaround, use the MO tool from 2024.6 or earlier to generate IRs.

Component: Tokenizers

ID: 159392

Description:

ONNX models fail to convert when openvino-tokenizers is installed. As a workaround, uninstall openvino-tokenizers before converting an ONNX model to the IR.

Component: CPU Plugin

ID: 161336

Description:

Compilation of an OpenVINO model performing weight quantization fails with a segmentation fault on Intel® Core™ Ultra 200V processors. As a workaround for existing OpenVINO versions (including the 2025.0 release candidates), set the following environment variable before running the application: export DNNL_MAX_CPU_ISA=AVX2_VNNI.
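When the shell environment cannot be changed, the same variable can be set from Python at the very start of the application. This is a minimal sketch; apply_isa_workaround is a hypothetical helper, and it assumes the variable must be in place before OpenVINO is imported and initialized in the process.

```python
import os


def apply_isa_workaround(env=os.environ):
    """Set DNNL_MAX_CPU_ISA=AVX2_VNNI, mirroring the `export` workaround.

    Call this before importing openvino so the setting is visible when the
    CPU plugin initializes. `env` defaults to the process environment; a
    plain dict can be passed instead for testing.
    """
    env["DNNL_MAX_CPU_ISA"] = "AVX2_VNNI"
    return env["DNNL_MAX_CPU_ISA"]


apply_isa_workaround()
# import openvino  # import only after the variable has been set
```

Because the setting lives in the process environment, it also propagates to child processes started by the application.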

Component: GPU Plugin

ID: 160802

Description:

mllama model crashes on Intel® Core™ Ultra 200V processors. Please use OpenVINO 2024.6 or earlier to run the model.

Component: GPU Plugin

ID: 160948

Description:

Several models have accuracy degradation on Intel® Core™ Ultra 200V processors, Intel® Arc™ A-Series Graphics, and Intel® Arc™ B-Series Graphics. Please use OpenVINO 2024.6 to run the models. Model list: fastseg-small, hbonet-0.5, modnet_photographic_portrait_matting, modnet_webcam_portrait_matting, mobilenet-v3-small-1.0-224, nasnet-a-mobile-224, yolo_v4, yolo_v5m, yolo_v5s, yolo_v8n, yolox-tiny, yolact-resnet50-fpn-pytorch.

Deprecation And Support

Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. For more details, refer to: OpenVINO Legacy Features and Components.

Discontinued in 2025

Deprecated and to be removed in the future

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at www.intel.com or from the OEM or retailer.

No computer system can be absolutely secure.

Intel, Atom, Core, Xeon, OpenVINO, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. Other names and brands may be claimed as the property of others.

Copyright © 2025, Intel Corporation. All rights reserved.

For more complete information about compiler optimizations, see our Optimization Notice.

Performance varies by use, configuration and other factors.