oneAPI and GPU support in Extension for Scikit-learn* — Extension for Scikit-learn* 2025.4 documentation (original) (raw)

Extension for Scikit-learn* can execute computations on different devices (CPUs, GPUs) through the SYCL framework in oneAPI.

The device used for computations can be easily controlled through the target offloading functionality (e.g. through sklearnex.config_context(target_offload="gpu") - see rest of this page for more details), but for finer-grained controlled (e.g. operating on arrays that are already in a given device’s memory), it can also interact with objects from package dpctl, which offers a Python interface over SYCL concepts such as devices, queues, and USM (unified shared memory) arrays.

While not strictly required, package dpctl is recommended for a better experience on GPUs.

Important

Be aware that GPU usage requires non-Python dependencies on your system, such as the Intel(R) GPGPU Drivers.

Prerequisites

For execution on GPUs, DPC++ runtime and GPGPU drivers are required.

DPC++ compiler runtime can be installed either from PyPI or Conda:

Install from PyPI:
Install using Conda from Intel’s repository:
conda install -c https://software.repos.intel.com/python/conda/ dpcpp_cpp_rt
Install using Conda from the conda-forge channel:
conda install -c conda-forge dpcpp_cpp_rt

For GPGPU driver installation instructions, see the general DPC++ system requirements sections corresponding to your operating system.

Device offloading

Extension for Scikit-learn* offers two options for running an algorithm on a specified device:

Use global configurations of Extension for Scikit-learn**:
1. The target_offload argument (in config_context and in set_config / get_config) can be used to set the device primarily used to perform computations. Accepted data types arestr and dpctl.SyclQueue. Strings must match to device names recognized by the SYCL* device filter selector - for example, "gpu". If passing "auto", the device will be deduced from the location of the input data. Examples:
  from sklearnex import config_context
  from sklearnex.linear_model import LinearRegression
  with config_context(target_offload="gpu"):
  model = LinearRegression().fit(X, y)
  from sklearnex import set_config
  from sklearnex.linear_model import LinearRegression
  set_config(target_offload="gpu")
  model = LinearRegression().fit(X, y)
  If passing a string different than "auto", it must be a device
2. The allow_fallback_to_host argument in those same configuration functions is a Boolean flag. If set to True, the computation is allowed to fallback to the host device when a particular estimator does not support the selected device. The default value is False.

These options can be set using sklearnex.set_config() function orsklearnex.config_context. To obtain the current values of these options, call sklearnex.get_config().

Note

Functions set_config, get_config and config_contextare always patched after the sklearnex.patch_sklearn() call.

Pass input data as dpctl.tensor.usm_ndarray to the algorithm.
The computation will run on the device where the input data is located, and the result will be returned as usm_ndarray to the same device.
Note
All the input data for an algorithm must reside on the same device.
Warning
The usm_ndarray can only be consumed by the base methods like fit, predict, and transform. Note that only the algorithms in Extension for Scikit-learn* supportusm_ndarray. The algorithms from the stock version of scikit-learndo not support this feature.

Example

A full example of how to patch your code with Intel CPU/GPU optimizations:

from sklearnex import patch_sklearn, config_context patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.], [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32) with config_context(target_offload="gpu:0"): clustering = DBSCAN(eps=3, min_samples=2).fit(X)

Note

Current offloading behavior restricts fitting and predictions (a.k.a. inference) of any models to be in the same context or absence of context. For example, a model whose .fit() method was called in a GPU context withtarget_offload="gpu:0" will throw an error if a .predict() call is then made outside the same GPU context.