Extension for Scikit-learn: a seamless way to speed up your scikit-learn applications
Overview
Extension for Scikit-learn is a free software AI accelerator designed to deliver 10-100X acceleration to your existing scikit-learn code. The software acceleration is achieved with vector instructions, hardware-specific memory optimizations, and multithreading.
With Extension for Scikit-learn, you can:
- Speed up training and inference by up to 100x with equivalent mathematical accuracy
- Benefit from performance improvements across different hardware configurations, including CPUs, GPUs, and multi-GPU setups
- Integrate the extension into your existing Scikit-learn applications without code modifications
- Continue to use the open-source scikit-learn API
- Enable and disable the extension with a couple of lines of code or at the command line
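For instance, toggling the accelerations at runtime with the extension's `patch_sklearn`/`unpatch_sklearn` entry points could look like this minimal sketch (assuming `scikit-learn-intelex` is installed):

```python
import numpy as np
from sklearnex import patch_sklearn, unpatch_sklearn

patch_sklearn()  # supported scikit-learn estimators now dispatch to the extension

# Import *after* patching so the accelerated class is picked up
from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
labels = DBSCAN(eps=3, min_samples=2).fit(X).labels_

unpatch_sklearn()  # restore the stock scikit-learn implementations
```

Because patching preserves the scikit-learn API, the rest of the script is unchanged whether the extension is enabled or not.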
Acceleration
Optimizations
The easiest way to benefit from the extension's accelerations is to patch scikit-learn with it:
- Enable CPU optimizations

```python
import numpy as np
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```
- Enable GPU optimizations

Note: executing on GPU has additional system software requirements - see details.

```python
import numpy as np
from sklearnex import patch_sklearn, config_context
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```
👀 Check out available notebooks for more examples.
Usage without patching
Alternatively, all functionality is also available under a separate module that can be imported directly, without any patching.
- To run on CPU:

```python
import numpy as np
from sklearnex.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```
- To run on GPU:

```python
import numpy as np
from sklearnex import config_context
from sklearnex.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
```
Installation
To install Extension for Scikit-learn, run:
```shell
pip install scikit-learn-intelex
```
The package is also offered through other channels such as conda-forge. See all installation instructions in the Installation Guide.
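For the conda-forge channel, the equivalent install (a sketch, assuming a conda environment is already set up) would be:

```shell
conda install -c conda-forge scikit-learn-intelex
```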
Integration
The easiest way of accelerating scikit-learn workflows with the extension is through patching, which replaces the stock scikit-learn algorithms with their optimized versions provided by the extension, using the same namespaces and modules as scikit-learn.
The patching only affects supported algorithms and their parameters. You can still use unsupported algorithms and parameters in your code; the package simply falls back to the stock version of scikit-learn.
TIP: Enable verbose mode to see which implementation of the algorithm is currently used.
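Verbose mode is driven through Python's standard logging machinery; a minimal sketch of enabling it (assuming the extension's documented `sklearnex` logger name) is:

```python
import logging

# Route log records to stderr so dispatch messages are visible
logging.basicConfig(level=logging.INFO)

# The extension reports which implementation (accelerated or stock
# scikit-learn) handles each call through the 'sklearnex' logger.
logging.getLogger("sklearnex").setLevel(logging.INFO)
```

With this in place, fitting a patched estimator logs whether the call ran on the accelerated backend or fell back to stock scikit-learn.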
To patch scikit-learn, you can:
- Use the following command-line flag:

```shell
python -m sklearnex my_application.py
```

- Add the following lines to the script:

```python
from sklearnex import patch_sklearn
patch_sklearn()
```
👀 Read about other ways to patch scikit-learn.
As an alternative, accelerated classes from the extension can also be imported directly without patching, which keeps them separate from the stock scikit-learn ones - for example:
```python
from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

...
```
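A self-contained version of that side-by-side pattern (assuming `scikit-learn-intelex` is installed; the equal-labels check illustrates the equivalent-accuracy claim above) might look like:

```python
import numpy as np
from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

# Fit the accelerated and the stock estimator on the same data
ex_labels = exDBSCAN(eps=3, min_samples=2).fit(X).labels_
stock_labels = stockDBSCAN(eps=3, min_samples=2).fit(X).labels_

# Both implementations should assign identical cluster labels
assert (ex_labels == stock_labels).all()
```

Keeping both classes importable at once is convenient for benchmarking the two implementations against each other.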
Documentation
Extension and oneDAL
Acceleration in patched scikit-learn classes is achieved by replacing calls to scikit-learn with calls to oneDAL (oneAPI Data Analytics Library) behind the scenes.
Samples & Examples
How to Contribute
We welcome community contributions, check our Contributing Guidelines to learn more.
* The Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.