SKLearn — coremltools API Reference 8.3 documentation (original) (raw)
coremltools.converters.sklearn._converter.convert(sk_obj, input_features=None, output_feature_names=None)[source]
Convert scikit-learn pipeline, classifier, or regressor to Core ML format.
Parameters:
sk_obj: model | [model] of scikit-learn format.
Scikit learn model(s) to convert to a Core ML format.
The input model may be a single scikit learn model, a scikit learn pipeline model, or a list of scikit learn models.
Currently supported scikit learn models are:
- Linear and Logistic Regression
- LinearSVC and LinearSVR
- Ridge Regression
- SVC and SVR
- NuSVC and NuSVR
- Gradient Boosting Classifier and Regressor
- Decision Tree Classifier and Regressor
- Random Forest Classifier and Regressor
- Normalizer
- Imputer
- Standard Scaler
- DictVectorizer
- One Hot Encoder
- KNeighborsClassifier
The input model, or the last model in a pipeline or list of models, determines whether this is exposed as a Transformer, Regressor, or Classifier.
Note that there may not be a one-to-one correspondence between scikit learn models and the Core ML models chosen to represent them. For example, many scikit learn models are embedded in a pipeline to handle processing of input features.
input_features: str | dict | list
Optional name(s) that can be given to the inputs of the scikit-learn model. Defaults to "input"
.
Input features can be specified in a number of forms.
- Single string: In this case, the input is assumed to be a single array, with the number of dimensions set using
num_dimensions
. - List of strings: In this case, the overall input dimensions to the scikit-learn model are assumed to be the length of the list. If neighboring names are identical, they are assumed to be an input array of that length. For example:
["a", "b", "c"]
resolves to:[("a", Double), ("b", Double), ("c", Double)]
.
In addition:["a", "a", "b"]
resolves to:[("a", Array(2)), ("b", Double)]
. - Dictionary: Where the keys are the names and the indices or ranges of feature indices.
In this case, the Dictionary is presented as a mapping from keys to indices or ranges of contiguous indices. For example:
{"a" : 0, "b" : [2,3], "c" : 1}
resolves to:
[("a", Double), ("c", Double), ("b", Array(2))]
.Note that the ordering is determined by the indices.
- List of tuples of the form
(name, datatype)
, in whichname
is the name of the exposed feature, anddatatype
is an instance ofString
,Double
,Int64
,Array
, orDictionary
.
output_feature_names: string or list of strings
Optional name(s) that can be given to the inputs of the scikit-learn model.
The output_feature_names
is interpreted according to the model type:
- If the scikit-learn model is a transformer, it is the name of the array feature output by the final sequence of the transformer (defaults to
"output"
). - If it is a classifier, it should be a 2-tuple of names giving the top class prediction and the array of scores for each class (defaults to
"classLabel"
and"classScores"
). - If it is a regressor, it should give the name of the prediction value (defaults to
"prediction"
).
Returns:
model:MLModel
Returns an MLModel instance representing a Core ML model.
Examples
from sklearn.linear_model import LinearRegression import pandas as pd
Load data
data = pd.read_csv('houses.csv')
Train a model
model = LinearRegression() model.fit(data[["bedroom", "bath", "size"]], data["price"])
Convert and save the scikit-learn model
import coremltools coreml_model = coremltools.converters.sklearn.convert(model, ["bedroom", "bath", "size"], "price") coreml_model.save('HousePricer.mlmodel')