Model APIs — coremltools API Reference 8.0b1 documentation

MLModel

class coremltools.models.model.MLModel(model, is_temp_package=False, mil_program=None, skip_model_load=False, compute_units=ComputeUnit.ALL, weights_dir=None, function_name=None, optimization_hints: dict | None = None)[source]

This class defines the minimal interface to a Core ML object in Python.

At a high level, the protobuf specification consists of:

- Model description: Encodes the names and type information of the inputs and outputs of the model.
- Model parameters: The set of parameters required to represent a specific instance of the model.
- Metadata: Information about the origin, license, and author of the model.

With this class, you can inspect a Core ML model, modify metadata, and make predictions for the purposes of testing (on select platforms).

Examples

Load the model

model = MLModel("HousePricer.mlmodel")

Set the model metadata

model.author = "Author"
model.license = "BSD"
model.short_description = "Predicts the price of a house in the Seattle area."

Get the interface to the model

model.input_description
model.output_description

Set feature descriptions manually

model.input_description["bedroom"] = "Number of bedrooms" model.input_description["bathrooms"] = "Number of bathrooms" model.input_description["size"] = "Size (in square feet)"

Set the output descriptions

model.output_description["price"] = "Price of the house"

Make predictions

predictions = model.predict({"bedroom": 1.0, "bath": 1.0, "size": 1240})

Get the spec of the model

spec = model.get_spec()

Save the model

model.save("HousePricer.mlpackage")

Load the model from the spec object

spec = model.get_spec()

Modify the spec (for example, rename inputs or outputs)

model = MLModel(spec)

If the model type is mlprogram, that is, spec.WhichOneof('Type') == "mlProgram", then:

model = MLModel(spec, weights_dir=model.weights_dir)

Load a non-default function from a multifunction .mlpackage

model = MLModel("MultifunctionModel.mlpackage", function_name="deep_features")

__init__(model, is_temp_package=False, mil_program=None, skip_model_load=False, compute_units=ComputeUnit.ALL, weights_dir=None, function_name=None, optimization_hints: dict | None = None)[source]

Construct an MLModel from a model path (.mlmodel or .mlpackage) or a protobuf spec.

Parameters:

model: str or Model_pb2

For an ML program (mlprogram), the model can be a path string (.mlpackage) or Model_pb2. If it is a path string, it must point to a directory containing bundle artifacts (such as weights.bin). If it is of type Model_pb2 (spec), then you must also provide weights_dir if the model has weights, because both the proto spec and the weights are required to initialize and load the model. The proto spec for an mlprogram, unlike a neural network (neuralnetwork), does not contain the weights; they are stored separately. If the model does not have weights, you can provide an empty weights_dir.

For non-mlprogram model types, the model can be a path string (.mlmodel) or of type Model_pb2, such as a spec object.

is_temp_package: bool

Set to True if the input model package dir is temporary and can be deleted upon interpreter termination.

mil_program: coremltools.converters.mil.Program

Set to the MIL program object, if available. It is available whenever an MLModel object is constructed using the unified converter API coremltools.convert().

skip_model_load: bool

Set to True to prevent Core ML Tools from calling into the Core ML framework to compile and load the model. In that case, the returned model object cannot be used to make predictions. This flag may be used to load a newer model type on an older Mac, in order to inspect or load/save the spec.

Example: loading an ML program model type on macOS 11, since an ML program can be compiled and loaded only on macOS 12+.

Defaults to False. See the code example under Examples below.

compute_units: coremltools.ComputeUnit

The set of processing units the model can use to make predictions.

An enum with four possible values:

- coremltools.ComputeUnit.ALL: Use all compute units available, including the neural engine.
- coremltools.ComputeUnit.CPU_ONLY: Limit the model to use only the CPU.
- coremltools.ComputeUnit.CPU_AND_GPU: Use both the CPU and GPU, but not the neural engine.
- coremltools.ComputeUnit.CPU_AND_NE: Use both the CPU and neural engine, but not the GPU. Available only on macOS 13.0 and newer.

weights_dir: str

Path to the weights directory, required when loading an MLModel of type mlprogram from a spec object (that is, when the model argument is of type Model_pb2).

function_name: str

The name of the function from model to load. If not provided, function_name will be set to the defaultFunctionName in the proto.

optimization_hints: dict or None

Keys are the names of the optimization hints: either 'reshapeFrequency' or 'specializationStrategy'. Values are enumeration values of type coremltools.ReshapeFrequency or coremltools.SpecializationStrategy. See the code example under Examples below.

Notes

Internally, this class maintains the protobuf spec of the model and a proxy to the underlying compiled Core ML model.

Examples

loaded_model = MLModel("my_model.mlmodel")
loaded_model = MLModel("my_model.mlpackage")
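The following sketch exercises the skip_model_load and optimization_hints arguments described above. The enum members shown (ReshapeFrequency.Frequent, SpecializationStrategy.FastPrediction) are assumptions about the installed coremltools version; verify them before use.

import coremltools as ct

# Inspect a model without compiling or loading it; predict() is unavailable.
model = ct.models.MLModel("my_model.mlpackage", skip_model_load=True)
spec = model.get_spec()

# Pass optimization hints at load time.
model = ct.models.MLModel(
    "my_model.mlpackage",
    optimization_hints={
        "reshapeFrequency": ct.ReshapeFrequency.Frequent,
        "specializationStrategy": ct.SpecializationStrategy.FastPrediction,
    },
)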

get_compiled_model_path()[source]

Returns the path for the underlying compiled ML Model.

Important: This path is available only for the lifetime of this Python object. If you want the compiled model to persist, you need to make a copy.
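Examples

A minimal sketch of persisting the compiled model beyond this object's lifetime; the destination path is illustrative:

import shutil

model = MLModel("my_model.mlpackage")
compiled_path = model.get_compiled_model_path()
# Copy the .mlmodelc directory so it outlives this MLModel object.
shutil.copytree(compiled_path, "persistent_model.mlmodelc")
loaded = CompiledMLModel("persistent_model.mlmodelc")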

get_spec()[source]

Get a deep copy of the protobuf specification of the model.

Returns:

model: Model_pb2

Protobuf specification of the model.

Examples

spec = model.get_spec()

make_state() → MLState[source]

Returns a new state object, which can be passed to the predict method.

Returns:

MLState

The new state object, to be passed to predict().

Examples

state = model.make_state()
predictions = model.predict(x, state)

predict(data, state: MLState | None = None)[source]

Return predictions for the model.

Parameters:

data: dict[str, value] or list[dict[str, value]]

Dictionary of data to use for predictions, where the keys are the names of the input features. For batch predictions, use a list of such dictionaries.

The following dictionary value types are acceptable: list, array, numpy.ndarray, tensorflow.Tensor, and torch.Tensor.

state: MLState

Optional state object as returned by make_state().

Returns:

dict[str, value]

Predictions as a dictionary where each key is the output feature name.

list[dict[str, value]]

For batch prediction, returns a list of the above dictionaries.

Examples

data = {"bedroom": 1.0, "bath": 1.0, "size": 1240} predictions = model.predict(data)

data = [ {"bedroom": 1.0, "bath": 1.0, "size": 1240}, {"bedroom": 4.0, "bath": 2.5, "size": 2400}, ] batch_predictions = model.predict(data)

save(save_path: str)[source]

Save the model to .mlmodel or .mlpackage format. For an ML program (mlprogram), save_path must be a package directory (.mlpackage) containing the mlmodel and weights.

Parameters:

save_path: Target file path / bundle directory for the model.

Examples

model.save("my_model_file.mlmodel") loaded_model = MLModel("my_model_file.mlmodel")

Compiled MLModel

class coremltools.models.CompiledMLModel(path: str, compute_units: ComputeUnit = ComputeUnit.ALL, function_name: str | None = None, optimization_hints: dict | None = None)[source]

__init__(path: str, compute_units: ComputeUnit = ComputeUnit.ALL, function_name: str | None = None, optimization_hints: dict | None = None)[source]

Loads a compiled Core ML model.

Parameters:

path: str

The path to a compiled model directory, ending in .mlmodelc.

compute_units: coremltools.ComputeUnit

An enum with the following possible values:

- coremltools.ComputeUnit.ALL: Use all compute units available, including the neural engine.
- coremltools.ComputeUnit.CPU_ONLY: Limit the model to use only the CPU.
- coremltools.ComputeUnit.CPU_AND_GPU: Use both the CPU and GPU, but not the neural engine.
- coremltools.ComputeUnit.CPU_AND_NE: Use both the CPU and neural engine, but not the GPU. Available only on macOS 13.0 and newer.

optimization_hints: dict or None

Keys are the names of the optimization hints: either 'reshapeFrequency' or 'specializationStrategy'. Values are enumeration values of type coremltools.ReshapeFrequency or coremltools.SpecializationStrategy.

Examples

my_compiled_model = ct.models.CompiledMLModel("my_model_path.mlmodelc")
y = my_compiled_model.predict({"x": 3})

make_state() → MLState[source]

Returns a new state object, which can be passed to the predict method.

Examples

state = model.make_state()
predictions = model.predict(x, state)

predict(data, state: MLState | None = None)[source]

Return predictions for the model.

Parameters:

data: dict[str, value] or list[dict[str, value]]

Dictionary of data to use for predictions, where the keys are the names of the input features. For batch predictions, use a list of such dictionaries.

state: MLState

Optional state object as returned by make_state().

Returns:

dict[str, value]

Predictions as a dictionary where each key is the output feature name.

list[dict[str, value]]

For batch prediction, returns a list of the above dictionaries.

Examples

data = {"bedroom": 1.0, "bath": 1.0, "size": 1240} predictions = model.predict(data)

data = [ {"bedroom": 1.0, "bath": 1.0, "size": 1240}, {"bedroom": 4.0, "bath": 2.5, "size": 2400}, ] batch_predictions = model.predict(data)

compression_utils

extract_submodel

coremltools.converters.mil.debugging_utils.extract_submodel(model, outputs, inputs=None, function_name='main')[source]

This utility function lets you extract a submodel from a Core ML model.

For a neural network (neuralnetwork) model, the function extracts only in-memory Core ML models. Call this function on a model obtained directly from convert; loading a neural network model from disk and then calling this API is not supported.

For an ML program model, both cases (in-memory and from disk) are supported.

Parameters:

model: MLModel

The Core ML model from which the submodel is extracted.

outputs: list[str]

A list of names of Vars, which are the outputs of the extracted submodel.

inputs: list[str] (Optional)

A list of names of Vars, which are the inputs of the extracted submodel. If not provided, the inputs from the original model are used.

function_name: str (Optional)

Name of the function where the subgraph is extracted. Default is main.

Examples

Neural network:

from coremltools.converters.mil.debugging_utils import extract_submodel

mlmodel = ct.convert(model, convert_to="neuralnetwork")
outputs = ["output_0", "output_1"]
submodel = extract_submodel(mlmodel, outputs)

ML program:

from coremltools.converters.mil.debugging_utils import extract_submodel

mlmodel = ct.convert(model, convert_to="mlprogram")
outputs = ["output_0", "output_1"]

Directly extract model in memory

submodel = extract_submodel(mlmodel, outputs)

Extract model loaded from disk

mlmodel.save("model.mlpackage") mlmodel = coremltools.model.models.MLModel("model.mlpackage") submodel = extract_submodel(mlmodel, outputs)

feature_vectorizer

coremltools.models.feature_vectorizer.create_feature_vectorizer(input_features, output_feature_name, known_size_map={})[source]

Create a feature vectorizer from input features. This returns a 2-tuple (spec, num_dimension) for a feature vectorizer that puts everything into a single array with a length equal to the total size of all the input features.

Parameters:

input_features: [list of 2-tuples]

Name(s) of the input features, given as a list of ('name', datatype) tuples. The datatype entry is one of the data types defined in the datatypes module. Allowed datatypes are datatypes.Int64, datatypes.Double, datatypes.Dictionary, and datatypes.Array.

If the feature is a dictionary type, then the dictionary must have integer keys, and the number of dimensions to expand it into must be provided by known_size_map.

Feature indices in the final array are counted sequentially from 0 through the total number of features.

output_feature_name: str

The name of the output feature. The output is a single Array whose length equals the total size of the input features.

known_size_map: dict

A dictionary mapping the feature name to the expanded size in the final array. This is most useful for specifying the size of sparse vectors given as dictionaries of index to value.
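Examples

A short sketch with illustrative feature names, packing a scalar and a 3-element array into a single 4-element array:

from coremltools.models import datatypes
from coremltools.models.feature_vectorizer import create_feature_vectorizer

input_features = [("age", datatypes.Double()), ("embedding", datatypes.Array(3))]
spec, num_dims = create_feature_vectorizer(input_features, "combined")
assert num_dims == 4  # one slot for "age" plus three from "embedding"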

nearest_neighbors

class coremltools.models.nearest_neighbors.builder.KNearestNeighborsClassifierBuilder(input_name, output_name, number_of_dimensions, default_class_label, **kwargs)[source]

Construct a CoreML KNearestNeighborsClassifier specification.

Please see the Core ML Nearest Neighbors protobuf message for more information on KNearestNeighborsClassifier parameters.

Examples

from coremltools.models.nearest_neighbors import KNearestNeighborsClassifierBuilder
from coremltools.models.utils import save_spec

Create a KNearestNeighborsClassifier model that takes 4-dimensional input data and outputs a string label.

builder = KNearestNeighborsClassifierBuilder(
    input_name='input',
    output_name='output',
    number_of_dimensions=4,
    default_class_label='default_label',
)

Save the spec produced by the builder

save_spec(builder.spec, 'knnclassifier.mlmodel')

__init__(input_name, output_name, number_of_dimensions, default_class_label, **kwargs)[source]

Create a KNearestNeighborsClassifierBuilder object.

Parameters:

input_name

Name of the model input.

output_name

Name of the output.

number_of_dimensions

Number of dimensions of the input data.

default_class_label

The default class label to use for predictions. Must be either an int64 or a string.

number_of_neighbors

Number of neighbors to use for predictions. Default = 5, with allowed values between 1 and 1000.

weighting_scheme

Weight function used in prediction. One of 'uniform' (default) or 'inverse_distance'.

index_type

Algorithm to compute nearest neighbors. One of 'linear' (default) or 'kd_tree'.

leaf_size

Leaf size for the kd-tree. Ignored if index type is 'linear'. Default = 30.

add_samples(data_points, labels)[source]

Add some samples to the KNearestNeighborsClassifier model.

Parameters:

data_points

List of input data points.

labels

List of corresponding labels.

Returns:

None
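Examples

Continuing the 4-dimensional builder from the class example above; the sample values and labels are illustrative:

data_points = [[1.0, 0.0, 2.5, 3.0], [0.5, 1.5, 0.0, 2.0]]
labels = ["cat", "dog"]
builder.add_samples(data_points, labels)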

property author

Get the author for the KNearestNeighborsClassifier model.

Returns:

The author.

property description

Get the description for the KNearestNeighborsClassifier model.

Returns:

The description.

property index_type

Get the index type for the KNearestNeighborsClassifier model.

Returns:

The index type.

property is_updatable

Check if the KNearestNeighborsClassifier is updatable.

Returns:

Is updatable.

property leaf_size

Get the leaf size for the KNearestNeighborsClassifier.

Returns:

The leaf size.

property license

Get the license for the KNearestNeighborsClassifier model.

Returns:

The license.

property number_of_dimensions

Get the number of dimensions of the input data for the KNearestNeighborsClassifier model.

Returns:

Number of dimensions.

property number_of_neighbors

Get the number of neighbors value for the KNearestNeighborsClassifier model.

Returns:

The default value for the number of neighbors.

number_of_neighbors_allowed_range()[source]

Get the range of allowed values for the numberOfNeighbors parameter.

Returns:

Tuple of (min_value, max_value) or None if the range hasn’t been set.

number_of_neighbors_allowed_set()[source]

Get the set of allowed values for the numberOfNeighbors parameter.

Returns:

Set of allowed values, or None if the set of allowed values hasn't been populated.

set_index_type(index_type, leaf_size=30)[source]

Set the index type for the KNearestNeighborsClassifier model.

Parameters:

index_type

One of [ 'linear', 'kd_tree' ].

leaf_size

For kd_tree indexes, the leaf size to use (default = 30).

Returns:

None
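Examples

For instance, with illustrative values:

builder.set_index_type("linear")
builder.set_index_type("kd_tree", leaf_size=50)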

set_number_of_neighbors_with_bounds(number_of_neighbors, allowed_range=None, allowed_set=None)[source]

Set the numberOfNeighbors parameter for the KNearestNeighborsClassifier model.

Parameters:

number_of_neighbors

The number of neighbors to use for predictions.

allowed_range

Tuple of (min_value, max_value) defining the range of allowed values. Exactly one of allowed_range and allowed_set must be specified.

allowed_set

Set of allowed values for the number of neighbors.

Returns:

None
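Examples

A sketch of both bounding styles, with illustrative values (pass only one of allowed_range or allowed_set per call):

builder.set_number_of_neighbors_with_bounds(5, allowed_range=(1, 100))
builder.set_number_of_neighbors_with_bounds(5, allowed_set={3, 5, 7})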

property weighting_scheme

Get the weighting scheme for the KNearestNeighborsClassifier model.

Returns:

The weighting scheme.

neural_network

pipeline

Pipeline utils for this package.

class coremltools.models.pipeline.Pipeline(input_features, output_features, training_features=None)[source]

A pipeline model that exposes a sequence of models as a single model. It requires a set of inputs, a sequence of other models, and a set of outputs.

This class is the base class for PipelineClassifier and PipelineRegressor, which contain a sequence ending in a classifier or regressor and themselves behave like a classifier or regressor. This class may be used directly for a sequence of feature transformer objects.
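Examples

A minimal sketch of composing two transformer specs; imputer_spec and scaler_spec stand in for previously built feature-transformer specs:

from coremltools.models import datatypes
from coremltools.models.pipeline import Pipeline

pipeline = Pipeline(
    input_features=[("x", datatypes.Array(3))],
    output_features=[("y", datatypes.Array(3))],
)
pipeline.add_model(imputer_spec)
pipeline.add_model(scaler_spec)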

__init__(input_features, output_features, training_features=None)[source]

Create a pipeline of models to be executed sequentially.

Parameters:

input_features: [list of 2-tuples]

Name(s) of the input features, given as a list of ('name', datatype) tuples. The datatype entry can be any of the data types defined in the models.datatypes module.

output_features: [list of features]

Name(s) of the output features, given as a list of ('name', datatype) tuples. The datatype entry can be any of the data types defined in the models.datatypes module. All features must be either defined in the inputs or be produced by one of the contained models.

add_model(spec)[source]

Add a protobuf spec or models.MLModel instance to the pipeline.

All input features of this model must either match the input_features of the pipeline, or match the outputs of a previous model.

Parameters:

spec: [MLModel, Model_pb2]

A protobuf spec or MLModel instance containing a model.

set_training_input(training_input)[source]

Set the training inputs of the network spec.

Parameters:

training_input: [tuple]

List of training input names and types for the network.

class coremltools.models.pipeline.PipelineClassifier(input_features, class_labels, output_features=None, training_features=None)[source]

A pipeline model that exposes a sequence of models as a single model. It requires a set of inputs, a sequence of other models, and a set of outputs. In this case the pipeline itself behaves as a classification model by designating a discrete categorical output feature as its 'predicted feature'.

__init__(input_features, class_labels, output_features=None, training_features=None)[source]

Create a set of pipeline models given a set of model specs. The last model in this list must be a classifier model.

Parameters:

input_features: [list of 2-tuples]

Name(s) of the input features, given as a list of ('name', datatype) tuples. The datatype entry can be any of the data types defined in the models.datatypes module.

class_labels: [list]

A list of string or integer class labels to use in making predictions. This list must match the class labels in the model that outputs the categorical predictedFeatureName.

output_features: [list]

A string or a list of two strings specifying the names of the two output features, the first being a class label corresponding to the class with the highest predicted score, and the second being a dictionary mapping each class to its score. If output_features is a string, it specifies the predicted class label, and the class scores output is set to the default name, "classProbability".

add_model(spec)[source]

Add a protobuf spec or models.MLModel instance to the pipeline.

All input features of this model must either match the input_features of the pipeline, or match the outputs of a previous model.

Parameters:

spec: [MLModel, Model_pb2]

A protobuf spec or MLModel instance containing a model.

set_training_input(training_input)[source]

Set the training inputs of the network spec.

Parameters:

training_input: [tuple]

List of training input names and types for the network.

class coremltools.models.pipeline.PipelineRegressor(input_features, output_features, training_features=None)[source]

A pipeline model that exposes a sequence of models as a single model. It requires a set of inputs, a sequence of other models, and a set of outputs. In this case the pipeline itself behaves as a regression model by designating a real-valued output feature as its 'predicted feature'.

__init__(input_features, output_features, training_features=None)[source]

Create a set of pipeline models given a set of model specs. The final output model must be a regression model.

Parameters:

input_features: [list of 2-tuples]

Name(s) of the input features, given as a list of ('name', datatype) tuples. The datatype entry can be any of the data types defined in the models.datatypes module.

output_features: [list of features]

Name(s) of the output features, given as a list of ('name', datatype) tuples. The datatype entry can be any of the data types defined in the models.datatypes module. All features must be either defined in the inputs or be produced by one of the contained models.

add_model(spec)[source]

Add a protobuf spec or models.MLModel instance to the pipeline.

All input features of this model must either match the input_features of the pipeline, or match the outputs of a previous model.

Parameters:

spec: [MLModel, Model_pb2]

A protobuf spec or MLModel instance containing a model.

set_training_input(training_input)[source]

Set the training inputs of the network spec.

Parameters:

training_input: [tuple]

List of training input names and types for the network.

tree_ensemble

Tree ensemble builder class to construct Core ML models.

class coremltools.models.tree_ensemble.TreeEnsembleBase[source]

Base class for the tree ensemble builder class. This should be instantiated either through the TreeEnsembleRegressor or TreeEnsembleClassifier classes.

__init__()[source]

High level Python API to build a tree ensemble model for Core ML.

add_branch_node(tree_id, node_id, feature_index, feature_value, branch_mode, true_child_id, false_child_id, relative_hit_rate=None, missing_value_tracks_true_child=False)[source]

Add a branch node to the tree ensemble.

Parameters:

tree_id: int

ID of the tree to add the node to.

node_id: int

ID of the node within the tree.

feature_index: int

Index of the feature in the input being split on.

feature_value: double or int

The value used in the feature comparison determining the traversal direction from this node.

branch_mode: str

Branch mode of the node, specifying the condition under which the node referenced by true_child_id is called next.

Must be one of the following:

- "BranchOnValueLessThanEqual"
- "BranchOnValueLessThan"
- "BranchOnValueGreaterThanEqual"
- "BranchOnValueGreaterThan"
- "BranchOnValueEqual"
- "BranchOnValueNotEqual"

true_child_id: int

ID of the child under the true condition of the split. An error will be raised at model validation if this does not match the node_id of a node instantiated by add_branch_node or add_leaf_node within this tree_id.

false_child_id: int

ID of the child under the false condition of the split. An error will be raised at model validation if this does not match the node_id of a node instantiated by add_branch_node or add_leaf_node within this tree_id.

relative_hit_rate: float [optional]

When the model is compiled by Core ML, this gives hints about which node is more likely to be hit on evaluation, allowing for additional optimizations. The values can be on any scale, with values between child nodes compared relative to each other.

missing_value_tracks_true_child: bool [optional]

If the training data contains NaN values or missing values, then this flag determines which direction a NaN value traverses.

add_leaf_node(tree_id, node_id, values, relative_hit_rate=None)[source]

Add a leaf node to the tree ensemble.

Parameters:

tree_id: int

ID of the tree to add the node to.

node_id: int

ID of the node within the tree.

values: [float | int | list | dict]

Value(s) at the leaf node to add to the prediction when this node is activated. If the prediction dimension of the tree is 1, then the value is specified as a float or integer value.

For multidimensional predictions, the values can be a list of numbers with length matching the dimension of the predictions or a dictionary mapping index to value added to that dimension.

Note that the dimension of any tree must match the dimension given when set_default_prediction_value() is called.

set_default_prediction_value(values)[source]

Set the default prediction value(s).

The values given here form the base prediction value that the values at activated leaves are added to. If values is a scalar, then the output of the tree must also be 1 dimensional; otherwise, values must be a list with length matching the dimension of values in the tree.

Parameters:

values: [int | double | list[double]]

Default values for predictions.

set_post_evaluation_transform(value)[source]

Set the post processing transform applied after the prediction value from the tree ensemble.

Parameters:

value: str

A value denoting the transform applied. Possible values are:

- "NoTransform" (default)
- "Classification_SoftMax"
- "Regression_Logistic"
- "Classification_SoftMaxWithZeroClassReference"

class coremltools.models.tree_ensemble.TreeEnsembleClassifier(features, class_labels, output_features)[source]

Tree Ensemble builder class to construct a Tree Ensemble classification model.

The TreeEnsembleClassifier class constructs a Tree Ensemble model incrementally using methods to add branch and leaf nodes specifying the behavior of the model.

Examples

In the following example, the code saves the model to disk, which is a recommended practice but not required.

input_features = [("a", datatypes.Array(3)), ("b", datatypes.Double())]

tm = TreeEnsembleClassifier(
    features=input_features, class_labels=[0, 1], output_features="predicted_class"
)

Split on a[2] <= 3

tm.add_branch_node(0, 0, 2, 3, "BranchOnValueLessThanEqual", 1, 2)

Add leaf to the true branch of node 0 that subtracts 1.

tm.add_leaf_node(0, 1, -1)

Add split on b == 0 to the false branch of node 0.

tm.add_branch_node(0, 2, 3, 0, "BranchOnValueEqual", 3, 4)

Add leaf to the true branch of node 2 that adds 1 to the result.

tm.add_leaf_node(0, 3, 1)

Add leaf to the false branch of node 2 that subtracts 1 from the result.

tm.add_leaf_node(0, 4, -1)

Put in a softmax transform to translate these into probabilities.

tm.set_post_evaluation_transform("Classification_SoftMax")

tm.set_default_prediction_value([0, 0])

save the model to a .mlmodel file

model_path = './tree.mlmodel'
coremltools.models.utils.save_spec(tm.spec, model_path)

load the .mlmodel

mlmodel = coremltools.models.MLModel(model_path)

make predictions

test_input = {
    "a": np.array([0, 1, 2]).astype(np.float32),
    "b": 3.0,
}
predictions = mlmodel.predict(test_input)

__init__(features, class_labels, output_features)[source]

Create a tree ensemble classifier model.

Parameters:

features: [list of features]

Name(s) of the input features, given as a list of ('name', datatype) tuples. The features are one of datatypes.Int64, datatypes.Double, or datatypes.Array. Feature indices in the nodes are counted sequentially from 0 through the total number of features.

class_labels: [list]

A list of string or integer class labels to use in making predictions. The length of this must match the dimension of the tree model.

output_features: [list]

A string or a list of two strings specifying the names of the two output features, the first being a class label corresponding to the class with the highest predicted score, and the second being a dictionary mapping each class to its score. If output_features is a string, it specifies the predicted class label, and the class scores output is set to the default name, "classProbability".

class coremltools.models.tree_ensemble.TreeEnsembleRegressor(features, target)[source]

Tree Ensemble builder class to construct a Tree Ensemble regression model.

The TreeEnsembleRegressor class constructs a Tree Ensemble model incrementally using methods to add branch and leaf nodes specifying the behavior of the model.

Examples

In the following example, the code saves the model to disk, which is a recommended practice but not required.

Required inputs

import coremltools
from coremltools.models import datatypes
from coremltools.models.tree_ensemble import TreeEnsembleRegressor
import numpy as np

Define input features

input_features = [("a", datatypes.Array(3)), ("b", (datatypes.Double()))]

Define output_features

output_features = [("predicted_values", datatypes.Double())]

tm = TreeEnsembleRegressor(features=input_features, target=output_features)

Split on a[2] <= 3

tm.add_branch_node(0, 0, 2, 3, "BranchOnValueLessThanEqual", 1, 2)

Add leaf to the true branch of node 0 that subtracts 1.

tm.add_leaf_node(0, 1, -1)

Add split on b == 0 to the false branch of node 0, which is index 3

tm.add_branch_node(0, 2, 3, 0, "BranchOnValueEqual", 3, 4)

Add leaf to the true branch of node 2 that adds 1 to the result.

tm.add_leaf_node(0, 3, 1)

Add leaf to the false branch of node 2 that subtracts 1 from the result.

tm.add_leaf_node(0, 4, -1)

tm.set_default_prediction_value([0, 0])

save the model to a .mlmodel file

model_path = './tree.mlmodel'
coremltools.models.utils.save_spec(tm.spec, model_path)

load the .mlmodel

mlmodel = coremltools.models.MLModel(model_path)

make predictions

test_input = {
    "a": np.array([0, 1, 2]).astype(np.float32),
    "b": 3.0,
}
predictions = mlmodel.predict(test_input)

__init__(features, target)[source]

Create a Tree Ensemble regression model that takes one or more input features and maps them to an output feature.

Parameters:

features: [list of features]

Name(s) of the input features, given as a list of ('name', datatype) tuples. The features are one of datatypes.Int64, datatypes.Double, or datatypes.Array. Feature indices in the nodes are counted sequentially from 0 through the total number of features.

target: (default = None)

Name of the target feature predicted.

utils

Utilities for the entire package.

class coremltools.models.utils.MultiFunctionDescriptor(model_path: str | None = None)[source]

This data class defines how to construct a multifunction model from different model sources. Use the add_function method to specify the path to the source mlpackage, along with the source and target function names.

After setting default_function_name on the MultiFunctionDescriptor instance, you can export a multifunction model using the save_multifunction method.

See also

save_multifunction

Examples

from coremltools.utils import MultiFunctionDescriptor, save_multifunction

Initialize a MultiFunctionDescriptor instance with functions in an existing mlpackage.

desc will contain all functions in "my_model.mlpackage"

desc = MultiFunctionDescriptor("my_model.mlpackage")

Construct a MultiFunctionDescriptor instance from scratch.

The below code inserts the "main" function from "my_model.mlpackage" as "main_1",

and inserts the "main" function from "my_model_2.mlpackage" as "main_2".

desc = MultiFunctionDescriptor()
desc.add_function(
    model_path="my_model.mlpackage",
    src_function_name="main",
    target_function_name="main_1",
)
desc.add_function(
    model_path="my_model_2.mlpackage",
    src_function_name="main",
    target_function_name="main_2",
)

Each MultiFunctionDescriptor instance must have a default function name

so it can be saved as a multifunction mlpackage on disk.

desc.default_function_name = "main_1"
save_multifunction(desc, "my_multifunction_model.mlpackage")

__init__(model_path: str | None = None)[source]

If model_path is passed to the constructor, it must be a str pointing to an existing mlpackage on disk. The MultiFunctionDescriptor instance will be initialized with the functions in model_path.

add_function(model_path: str, src_function_name: str, target_function_name: str) → None[source]

Insert the src_function_name function from model_path as the target_function_name function in the multifunction descriptor.

add_model(model_path: str) → None[source]

Insert all functions from the model in model_path into the multifunction descriptor. The function names will remain the same as in the original model.
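For example, with an illustrative path:

desc = MultiFunctionDescriptor()
desc.add_model("my_model.mlpackage")  # every function keeps its original name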

remove_function(function_name: str) → None[source]

Remove a function function_name from the multifunction descriptor.

coremltools.models.utils.bisect_model(model: str | MLModel, output_dir: str, merge_chunks_to_pipeline: bool | None = False, check_output_correctness: bool | None = True)[source]

Utility function to split an mlpackage model into two mlpackages of approximately the same file size.

Parameters:

model: str or MLModel

Path to the mlpackage file, or a Core ML model, to be split into two mlpackages of approximately the same file size.

output_dir: str

Path to output directory where the two model chunks / pipeline model would be saved.

If the model is {path}/{model_name}.mlpackage, the chunk models are saved as:

1. First chunk model: {output_dir}/{model_name}_chunk1.mlpackage
2. Second chunk model: {output_dir}/{model_name}_chunk2.mlpackage
3. Chunked pipeline model: {output_dir}/{model_name}_chunked_pipeline.mlpackage

If the model is of type MLModel, the chunk models are saved as:

1. First chunk model: {output_dir}/chunk1.mlpackage
2. Second chunk model: {output_dir}/chunk2.mlpackage
3. Chunked pipeline model: {output_dir}/chunked_pipeline.mlpackage

merge_chunks_to_pipeline: bool

If True, model chunks are managed inside a single pipeline model for easier asset maintenance.

check_output_correctness: bool

If True, compares the outputs of the chunked models with those of the original model to verify correctness.

Examples

import coremltools as ct

model_path = "my_model.mlpackage" output_dir = "./output/"

The following code will produce two smaller models:

./output/my_model_chunk1.mlpackage and ./output/my_model_chunk2.mlpackage

It also compares the numerical outputs of the original Core ML model with those of the chunked models.

ct.models.utils.bisect_model(
    model_path,
    output_dir,
)

The following code will produce a single pipeline model ./output/my_model_chunked_pipeline.mlpackage

ct.models.utils.bisect_model(
    model_path,
    output_dir,
    merge_chunks_to_pipeline=True,
)

You can also pass the MLModel object directly

mlmodel = ct.models.MLModel(model_path)
ct.models.utils.bisect_model(
    mlmodel,
    output_dir,
    merge_chunks_to_pipeline=True,
)

coremltools.models.utils.compile_model(model: Model, destination_path: str | None = None) → str[source]

Compiles a Core ML model spec.

Parameters:

model: Model_pb2

Spec/protobuf to compile.

Note: an mlprogram that uses a blob file is not supported.

destination_path: str

Path where the compiled model will be saved.

Returns:

str: Path to the compiled model directory.

If destination_path is specified, that is the value that will be returned.

Examples

from coremltools.models import CompiledMLModel
from coremltools.models.utils import compile_model
from coremltools.proto import Model_pb2

spec = Model_pb2.Model()
spec.specificationVersion = 1

input_ = spec.description.input.add()
input_.name = "x"
input_.type.doubleType.MergeFromString(b"")

output_ = spec.description.output.add()
output_.name = "y"
output_.type.doubleType.MergeFromString(b"")
spec.description.predictedFeatureName = "y"

lr = spec.glmRegressor
lr.offset.append(0.1)
weights = lr.weights.add()
weights.value.append(2.0)

compiled_model_path = compile_model(spec)
model = CompiledMLModel(compiled_model_path)
y = model.predict({"x": 2})

coremltools.models.utils.convert_double_to_float_multiarray_type(spec)[source]

Convert all double multiarray feature descriptions (input, output, and training input) to float multiarrays.

Parameters:

spec: Model_pb

The specification containing the multiarrays types to convert.

Examples

Convert the multiarray types of a spec in place

spec = mlmodel.get_spec()
coremltools.utils.convert_double_to_float_multiarray_type(spec)
model = coremltools.models.MLModel(spec)

coremltools.models.utils.evaluate_classifier(model, data, target='target', verbose=False)[source]

Evaluate a Core ML classifier model and compare against predictions from the original framework (for testing correctness of conversion). Use this evaluation for models that don’t deal with probabilities.

Parameters:

model: str or MLModel

File to load the model from, or a loaded version of the MLModel.

data: str or Dataframe

Test data on which to evaluate the models (dataframe, or path to a CSV file).

target: str

Column to interpret as the target column.

verbose: bool

Set to true for more verbose output.

Examples

metrics = coremltools.utils.evaluate_classifier(
    spec, "data_and_predictions.csv", "target"
)
print(metrics)
# {"samples": 10, "num_errors": 0}

coremltools.models.utils.evaluate_classifier_with_probabilities(model, data, probabilities='probabilities', verbose=False)[source]

Evaluate a classifier specification for testing.

Parameters:

model: str or MLModel

File to load the model from, or a loaded version of the MLModel.

data: [str | Dataframe]

Test data on which to evaluate the models (dataframe, or path to a CSV file).

probabilities: str

Column to interpret as the probabilities column.

verbose: bool

Controls the verbosity of the output.

coremltools.models.utils.evaluate_regressor(model, data, target='target', verbose=False)[source]

Evaluate a Core ML regression model and compare against predictions from the original framework (for testing correctness of conversion).

Parameters:

model: MLModel or str

A loaded MLModel or a path to a saved MLModel.

data: Dataframe

Test data on which to evaluate the models.

target: str

Name of the column in the dataframe to be compared against the prediction.

verbose: bool

Set to true for a more verbose output.

Examples

metrics = coremltools.utils.evaluate_regressor(
    spec, "data_and_predictions.csv", "target"
)
print(metrics)
# {"samples": 10, "rmse": 0.0, "max_error": 0.0}

coremltools.models.utils.evaluate_transformer(model, input_data, reference_output, verbose=False)[source]

Evaluate a transformer specification for testing.

Parameters:

model: str or MLModel

Path to load the model from, or a loaded MLModel.

input_data: list of dict

Test data on which to evaluate the models.

reference_output: list of dict

Expected results for the model.

verbose: bool

Controls the verbosity of the output.

Examples

input_data = [{"input_1": 1, "input_2": 2}, {"input_1": 3, "input_2": 3}] expected_output = [{"input_1": 2.5, "input_2": 2.0}, {"input_1": 1.3, "input_2": 2.3}] metrics = coremltools.utils.evaluate_transformer( scaler_spec, input_data, expected_output )

coremltools.models.utils.load_spec(model_path: str) → Model_pb2[source]

Load a protobuf model specification from file (mlmodel) or directory (mlpackage).

Parameters:

model_path: Path to the model from which the protobuf spec is loaded.

Returns:

model_spec: Model_pb

Protobuf representation of the model.

Examples

spec = coremltools.utils.load_spec("HousePricer.mlmodel")
spec = coremltools.utils.load_spec("HousePricer.mlpackage")

coremltools.models.utils.make_pipeline(*models: MLModel, compute_units: None | ComputeUnit = None) → MLModel[source]

Makes a pipeline with the given models.

Parameters:

*models

Two or more instances of ct.models.MLModel.

compute_units

The set of processing units that all models in the pipeline can use to make predictions. Can be None or coremltools.ComputeUnit.

Returns:

ct.models.MLModel

Examples

my_model1 = ct.models.MLModel("/tmp/m1.mlpackage")
my_model2 = ct.models.MLModel("/tmp/m2.mlmodel")

my_pipeline_model = ct.utils.make_pipeline(my_model1, my_model2)

y = my_pipeline_model.predict({"x": 12})

my_pipeline_model.save("/tmp/my_pipeline.mlpackage")
new_my_pipeline = ct.models.MLModel("/tmp/my_pipeline.mlpackage")

coremltools.models.utils.materialize_dynamic_shape_mlmodel(dynamic_shape_mlmodel: MLModel, function_name_to_materialization_map: Dict[str, Dict[str, Tuple[int]]], destination_path: str, source_function_name: str = 'main') → None[source]

Given a dynamic-shape mlmodel, materialize its symbols to create fixed-shape functions, then save the result as an .mlpackage to the destination path. To save memory, the pymil program of the input dynamic-shape mlmodel is reused. Constant deduplication across functions is performed to allow weight sharing.

Parameters:

dynamic_shape_mlmodel: ct.models.MLModel

A dynamic-shape mlmodel to be materialized.

function_name_to_materialization_map: Dict[str, Dict[str, Tuple[int]]]

A dictionary specifying the names of the new functions to be created and, for each new function, the fixed input shapes. If a new function has the same name as an existing function, the existing function will be overridden.

destination_path: str

The path where the materialized .mlpackage model is saved.

source_function_name: str

The name of the source symbolic-shape function to be materialized (default: main).

See also

coremltools.converters.mil.mil.passes.defs.experiment.materialize_symbolic_shape_program

Examples

from coremltools.utils import materialize_dynamic_shape_mlmodel

A dynamic-shape mlmodel you have converted

dynamic_shape_mlmodel: ct.models.MLModel

As an example, let us assume the inputs are

1. input_ids (1, query_length)

2. mask (query_length, context_length)

function_name_to_materialization_map = {
    "materialization_2_3": {"input_ids": (1, 2), "mask": (2, 3)},
    "materialization_4_5": {"input_ids": (1, 4), "mask": (4, 5)},
}

materialize_dynamic_shape_mlmodel(
    dynamic_shape_mlmodel,
    function_name_to_materialization_map,
    "materialized_model.mlpackage",
)

To make prediction from the materialized mlmodel, load the desired materialized function

materialization_2_3 = ct.models.MLModel(
    "materialized_model.mlpackage", function_name="materialization_2_3"
)
materialization_4_5 = ct.models.MLModel(
    "materialized_model.mlpackage", function_name="materialization_4_5"
)

coremltools.models.utils.randomize_weights(mlmodel: MLModel)[source]

Utility function to randomize the weights of an MLModel.

Parameters:

mlmodel: MLModel

Model which will be randomized.

Returns:

model: MLModel

The MLModel with randomized weights.

Examples

import coremltools as ct

model = ct.models.MLModel("my_model.mlpackage") randomized_mlmodel = ct.models.utils.randomize_weights(mlmodel)

coremltools.models.utils.rename_feature(spec, current_name, new_name, rename_inputs=True, rename_outputs=True)[source]

Rename a feature in the specification.

Parameters:

spec: Model_pb

The specification containing the feature to rename.

current_name: str

Current name of the feature. If this feature doesn’t exist, the rename is a no-op.

new_name: str

New name of the feature.

rename_inputs: bool

Search for current_name only in the input features (that is, ignore output features).

rename_outputs: bool

Search for current_name only in the output features (that is, ignore input features).

Examples

In-place rename of spec

model = MLModel("model.mlmodel") spec = model.get_spec() coremltools.utils.rename_feature(spec, "old_feature", "new_feature_name")

re-initialize model

model = MLModel(spec)
model.save("model.mlmodel")

Rename a feature when the model is an mlprogram; in that case, the weights are stored outside of the spec

model = coremltools.convert(torch_model, convert_to="mlprogram")
spec = model.get_spec()

print info about inputs and outputs

print(spec.description)
coremltools.utils.rename_feature(spec, "old_feature", "new_feature_name")

re-initialize model

model = MLModel(spec, weights_dir=model.weights_dir)
model.save("model.mlpackage")

coremltools.models.utils.save_multifunction(desc: MultiFunctionDescriptor, destination_path: str)[source]

Save a MultiFunctionDescriptor instance into a multifunction mlpackage. This function also performs constant deduplication across functions to allow for weight sharing.

Parameters:

desc: MultiFunctionDescriptor

Multifunction descriptor to save to disk.

destination_path: str

The path where the new mlpackage will be saved.

Examples

from coremltools.utils import MultiFunctionDescriptor, save_multifunction

desc = MultiFunctionDescriptor("my_model_1.mlpackage") desc.add_function("my_model_2.mlpackage", "main", "main_2") desc.default_function_name = "main_2"

save_multifunction(desc, "multifunction_model.mlpackage")

coremltools.models.utils.save_spec(spec, filename, auto_set_specification_version=False, weights_dir=None)[source]

Save a protobuf model specification to file.

Parameters:

spec: Model_pb

Protobuf representation of the model.

filename: str

File path where the spec is saved.

auto_set_specification_version: bool

If True, always try to set the specification version automatically.

weights_dir: str

Path to the directory containing the weights.bin file. This is required when the spec has model type mlprogram. If the mlprogram does not contain any weights, this path can be an empty directory.

Examples

coremltools.utils.save_spec(spec, "HousePricer.mlmodel")
coremltools.utils.save_spec(spec, "HousePricer.mlpackage")
coremltools.utils.save_spec(
    spec, "mlprogram_model.mlpackage", weights_dir="/path/to/weights/directory"
)