Unified (TensorFlow and PyTorch) — coremltools API Reference 8.0b1 documentation
coremltools.converters._converters_entry.convert(model, source='auto', inputs=None, outputs=None, classifier_config=None, minimum_deployment_target=None, convert_to=None, compute_precision=None, skip_model_load=False, compute_units=ComputeUnit.ALL, package_dir=None, debug=False, pass_pipeline: PassPipeline | None = None, states=None)[source]
Convert a TensorFlow or PyTorch model to the Core ML model format as either a neural network or an ML program. Some parameters and requirements differ for TensorFlow and PyTorch conversions.
Parameters:
model
TensorFlow 1, TensorFlow 2, or PyTorch model in one of the following formats:
- TensorFlow versions 1.x
  - Frozen `tf.Graph`
  - Frozen graph (`.pb`) file path
  - `tf.keras.Model`
  - HDF5 file path (`.h5`)
  - SavedModel directory path
- TensorFlow versions 2.x
  - `tf.keras.Model`
  - HDF5 file path (`.h5`)
  - SavedModel directory path
  - A concrete function
  - A GraphDef
- PyTorch
  - TorchScript Models:
    - A TorchScript object
    - Path to a `.pt` file
  - Torch Exported Models:
    - An `ExportedProgram` object with `EDGE` dialect.
source: str (optional)
One of [`auto`, `tensorflow`, `pytorch`, `milinternal`]. `auto` determines the framework automatically for most cases. Raises `ValueError` if it fails to determine the source framework.
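A minimal sketch of forcing the PyTorch frontend instead of relying on auto-detection (assumes `traced_model` is a traced TorchScript model, as in the examples at the end of this page):

import coremltools as ct
# Skip auto-detection and use the PyTorch frontend directly.
mlmodel = ct.convert(
    traced_model,
    source="pytorch",
    inputs=[ct.TensorType(shape=(1, 3, 256, 256))],
)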
inputs: list of `TensorType` or `ImageType`
- If you specify `dtype` with `TensorType` or `ImageType`, it will be applied to the input of the converted model. For example, the following code snippet will produce a Core ML model with float 16 typed inputs.

  import coremltools as ct
  mlmodel = ct.convert(
      keras_model,
      inputs=[ct.TensorType(dtype=np.float16)],
      minimum_deployment_target=ct.target.macOS13,
  )

- The following code snippet will produce a Core ML model with the `GRAYSCALE_FLOAT16` input image type:

  import coremltools as ct
  # H: image height, W: image width
  mlmodel = ct.convert(
      torch_model,
      inputs=[
          ct.ImageType(shape=(1, 1, H, W), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)
      ],
      minimum_deployment_target=ct.target.macOS13,
  )
- TensorFlow 1 and 2 (including tf.keras):
  - The `inputs` parameter is optional. If not provided, the inputs are placeholder nodes in the model (if the model is a frozen graph) or function inputs (if the model is a `tf.function`).
  - If `inputs` is provided, it must be a flat list.
  - The `inputs` must correspond to all or some of the placeholder nodes in the TF model.
  - If `name` is specified with `TensorType` and `ImageType`, it must correspond to a placeholder op in the TF graph. The input names in the converted Core ML model can later be modified using the `ct.utils.rename_feature` API (see the sketch after this list).
  - If `dtype` is not specified, it defaults to the `dtype` of the inputs in the TF model.
  - For `minimum_deployment_target >= ct.target.macOS13`, and with `compute_precision` in float 16 precision: when `inputs` is not provided or `dtype` is not specified, float 32 inputs default to float 16.
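  As referenced above, a minimal sketch of renaming an input after conversion with `ct.utils.rename_feature` (the name `"input_1"` here is illustrative; use the input name reported by your converted model):

  import coremltools as ct
  mlmodel = ct.convert(keras_model)
  spec = mlmodel.get_spec()
  # Rename the input "input_1" (illustrative) to "image" in the model spec.
  ct.utils.rename_feature(spec, "input_1", "image")
  # Rebuild the MLModel from the edited spec (weights_dir is needed for ML programs).
  mlmodel = ct.models.MLModel(spec, weights_dir=mlmodel.weights_dir)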
- PyTorch:
  - TorchScript Models:
    - The `inputs` parameter is required.
    - Number of elements in `inputs` must match the number of inputs of the PyTorch model.
    - `inputs` may be a nested list or tuple.
    - `TensorType` and `ImageType` must have the `shape` specified.
    - If the `name` argument is specified with `TensorType` or `ImageType`, the converted Core ML model will have inputs with the same name (see the sketch after this list).
    - If `dtype` is missing:
      - For `minimum_deployment_target <= ct.target.macOS12`, it defaults to float 32.
      - For `minimum_deployment_target >= ct.target.macOS13`, and with `compute_precision` in float 16 precision, it defaults to float 16.
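    A minimal sketch of fully specified TorchScript inputs (the names, shapes, and the assumption of two inputs are illustrative; `traced_model` is a traced or scripted model):

    import numpy as np
    import coremltools as ct
    mlmodel = ct.convert(
        traced_model,
        inputs=[
            ct.TensorType(name="x", shape=(1, 3, 224, 224), dtype=np.float32),
            ct.TensorType(name="mask", shape=(1, 224, 224), dtype=np.float32),
        ],
    )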
  - Torch Exported Models:
    - The `inputs` parameter is not supported.
    - The `inputs` parameter is inferred from the Torch ExportedProgram.
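A minimal sketch of converting a Torch exported model (assumes a recent PyTorch with `torch.export`; note that the description above expects the program in EDGE dialect, so depending on your coremltools and PyTorch versions you may need to lower the ATen-dialect program, for example with ExecuTorch, before conversion):

import torch
import torchvision
import coremltools as ct

model = torchvision.models.mobilenet_v2().eval()
example_args = (torch.rand(1, 3, 224, 224),)
# torch.export produces an ExportedProgram; no `inputs` argument is passed to
# ct.convert, since shapes and dtypes are read from the ExportedProgram itself.
exported_program = torch.export.export(model, example_args)
mlmodel = ct.convert(exported_program)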
outputs: list of `TensorType` or `ImageType` (optional)
- If you specify `dtype` with `TensorType` or `ImageType`, it will be applied to the output of the converted model. For example, to produce float 16 typed inputs and outputs:

  import coremltools as ct
  mlmodel = ct.convert(
      keras_model,
      inputs=[ct.TensorType(dtype=np.float16)],
      outputs=[ct.TensorType(dtype=np.float16)],
      minimum_deployment_target=ct.target.macOS13,
  )

- To produce image inputs and outputs:

  import coremltools as ct
  # H: image height, W: image width
  mlmodel = ct.convert(
      torch_model,
      inputs=[ct.ImageType(shape=(1, 3, H, W), color_layout=ct.colorlayout.RGB)],
      outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)],
      minimum_deployment_target=ct.target.macOS13,
  )
- TensorFlow 1 and 2 (including tf.keras):
  - If `outputs` is not specified, the converter infers outputs from the sink nodes in the graph.
  - If specified, the `name` with `TensorType` or `ImageType` must correspond to a node in the TF graph. In this case, the model will be converted up to that node (see the sketch after this list).
  - For `minimum_deployment_target >= ct.target.macOS13`, and with `compute_precision` in float 16 precision: if `dtype` is not specified, outputs inferred as float 32 default to float 16.
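  A minimal sketch of converting a TF model only up to a named intermediate node (the node name `"dense/Relu"` and the `tf_model` variable are illustrative):

  import coremltools as ct
  # Conversion stops at the named node, which becomes the model output.
  mlmodel = ct.convert(
      tf_model,
      outputs=[ct.TensorType(name="dense/Relu")],
  )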
- PyTorch: TorchScript Models
  - If specified, the length of the list must match the number of outputs returned by the PyTorch model.
  - If `name` is specified, it is applied to the output names of the converted Core ML model (see the sketch after this list).
  - For `minimum_deployment_target >= ct.target.macOS13`, and with `compute_precision` in float 16 precision: if `dtype` is not specified, outputs inferred as float 32 default to float 16.
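  A minimal sketch that names the converted model's output (the name `"logits"` and the input shape are illustrative):

  import coremltools as ct
  mlmodel = ct.convert(
      traced_model,
      inputs=[ct.TensorType(shape=(1, 3, 224, 224))],
      outputs=[ct.TensorType(name="logits")],
  )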
- PyTorch: Torch Exported Models:
  - The `outputs` parameter is not supported.
  - The `outputs` parameter is inferred from the Torch ExportedProgram.
classifier_config: ClassifierConfig class (optional)
The configuration if the MLModel is intended to be a classifier.
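A minimal sketch of converting an image classifier (the labels and input shape are illustrative; `class_labels` may also be a path to a text file with one label per line):

import coremltools as ct
config = ct.ClassifierConfig(class_labels=["cat", "dog", "bird"])
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(shape=(1, 3, 224, 224))],
    classifier_config=config,
)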
minimum_deployment_target: coremltools.target enumeration (optional)
A member of the `coremltools.target` enum. The value of this parameter determines the type of the model representation produced by the converter. To learn about the differences between ML programs and neural networks, see ML Programs.
- The converter produces a neural network (`neuralnetwork`) if:
  `minimum_deployment_target <= coremltools.target.iOS14` / `coremltools.target.macOS11` / `coremltools.target.watchOS7` / `coremltools.target.tvOS14`
- The converter produces an ML program (`mlprogram`) if:
  `minimum_deployment_target >= coremltools.target.iOS15` / `coremltools.target.macOS12` / `coremltools.target.watchOS8` / `coremltools.target.tvOS15`
- If neither the `minimum_deployment_target` nor the `convert_to` parameter is specified, the converter produces an ML program model type with the minimum possible deployment target.
- If this parameter is specified and `convert_to` is also specified, they must be compatible. The following are examples of invalid values (see the sketch after this list for a valid pairing):
  - Invalid: `convert_to="mlprogram", minimum_deployment_target=coremltools.target.iOS14`
  - Invalid: `convert_to="neuralnetwork", minimum_deployment_target=coremltools.target.iOS15`
convert_to: str (optional)
Must be one of [`'mlprogram'`, `'neuralnetwork'`, `'milinternal'`]. The value of this parameter determines the type of the model representation produced by the converter. To learn about the differences between ML programs and neural networks, see ML Programs.
- `'mlprogram'`: Returns an MLModel (`coremltools.models.MLModel`) containing a MILSpec.Program proto, which is the Core ML program format. The model saved from this returned object is executable on iOS15, macOS12, watchOS8, and tvOS15.
- `'neuralnetwork'`: Returns an MLModel (`coremltools.models.MLModel`) containing a NeuralNetwork proto, which is the original Core ML format. The model saved from this returned object is executable either on iOS13/macOS10.15/watchOS6/tvOS13 and newer, or on iOS14/macOS11/watchOS7/tvOS14 and newer, depending on the layers used in the model.
- `'milinternal'`: Returns an MIL program object (`coremltools.converters.mil.Program`). An MIL program is primarily used for debugging and inspection. It can be converted to an MLModel for execution by using one of the following:
  `ct.convert(mil_program, convert_to="neuralnetwork")`
  `ct.convert(mil_program, convert_to="mlprogram")`
- If neither the `minimum_deployment_target` nor the `convert_to` parameter is specified, the converter produces the ML program model type with the minimum possible deployment target.
compute_precision: coremltools.precision enumeration or ct.transform.FP16ComputePrecision() (optional)
Use this argument to control the storage precision of the tensors in the ML program. Must be one of the following.
- `coremltools.precision.FLOAT16` enum: The following transform is applied to produce a float 16 program; that is, a program in which all the intermediate float tensors are of type float 16 (for ops that support that type).

  coremltools.transform.FP16ComputePrecision(op_selector=lambda op: True)

  The above transform iterates through all the ops, looking at each op's inputs and outputs. If they are of type float 32, `cast` ops are injected to convert those tensors (also known as vars) to type float 16.
- `coremltools.precision.FLOAT32` enum: No transform is applied.
  The original float32 tensor dtype in the source model is preserved. Opt into this option if the default converted model is displaying numerical precision issues.
- `coremltools.transform.FP16ComputePrecision(op_selector=...)`
  Use this option to control which tensors are cast to float 16. Before casting the inputs/outputs of any op from float32 to float 16, the `op_selector` function is invoked on the op object. This function must return a boolean value. By default it returns `True` for every op, but you can customize this.
  For example:

  coremltools.transform.FP16ComputePrecision(op_selector=lambda op: op.op_type != "linear")

  The above casts all the float32 tensors to be float 16, except the input/output tensors to any `linear` op. See more examples below.
- `None`: The default
  - When `convert_to="mlprogram"`, the `compute_precision` parameter defaults to `coremltools.precision.FLOAT16`.
  - When `convert_to="neuralnetwork"`, the `compute_precision` parameter needs to be `None` and has no meaning.
  - For example, you can customize the float 16 precision transform to prevent casting all the `real_div` ops in the program to float 16 precision:

    def skip_real_div_ops(op):
        if op.op_type == "real_div":
            return False
        return True


    model = ct.convert(
        source_model,
        compute_precision=ct.transform.FP16ComputePrecision(op_selector=skip_real_div_ops),
        minimum_deployment_target=ct.target.iOS15,
    )
skip_model_load: bool
Set to `True` to prevent coremltools from calling into the Core ML framework to compile and load the model post-conversion. In that case, the returned model object cannot be used to make a prediction, but can be used to save with `model.save()`. This flag may be used to convert to a newer model type on an older Mac, which may raise a runtime warning if done without turning this flag on.
Example: Use this flag to suppress a runtime warning when converting to an ML program model on macOS 11, since an ML program can only be compiled and loaded from macOS 12+.
Defaults to `False`.
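A minimal sketch: convert without loading the model into the Core ML framework, then save the package to disk (the output path is illustrative):

import coremltools as ct
mlmodel = ct.convert(
    source_model,
    convert_to="mlprogram",
    skip_model_load=True,
)
# The returned object cannot run predict(), but it can be saved.
mlmodel.save("model.mlpackage")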
compute_units: coremltools.ComputeUnit
The set of processing units the model can use to make predictions. After conversion, the model is loaded with the provided set of compute units and returned.
An enum with the following possible values:
- `coremltools.ComputeUnit.ALL`: Use all compute units available, including the neural engine.
- `coremltools.ComputeUnit.CPU_ONLY`: Limit the model to only use the CPU.
- `coremltools.ComputeUnit.CPU_AND_GPU`: Use both the CPU and GPU, but not the neural engine.
- `coremltools.ComputeUnit.CPU_AND_NE`: Use both the CPU and neural engine, but not the GPU. Available only for macOS >= 13.0.
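A minimal sketch restricting the loaded model to the CPU (useful, for example, when comparing numerical results against the source framework):

import coremltools as ct
mlmodel = ct.convert(
    source_model,
    compute_units=ct.ComputeUnit.CPU_ONLY,
)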
package_dir: str
Post conversion, the model is saved at a temporary location and loaded to form the MLModel object ready for prediction.
- If `package_dir` is provided, the model will be saved at this location rather than creating a temporary directory.
- If not `None`, this must be a path to a directory with the extension `.mlpackage`.
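A minimal sketch saving the converted package directly at a chosen path (the path is illustrative):

import coremltools as ct
mlmodel = ct.convert(
    source_model,
    package_dir="./MyModel.mlpackage",
)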
debug: bool
This flag should generally be `False` except for debugging purposes. Setting this flag to `True` produces the following behavior:
- For Torch conversion, it will print the list of supported and unsupported ops found in the model if conversion fails due to an unsupported op.
- For TensorFlow conversion, it will display extra logging and visualizations.
pass_pipeline: PassPipeline
Manage graph passes. You can control which graph passes to run and the order of the graph passes. You can also specify options for each pass. See the details in the docstring of PassPipeline (`coremltools/converters/mil/mil/passes/pass_pipeline.py`).
- To avoid fusing the `conv` and `batchnorm` ops, skip the corresponding pass as shown in the following example:

  pipeline = ct.PassPipeline()
  pipeline.remove_passes({"common::fuse_conv_batchnorm"})
  mlmodel = ct.convert(model, pass_pipeline=pipeline)

- To avoid folding too-large `const` ops that lead to a large model, set the pass option as shown in the following example:

  pipeline = ct.PassPipeline()
  pipeline.set_options("common::const_elimination", {"skip_const_by_size": "1e6"})
  mlmodel = ct.convert(model, pass_pipeline=pipeline)

We also provide a set of predefined pass pipelines that you can directly call.
- To avoid running all graph passes, you can use:
  mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.EMPTY)
- To only run the cleanup graph passes, like const_elimination, dead_code_elimination, etc., you can use:
  mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.CLEANUP)
- To convert a source model with sparse weights to a sparse format Core ML model, you can use:
  mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING)
- To convert a source model with palettized weights to a compressed format Core ML model, you can use:
  mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.DEFAULT_PALETTIZATION)
states: list of StateType (optional)
Create a stateful `mlprogram` model by providing the `StateType` in the `states` argument (for details see MIL Input Types). The stateful model is useful when converting a large language model with KV-Cache. The name of `StateType` must match the key of the PyTorch `named_buffers()` method in the source traced model.
The following example converts a torch model with a buffer called `state_1`.

import numpy as np
import torch
import coremltools as ct


class UpdateBufferModel(torch.nn.Module):
    def __init__(self):
        super(UpdateBufferModel, self).__init__()
        self.register_buffer(
            "state_1", torch.tensor(np.array([0, 0, 0], dtype=np.float32))
        )

    def forward(self, x):
        # In place update of the model state
        self.state_1.add_(x)
        return self.state_1


model = UpdateBufferModel()
traced_model = torch.jit.trace(model, torch.tensor([1, 2, 3], dtype=torch.float32))

# The input and state shapes match the registered buffer, which has shape (3,).
inputs = [
    ct.TensorType(shape=(3,)),
]
states = [
    ct.StateType(
        wrapped_type=ct.TensorType(
            shape=(3,),
        ),
        name="state_1",
    ),
]
mlmodel = ct.convert(
    traced_model,
    inputs=inputs,
    states=states,
    minimum_deployment_target=ct.target.iOS18,
)
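Continuing the example above, a sketch of running the stateful model (assumes a macOS 15 / iOS 18 runtime; `make_state` creates the state buffers that `predict` reads and updates in place):

state = mlmodel.make_state()
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
# The input name "x" is assumed from the forward signature; check
# mlmodel.input_description if your converted model uses a different name.
print(mlmodel.predict({"x": x}, state=state))  # state_1 becomes [1, 2, 3]
print(mlmodel.predict({"x": x}, state=state))  # state_1 becomes [2, 4, 6]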
Returns:
model: coremltools.models.MLModel or coremltools.converters.mil.Program
A Core ML MLModel object or MIL program object (see `convert_to`).
Examples
TensorFlow 1, 2 (`model` is a frozen graph):

with tf.Graph().as_default() as graph:
    x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input")
    y = tf.nn.relu(x, name="output")

Automatically infer inputs and outputs:

mlmodel = ct.convert(graph)
test_input = np.random.rand(1, 2, 3) - 0.5
results = mlmodel.predict({"input": test_input})
print(results["output"])
TensorFlow 2 (`model` is a tf.Keras model path):

x = tf.keras.Input(shape=(32,), name="input")
y = tf.keras.layers.Dense(16, activation="softmax")(x)
keras_model = tf.keras.Model(x, y)
keras_model.save(h5_path)
mlmodel = ct.convert(h5_path)

test_input = np.random.rand(2, 32)
results = mlmodel.predict({"input": test_input})
print(results["Identity"])
PyTorch:
TorchScript Models:

model = torchvision.models.mobilenet_v2()
model.eval()
example_input = torch.rand(1, 3, 256, 256)
traced_model = torch.jit.trace(model, example_input)

input = ct.TensorType(name="input_name", shape=(1, 3, 256, 256))
mlmodel = ct.convert(traced_model, inputs=[input])
results = mlmodel.predict({"input_name": example_input.numpy()})
print(results["1651"])  # 1651 is the node name given by PyTorch's JIT
For more options see Conversion Options.