
Converting a TensorFlow 1 Image Classifier#

The following example converts the TensorFlow Inception V1 image classifier to a Core ML classifier model that directly predicts the class label of the input image. It demonstrates the importance of setting the image preprocessing parameters correctly to get the right results.

Requirements#

This model requires TensorFlow 1, which is deprecated and can be difficult to install directly with pip. Instead, use the appropriate Miniconda installer for your operating system to create a Miniconda environment with Python 3.7, and then use conda to install TensorFlow 1.15:

conda create -n tensorflow1-env python=3.7
conda activate tensorflow1-env
conda install tensorflow==1.15

In addition, you need to install the following for this environment:

pip install -U coremltools
pip install pillow
conda install requests
conda install matplotlib
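
To confirm the environment is set up correctly before continuing, you can run a quick version check. This snippet is not part of the original guide; the comments note what you should expect to see:

import tensorflow as tf
import coremltools as ct

print('tensorflow :', tf.__version__)    # expect a 1.15.x version
print('coremltools:', ct.__version__)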

Download the Model#

The following code downloads the Inception V1 frozen TF graph (the .pb file):

Download the model and class label package

from __future__ import print_function
import os, sys
import tarfile

def download_file_and_unzip(url, dir_path='.'):
    """Download the frozen TensorFlow model and unzip it.
    url - The URL address of the frozen file
    dir_path - local directory
    """
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    k = url.rfind('/')
    fname = url[k+1:]
    fpath = os.path.join(dir_path, fname)

    if not os.path.exists(fpath):
        if sys.version_info[0] < 3:
            import urllib
            urllib.urlretrieve(url, fpath)
        else:
            import urllib.request
            urllib.request.urlretrieve(url, fpath)

    tar = tarfile.open(fpath)
    tar.extractall(dir_path)
    tar.close()

inception_v1_url = 'https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz'
download_file_and_unzip(inception_v1_url)
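
As an optional check (not part of the original guide), you can confirm that the extracted files used in the rest of this example are present in the working directory:

import os

for fname in ['inception_v1_2016_08_28_frozen.pb', 'imagenet_slim_labels.txt']:
    print(fname, '->', 'found' if os.path.exists(fname) else 'missing')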

Load the Graph Definition#

The following code loads the TensorFlow graph to find the input and output tensor names. You use them in the conversion process and for running the graph for a numerical accuracy check:

Load the TF graph definition

import tensorflow as tf # 1.x

tf_model_path = './inception_v1_2016_08_28_frozen.pb'
with open(tf_model_path, 'rb') as f:
    serialized = f.read()
tf.reset_default_graph()
original_gdef = tf.GraphDef()
original_gdef.ParseFromString(serialized)

# Let's get some details about a few ops at the beginning and the end of the graph
with tf.Graph().as_default() as g:
    tf.import_graph_def(original_gdef, name='')
    ops = g.get_operations()
    N = len(ops)
    for i in [0, 1, 2, N-3, N-2, N-1]:
        print('\n\nop id {} : op type: "{}"'.format(str(i), ops[i].type))
        print('input(s):')
        for x in ops[i].inputs:
            print("name = {}, shape: {}".format(x.name, x.get_shape()))
        print('\noutput(s):')
        for x in ops[i].outputs:
            print("name = {}, shape: {}".format(x.name, x.get_shape()))

If you run the code at this point, the printed ops show that the graph's input is the output of the Placeholder op (input:0), and the graph's output is the output of the Softmax op near the end of the graph (InceptionV1/Logits/Predictions/Softmax:0).
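
If you prefer not to read the names off the printed op list, the following sketch (not part of the original guide) locates the Placeholder and Softmax ops in the imported graph g programmatically; the expected names are noted in the comments:

placeholders = [op for op in g.get_operations() if op.type == 'Placeholder']
softmax_ops = [op for op in g.get_operations() if op.type == 'Softmax']

print('input tensor :', placeholders[0].outputs[0].name)   # expect input:0
print('output tensor:', softmax_ops[-1].outputs[0].name)   # expect InceptionV1/Logits/Predictions/Softmax:0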

Convert to Core ML#

The following code defines the image input (image_inputs) and the output name ('InceptionV1/Logits/Predictions/Softmax'), and passes them to the convert() method as inputs and outputs. The convert() method produces an ML program by default:

import coremltools as ct

image_inputs = ct.ImageType(shape=(1, 224, 224, 3))
classifier_config = ct.ClassifierConfig('imagenet_slim_labels.txt')
coreml_model_file = './inception_v1.mlpackage'
output = ['InceptionV1/Logits/Predictions/Softmax']

coreml_model = ct.convert(tf_model_path,
                          inputs=[image_inputs],
                          classifier_config=classifier_config,
                          outputs=output)

coreml_model.save(coreml_model_file)

The result shows the progress of the conversion, but also includes the following warning:

UserWarning: Output, 'InceptionV1/Logits/Predictions/Softmax', of the source model, has been renamed to 'InceptionV1_Logits_Predictions_Softmax' in the Core ML model.

You will use the new name when making a prediction.
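
To verify the names yourself rather than relying on the warning text, you can inspect the converted model's specification (a quick check using the standard coremltools API; not part of the original guide):

spec = coreml_model.get_spec()
print('inputs :', [inp.name for inp in spec.description.input])
print('outputs:', [out.name for out in spec.description.output])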

Load a Test Image#

To make predictions on the same image using both the original model and the converted model, right-click the following image and save it as Golden_Retriever_Carlos.jpg in the same folder as your Python project:

[Image: Golden_Retriever_Carlos.jpg]

This image of a golden retriever is from Wikipedia.

The following code loads the image:

Load an image

import numpy as np
from PIL import Image

img = Image.open("Golden_Retriever_Carlos.jpg")

Input the Image and Make a Prediction#

The following code resizes the PIL image and passes it directly to the Core ML model to make a prediction, and then converts the image to a NumPy array to get the corresponding TensorFlow prediction. It also uses the renamed output ('InceptionV1_Logits_Predictions_Softmax') when reading the Core ML result:

img = img.resize([224, 224], Image.LANCZOS)
coreml_inputs = {'input': img}

# Fix output name
output = ['InceptionV1_Logits_Predictions_Softmax']

coreml_output = coreml_model.predict(coreml_inputs)
coreml_pred_dict = coreml_output[output[0]]
coreml_predicted_class_label = coreml_output['classLabel']

# For the TF prediction, get the NumPy array of the image
img_np = np.array(img).astype(np.float32)
print('image shape:', img_np.shape)
print('first few values: ', img_np.flatten()[0:4], 'max value: ', np.amax(img_np))
img_tf = np.expand_dims(img_np, axis=0)  # now shape is [1,224,224,3] as required by TF

# Evaluate TF and get the highest label
tf_input_name = 'input:0'
tf_output_name = 'InceptionV1/Logits/Predictions/Softmax:0'

with tf.Session(graph=g) as sess:
    tf_out = sess.run(tf_output_name, feed_dict={tf_input_name: img_tf})

tf_out = tf_out.flatten()
idx = np.argmax(tf_out)
label_file = 'imagenet_slim_labels.txt'
with open(label_file) as f:
    labels = f.readlines()

# Print the TF prediction
print("TF prediction class = {}, probability = {}".format(labels[idx], str(tf_out[idx])))

# Print the Core ML prediction
print('\n')
print("CoreML prediction class = {}, probability = {}".format(coreml_predicted_class_label, str(coreml_pred_dict[0])))

The result shows that both predictions match, which confirms that the conversion is correct. However, for accurate results, the image must also be preprocessed correctly before it is passed to the model, as described in the next section.

Preprocess the Image Before Converting#

Preprocessing is always a crucial step when using ML programs and neural networks on images. The best approach is to find the source of the pre-trained model and check for the preprocessing that the model’s author used during training and evaluation.

In this case, the TensorFlow model comes from the SLIM library, and the preprocessing steps are defined in the preprocess_for_eval definition in inception_preprocessing.py (models/research/slim/preprocessing/inception_preprocessing.py). The image pixels have to be scaled to lie within the interval [-1, 1]. The following code preprocesses the image and makes a new prediction:

img_tf = (2.0/255.0) * img_tf - 1
with tf.Session(graph=g) as sess:
    tf_out = sess.run(tf_output_name, feed_dict={tf_input_name: img_tf})

tf_out = tf_out.flatten()
idx = np.argmax(tf_out)
print("TF prediction class = {}, probability = {}".format(labels[idx], str(tf_out[idx])))

The TensorFlow model now predicts English setter as the highest class, with a probability of 0.301507:

TF prediction class = English setter , probability = 0.301507

Core ML automatically handles the image preprocessing when the input is of type image, but the bias and scale were not set in the previous conversion. Core ML multiplies by the channel scale first and then adds the bias. The following code converts the model again with this correction, saves the newly converted model, and makes the prediction again with it:

image_inputs = ct.ImageType(shape=(1, 224, 224, 3), bias=[-1, -1, -1], scale=2.0/255)
classifier_config = ct.ClassifierConfig('imagenet_slim_labels.txt')
coreml_model_file = './inception_v1.mlpackage'
output = ['InceptionV1/Logits/Predictions/Softmax']

coreml_model = ct.convert(tf_model_path,
                          inputs=[image_inputs],
                          classifier_config=classifier_config,
                          outputs=output)

coreml_model.save(coreml_model_file)
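
As a quick sanity check (not part of the original guide), you can verify that the bias and scale passed above reproduce the SLIM scaling. Core ML computes scale * pixel + bias, so with scale = 2/255 and bias = -1 a pixel value of 0 maps to -1.0 and 255 maps to 1.0, matching (2.0/255.0) * x - 1:

scale, bias = 2.0 / 255, -1.0
print(scale * 0 + bias, scale * 255 + bias)   # -1.0 1.0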

Call CoreML predict again

# Fix output name
output = ['InceptionV1_Logits_Predictions_Softmax']

coreml_output = coreml_model.predict(coreml_inputs)
coreml_pred_dict = coreml_output[output[0]]
coreml_predicted_class_label = coreml_output['classLabel']
print("CoreML prediction class = {}, probability = {}".format(coreml_predicted_class_label, str(coreml_pred_dict[0])))

The Core ML model now also predicts English setter as the class label (the value printed as the probability is the raw softmax output array rather than a single class probability):

CoreML prediction class = English setter, probability = [1.68707207e-04 4.01963953e-05 2.33356332e-04 ... 1.15576135e-04 3.79885838e-04 2.21910377e-04]

Predictions Can Vary Slightly

Predictions with the default Core ML predict call may vary slightly, since by default it uses a lower-precision optimized path for faster execution. In previous versions of Core ML Tools, you would restrict execution to the CPU by specifying the useCPUOnly=True flag. This flag is now deprecated; instead, use the compute_units parameter at load time or conversion time (that is, in coremltools.models.MLModel or convert()). For more information, see Set the compute units.
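
For example, a minimal sketch (assuming the current coremltools API) that loads the saved model with CPU-only compute units and repeats the prediction:

import coremltools as ct

# Load the saved package, restricting execution to the CPU
model_cpu = ct.models.MLModel('./inception_v1.mlpackage',
                              compute_units=ct.ComputeUnit.CPU_ONLY)
coreml_output = model_cpu.predict({'input': img})
print(coreml_output['classLabel'])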