TensorFlow 1.x (tensorflow-neuron) Compilation API — AWS Neuron Documentation

This document is relevant for: Inf1

TensorFlow 1.x (tensorflow-neuron) Compilation API#

The Neuron compilation API for TensorFlow 1.x enables compilation of a SavedModel to run on an Inferentia target.

Method#

tensorflow.neuron.saved_model.compile

Description#

Within the graph or subgraph, the compile method selects Neuron-supported operations, sends them to the Neuron compiler for compilation, and saves the compiled artifacts in the graph. Operations that cannot be compiled are kept as original operations for framework execution.

The compiled graph can be exported as a SavedModel and served using TensorFlow Serving. See tensorflow-serving for more information about exporting a SavedModel and serving it with TensorFlow Serving.
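For illustration, a compiled SavedModel could then be served with a standard TensorFlow Serving invocation along these lines (the model name and path below are placeholders, not taken from this page):

```shell
# Hedged sketch: serve the compiled SavedModel over REST with TensorFlow Serving.
# "my_model" and the base path are illustrative placeholders.
tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=my_model \
  --model_base_path=/path/to/compiled_saved_model
```

The compiled model is a regular SavedModel, so the serving command is unchanged from a non-Neuron deployment; the Neuron runtime is picked up on the Inf1 instance at load time.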

Options can be passed to the Neuron compiler via the compile function. For example, the “--neuroncore-pipeline-cores” option directs the Neuron compiler to compile each subgraph to fit in the specified number of NeuronCores. This number can be less than the total number of NeuronCores available on an Inf1 instance. See Neuron compiler CLI Reference Guide (neuron-cc) for more information about compiler options.
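As a sketch of how such an option might be passed through the compile call (the `compiler_args` parameter name and the placeholder paths are assumptions for illustration, not confirmed by this page):

```python
# Hedged sketch: forwarding a Neuron compiler option through compile().
# The option value "4" is illustrative; it must not exceed the NeuronCores
# available on the target Inf1 instance.
compiler_args = ["--neuroncore-pipeline-cores", "4"]

# The actual compilation needs tensorflow-neuron (and neuron-cc) installed;
# the guarded import keeps this sketch runnable elsewhere.
try:
    import tensorflow.neuron as tfn

    tfn.saved_model.compile(
        "original_saved_model",   # placeholder: input SavedModel directory
        "compiled_saved_model",   # placeholder: output directory
        compiler_args=compiler_args,
    )
except ImportError:
    pass  # tensorflow-neuron is only available in a Neuron environment
```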

Arguments#

Returns#

```
INFO:tensorflow:Number of operations in TensorFlow session: 3978
INFO:tensorflow:Number of operations after tf.neuron optimizations: 555
INFO:tensorflow:Number of operations placed on Neuron runtime: 554
```

Example Usage#

```python
import shutil
import tensorflow.neuron as tfn

saved_model_path = ""
compiled_saved_model_path = ""
shutil.rmtree(compiled_saved_model_path, ignore_errors=True)
tfn.saved_model.compile(saved_model_path, compiled_saved_model_path)
```
