TensorFlow 2.x (tensorflow-neuron) Accelerated Python APIs and Graph Ops — AWS Neuron Documentation

This document is relevant for: Inf1

TensorFlow 2.x (tensorflow-neuron) Accelerated Python APIs and Graph Ops

This page lists TensorFlow 2.x Python APIs and graph operators that are accelerated by AWS Neuron. The lists are not exhaustive: a TensorFlow 2.x Python API or graph operator that is not listed here may still be accelerated if it is composed of accelerated primitives; otherwise it is executed on CPU without significant acceleration. The TensorFlow Neuron integration contains an automatic operator-device-placement mechanism that strives to maximize the execution efficiency of your deep learning models on AWS Machine Learning ASIC instances.

Accelerated Python APIs

| Module    | Accelerated Python API        | Comments |
|-----------|-------------------------------|----------|
| tf        | tf.abs                        | |
|           | tf.add                        | |
|           | tf.add_n                      | |
|           | tf.broadcast_static_shape     | |
|           | tf.cast                       | |
|           | tf.constant                   | |
|           | tf.convert_to_tensor          | |
|           | tf.cumsum                     | axis must be a compile-time constant. |
|           | tf.einsum                     | |
|           | tf.erf                        | |
|           | tf.exp                        | |
|           | tf.identity                   | |
|           | tf.matmul                     | Uses float16/bfloat16 matmul with float32 accumulation. |
|           | tf.maximum                    | |
|           | tf.minimum                    | |
|           | tf.multiply                   | |
|           | tf.negative                   | |
|           | tf.range                      | start, limit, and delta arguments must be compile-time constants. |
|           | tf.realdiv                    | |
|           | tf.reciprocal                 | |
|           | tf.reduce_all                 | axis must be a compile-time constant. |
|           | tf.reduce_any                 | axis must be a compile-time constant. |
|           | tf.reduce_max                 | axis must be a compile-time constant. |
|           | tf.reduce_min                 | axis must be a compile-time constant. |
|           | tf.reduce_prod                | axis must be a compile-time constant. |
|           | tf.reduce_sum                 | axis must be a compile-time constant. |
|           | tf.reshape                    | shape argument must be a compile-time constant. |
|           | tf.rsqrt                      | |
|           | tf.scalar_mul                 | |
|           | tf.shape                      | |
|           | tf.shape_n                    | |
|           | tf.sigmoid                    | |
|           | tf.size                       | |
|           | tf.slice                      | size must be a compile-time constant. In addition, either begin must be a compile-time constant or size must be non-negative. |
|           | tf.sqrt                       | |
|           | tf.square                     | |
|           | tf.squared_difference         | |
|           | tf.squeeze                    | |
|           | tf.stack                      | |
|           | tf.stop_gradient              | |
|           | tf.strided_slice              | |
|           | tf.tanh                       | |
|           | tf.tensordot                  | |
|           | tf.to_bfloat16                | |
|           | tf.to_float                   | |
|           | tf.truediv                    | |
| tf.layers | tf.layers.batch_normalization | |
|           | tf.layers.dense               | |
|           | tf.layers.flatten             | |
| tf.nn     | tf.nn.batch_normalization     | |
|           | tf.nn.bias_add                | |
|           | tf.nn.dropout                 | Always treated as tf.identity during inference. |
|           | tf.nn.fused_batch_norm        | |
|           | tf.nn.leaky_relu              | |
|           | tf.nn.relu                    | |
|           | tf.nn.relu6                   | |
|           | tf.nn.relu_layer              | |
|           | tf.nn.softmax                 | |
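Several entries above require arguments such as axis or shape to be compile-time constants. A minimal sketch of what that means in practice: supplying these arguments as Python literals keeps them embedded as constants in the traced graph, which is what the Neuron compiler needs. (This example runs on stock TensorFlow; the constraint only matters when the graph is compiled for Neuron.)

```python
import tensorflow as tf

# shape passed as a Python list -> a compile-time constant in the graph.
x = tf.reshape(tf.range(6.0), [2, 3])

# axis passed as a Python int -> also a compile-time constant.
s = tf.reduce_sum(x, axis=1)

# If axis or shape were instead computed at runtime as a tf.Tensor, the
# op would not satisfy the constraint and would fall back to CPU under
# Neuron's operator-device-placement mechanism.
```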

Accelerated graph operators

Add, AddN, AddV2, BatchMatMul, BatchMatMulV2, BiasAdd, Cast, Const, Cumsum, Einsum, Erf, Exp, ExpandDims, FusedBatchNorm, FusedBatchNormV2, FusedBatchNormV3, Greater, Identity, LeakyRelu, MatMul, Max, Maximum, Minimum, Mean, Mul, Neg, Pack, RealDiv, Relu, Relu6, Reshape, Rsqrt, Sigmoid, Softmax, Split, SplitV, Sqrt, Square, SquaredDifference, Squeeze, StridedSlice, Sub, Sum, Tanh, Transpose, Unpack
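These names are graph operator types, not Python APIs. A hedged sketch of how you might check which operator types your own model's traced graph contains, to compare against the list above (this uses only standard TensorFlow tracing, not a Neuron API):

```python
import tensorflow as tf

@tf.function
def dense_relu(x, w, b):
    # A dense layer followed by ReLU: traces to MatMul, BiasAdd, Relu.
    return tf.nn.relu(tf.nn.bias_add(tf.matmul(x, w), b))

# Trace the function to a concrete graph without running it.
f = dense_relu.get_concrete_function(
    tf.TensorSpec([1, 4], tf.float32),
    tf.TensorSpec([4, 2], tf.float32),
    tf.TensorSpec([2], tf.float32),
)

# Collect the operator types present in the traced graph.
op_types = {op.type for op in f.graph.get_operations()}
# MatMul, BiasAdd, and Relu all appear in the accelerated-operator list.
```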

The lists share many commonalities with Available TensorFlow Ops. Portions of this page are modifications based on work created and shared by Google and used according to terms described in the Creative Commons 4.0 Attribution License.
