Neuron Compiler (neuronx-cc) release notes — AWS Neuron Documentation

Neuron Compiler (neuronx-cc) release notes#

Neuron Compiler [2.17.194.0]#

Date: 04/03/2025

Neuron Compiler [2.16.372.0]#

Date: 01/14/2025

Neuron Compiler [2.16.345.0]#

Date: 12/20/2024

Neuron Compiler#

Date: 12/03/2024

Neuron Compiler [2.15.128.0]#

Date: 09/16/2024

Neuron Compiler [2.14.213.0]#

Date: 07/03/2024

Neuron Compiler [2.13.72.0]#

Date: 04/25/2024

Neuron Compiler [2.13.68.0]#

Date: 04/10/2024

Neuron Compiler [2.13.66.0]#

Date: 04/01/2024

Neuron Compiler [2.12.68.0]#

Date: 01/18/2024

Neuron Compiler [2.12.54.0]#

Date: 12/21/2023

When this occurs, users are encouraged to review the model’s gather/scatter input values to determine if there is a coding error.

Neuron Compiler [2.11.0.35]#

Date: 11/17/2023

Neuron Compiler [2.11.0.34]#

Date: 10/26/2023

Neuron Compiler [2.10.0.35]#

Date: 09/26/2023

There is still a known issue for some configurations of the model that produces the error "Too many instructions after unroll for function sg0000". To mitigate this, recompile using the --optlevel 1 (-O1) option. A complete fix that does not require this option will come in a future release.

Neuron Compiler [2.10.0.34]#

Date: 09/15/2023

Neuron Compiler [2.9.0.16]#

Date: 08/28/2023

Neuron Compiler [2.8.0.25]#

Date: 07/19/2023

Neuron Compiler [2.7.0.40]#

Date: 06/14/2023

Neuron Compiler [2.6.0.19]#

Date: 05/01/2023

Neuron Compiler [2.5.0.28]#

Date: 03/28/2023

Neuron Compiler [2.4.0.21]#

Date: 02/24/2023

Neuron Compiler [2.4.0.21]#

Date: 02/08/2023

Neuron Compiler [2.3.0.4]#

Date: 12/09/2022

Neuron Compiler [2.2.0.73]#

Date: 10/27/2022

Neuron Compiler [2.1.0.76]#

Date: 10/05/2022

The Neuron Compiler is an ahead-of-time compiler that accelerates models for execution on NeuronCores. This release supports compiling models for training on a Trn1 instance using PyTorch Neuron. Users typically access the compiler through the framework to perform model compilation, although it can also be run as a command-line tool (neuronx-cc).

The Neuron Compiler supports compiling models for mixed-precision calculations. The Trn1 hardware supports matrix multiplication using FP16, BF16, and FP32 on its Matrix Multiplication Engine, with accumulation in FP32. Operators such as activations and vector operations are supported in FP16, BF16, and FP32. Tensor transpose can be performed in FP16, BF16, FP32, or TF32. By default, scalar and vector operations on FP32 values are done in FP32, while matrix multiplications are cast to BF16 and transpose operations are cast to FP32. This default casting yields the highest performance for an FP32-trained model.
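To illustrate the precision tradeoff behind this default, the following sketch truncates an FP32 value to BF16 precision in pure Python. The helper name is hypothetical, and it uses round-to-zero truncation for simplicity; the hardware's actual rounding mode may differ.

```python
import struct

def to_bf16(x: float) -> float:
    """Round an FP32 value to BF16 by zeroing the low 16 mantissa bits.

    Illustration only: BF16 keeps FP32's 8-bit exponent (so the dynamic
    range is unchanged) but only 7 mantissa bits (~3 significant decimal
    digits).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# Values with a short binary mantissa survive exactly...
print(to_bf16(1.0))          # 1.0, unchanged
# ...while arbitrary values lose precision, but only slightly (< ~0.4%
# relative error), which is why casting matmul inputs to BF16 while
# accumulating in FP32 usually preserves FP32-model accuracy.
x = 1.2345678
print(to_bf16(x), abs(to_bf16(x) - x))
```

Large-magnitude values remain representable after the cast (the exponent is untouched), which distinguishes BF16 from FP16 and makes it the safer default for matrix multiplications.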

By default, the compiler targets maximum performance by automatically casting the model to mixed precision. It also provides an option (--auto-cast) that lets the user trade off between higher performance and optimal accuracy. The choice of argument for --auto-cast is application specific. Compiler CLI options can be passed to the compiler via the framework.
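A minimal sketch of assembling such options for the framework to forward: the --auto-cast flag name comes from these notes, but the accepted values shown ("all", "matmult", "none") and the helper function are assumptions that should be checked against `neuronx-cc compile --help`.

```python
# Hypothetical helper that assembles compiler CLI arguments for a casting
# policy. The --auto-cast flag is documented in these release notes; the
# set of accepted values below is an assumption, not confirmed here.
def auto_cast_args(policy: str) -> list:
    allowed = {"all", "matmult", "none"}  # assumed accepted values
    if policy not in allowed:
        raise ValueError("unknown auto-cast policy: %r" % policy)
    return ["--auto-cast", policy]

# Frameworks typically accept a compiler_args list/string and forward it
# to neuronx-cc unchanged, e.g. (API shape is an assumption):
#   torch_neuronx.trace(model, example_inputs,
#                       compiler_args=auto_cast_args("none"))
args = auto_cast_args("none")  # favor accuracy over raw performance
```

Validating the policy string on the Python side surfaces a typo immediately rather than as a compiler error later in the trace.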

Known issues#

Supported Operators#

The following XLA operators are supported by the Neuron Compiler. Future releases will broaden model support by providing additional XLA operators defined in https://www.tensorflow.org/xla/operation_semantics.

The list of supported operators can also be retrieved from the command line using neuronx-cc list-operators.

| Supported XLA Operators | Notes |
| --- | --- |
| Abs | |
| Add | |
| Allgather | |
| Allreduce | |
| Atan2 | |
| Batchnorm | |
| Batchnormgrad | |
| Batchnorminference | |
| BitcastConvertType | |
| Broadcast | |
| BroadcastInDim | |
| Ceil | |
| Clamp | |
| Compare | |
| Concatenate | |
| Constant | |
| ConstantLiteral | |
| ConvertElementType | |
| Cos | |
| Customcall | |
| Div | |
| Dot | |
| DotGeneral | |
| DynamicUpdateSlice | Supported only for constant indices |
| Eq | |
| Exp | |
| Floor | |
| Gather | Supports only disjoint start_index_map and remapped_offset_dims |
| Ge | |
| GetTupleElement | |
| Gt | |
| Iota | |
| Le | |
| Log | |
| LogicalAnd | |
| LogicalNot | |
| Lt | |
| Max | |
| Min | |
| Mul | |
| Ne | |
| Neg | |
| Pad | |
| Pow | Exponent argument must be a compile-time integer constant |
| Reduce | Min, Max, Add, and Mul are the only supported computations; init_values must be constant |
| Reshape | |
| Rev (reverse) | |
| RngBitGenerator | Ignores user seed |
| RngUniform | |
| Rsqrt | |
| Scatter | |
| Select | |
| SelectAndScatter | |
| ShiftRightLogical | |
| Sign | |
| Sin | |
| Slice | |
| Sqrt | |
| Sub | |
| Tanh | |
| Transpose | |
| Tuple | |

This document is relevant for: Inf2, Trn1, Trn2