This document is relevant for: Inf1, Inf2, Trn1, Trn2
Neuron Glossary
| Term | Description |
|---|---|
| Inferentia | AWS first-generation accelerated machine learning chip, supporting inference only |
| Trainium/Inferentia2 | AWS second-generation accelerated machine learning chip, supporting training and inference |
| Trainium2 | AWS third-generation accelerated machine learning chip, supporting training and inference |
| Neuron Device | An accelerated machine learning chip (e.g. Inferentia or Trainium) |
| Term | Description |
|---|---|
| Inf1 | Inferentia-powered accelerated-compute EC2 instance |
| Trn1 | Trainium-powered accelerated-compute EC2 instance |
| Inf2 | Inferentia2-powered accelerated-compute EC2 instance |
| Trn2 | Trainium2-powered accelerated-compute EC2 instance |
| Term | Description |
|---|---|
| NeuronCore | The machine learning compute cores within Inferentia/Trainium |
| NeuronCore-v1 | NeuronCore within Inferentia |
| NeuronCore-v2 | NeuronCore within Trainium1/Inferentia2 |
| NeuronCore-v3 | NeuronCore within Trainium2 |
| Tensor Engine | 2D systolic array within the NeuronCore, used for matrix computations |
| Scalar Engine | An engine within each NeuronCore that accelerates element-wise operations (e.g. GELU, ReLU, reciprocal) |
| Vector Engine | An engine within each NeuronCore that accelerates spatial operations (e.g. LayerNorm, TopK, pooling) |
| GPSIMD Engine | Embedded general-purpose SIMD cores within each NeuronCore, used to accelerate custom operators |
| Sync Engine | The SP engine, integrated inside the NeuronCore; used for synchronization and DMA triggering |
| Collective Communication Engine | Dedicated engine for collective communication, allowing computation and communication to overlap |
| High Bandwidth Memory | High-bandwidth memory used as device memory for NeuronCore-v2 and beyond |
| State Buffer | The main software-managed on-chip memory in NeuronCore-v1 and beyond |
| Partial Sum Buffer | A second software-managed on-chip memory in NeuronCore-v1 and beyond, with near-memory accumulation support for Tensor Engine output data |
| NeuronLink | Interconnect between NeuronCores |
| NeuronLink-v1 | Interconnect between NeuronCores in the Inferentia device |
| NeuronLink-v2 | Interconnect between NeuronCores in the Trainium1/Inferentia2 device |
| NeuronLink-v3 | Interconnect between NeuronCores in the Trainium2 device |
| Term | Description |
|---|---|
| Neuron Kernel Interface | A bare-metal language and compiler for directly programming Neuron devices, available on AWS Trainium/Inferentia2 and later devices |
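To give a feel for what NKI programming looks like, below is a minimal sketch of an element-wise addition kernel modeled on the `nki.jit` decorator and `nki.language` API. The tensor shapes and tiling are simplifying assumptions; consult the NKI documentation for authoritative usage and tile-size constraints.

```python
# Minimal NKI sketch (assumes the neuronxcc.nki package and its
# nki.language API; shapes must fit a single on-chip tile).
from neuronxcc import nki
import neuronxcc.nki.language as nl


@nki.jit
def tensor_add_kernel(a_input, b_input):
    """Element-wise add of two device (HBM) tensors."""
    # Allocate the output tensor in device memory (HBM).
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype,
                          buffer=nl.shared_hbm)
    # Load both operands from HBM into the on-chip State Buffer (SBUF).
    a_tile = nl.load(a_input)
    b_tile = nl.load(b_input)
    # The addition executes on-chip (e.g. on the Vector Engine);
    # store the result tile back to HBM.
    nl.store(c_output, a_tile + b_tile)
    return c_output
```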
| Abbreviation | Description |
|---|---|
| NxD Core | NeuronX Distributed Core library |
| NxD Training | NeuronX Distributed Training library |
| NxD Inference | NeuronX Distributed Inference library |
| NC | Neuron Core |
| NeuronCore | Neuron Core |
| ND | Neuron Device |
| NeuronDevice | Neuron Device |
| TensorE | Tensor Engine |
| ScalarE | Scalar Engine |
| VectorE | Vector Engine |
| GpSimdE | GpSimd Engine |
| CCE | Collective Communication Engine |
| HBM | High Bandwidth Memory |
| SBUF | State Buffer |
| PSUM | Partial Sum Buffer |
| FP32 | Float32 |
| TF32 | TensorFloat32 |
| FP16 | Float16 |
| BF16 | Bfloat16 |
| cFP8 | Configurable Float8 |
| RNE | Round to Nearest Even |
| SR | Stochastic Rounding |
| NKI | Neuron Kernel Interface |
| CustomOps | Custom Operators |
| RT | Neuron Runtime |
| DP | Data Parallel |
| DPr | Data Parallel degree |
| TP | Tensor Parallel |
| TPr | Tensor Parallel degree |
| PP | Pipeline Parallel |
| PPr | Pipeline Parallel degree |
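The parallelism degrees compose multiplicatively: a distributed job's world size is the product of its data-, tensor-, and pipeline-parallel degrees (DPr × TPr × PPr). A small sketch with hypothetical degree values illustrates the bookkeeping; real values come from your training configuration.

```python
# Hypothetical degrees for illustration only.
TPr = 8   # tensor-parallel degree: ranks that split individual layers
PPr = 4   # pipeline-parallel degree: ranks that split the layer stack
DPr = 2   # data-parallel degree: model replicas that split the global batch

# Total number of workers (e.g. NeuronCores) the job occupies.
world_size = DPr * TPr * PPr
assert world_size == 64

# Conversely, given a world size and TPr/PPr, the data-parallel
# degree is whatever is left over.
derived_DPr = world_size // (TPr * PPr)
assert derived_DPr == DPr
```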