Neuron Documentation Release Notes

Neuron 2.21.0

Date: 12/20/2024

Neuron Architecture and Features

- Added Trainium2 Architecture guide. See Trainium2 Architecture
- Added Trn2 Architecture guide. See Amazon EC2 Trn2 Architecture
- Added Logical NeuronCore configuration guide. See Logical NeuronCore configuration
- Added NeuronCore-v3 Architecture guide. See NeuronCore-v3 Architecture

Neuron Compiler

- Added NKI tutorial for SPMD usage with multiple Neuron Cores on Trn2. See tutorial
- Updated NKI FAQ with Trn2 FAQs. See NKI FAQ
- Added Direct Allocation Developer Guide
- Updated nki.isa API guide with support for new APIs
- Updated nki.language API guide with support for new APIs
- Updated nki.compiler API guide with support for new APIs
- Updated NKI datatype guide with support for float8_e5m2
- Updated kernels documentation with the allocated_fused_self_attn_for_SD_small_head_size and allocated_fused_rms_norm_qkv kernels

Neuron Runtime

- Updated troubleshooting doc with information on device out-of-memory errors after upgrading to Neuron Driver 2.19 or later. See small_allocations_mempool

NeuronX Distributed Inference

- Added Application Note to introduce NxD Inference. See Introducing NeuronX Distributed (NxD) Inference
- Added NxD Inference Supported Features Guide. See NxD Inference Features Configuration Guide
- Added NxD Inference Tutorial for Deploying Llama 3.1 405B (Trn2). See Tutorial: Deploying Llama3.1 405B (Trn2)
- Added NxD Inference API Reference Guide. See nxd-inference-api-guides
- Added NxD Inference Production Ready Models (Model Hub) Guide. See NxD Inference - Production Ready Models
- Added Migration Guide from NxD examples to NxD Inference. See Migrating from NxD Core inference examples to NxD Inference
- Added Migration Guide from Transformers NeuronX to NeuronX Distributed Inference. See Migrating from Transformers NeuronX to NeuronX Distributed(NxD) Inference
- Added vLLM User Guide for NxD Inference. See vLLM User Guide for NxD Inference
- Added tutorial for deploying Llama3.2 Multimodal Models. See Tutorial: Deploying Llama3.2 Multimodal Models

NeuronX Distributed Training

- Updated Training APIs, Training Llama-3.1-70B, Llama-3-70B or Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism, YAML Configuration Settings, and Checkpoint Conversion with support for fused Q,K,V
- Updated YAML Configuration Settings with support for Trn2 configuration API
- Updated Direct Checkpoint Conversion with support for HuggingFace Model Conversion
- Added tutorial for HuggingFace Llama3.1/Llama3-70B Pretraining. See HuggingFace Llama3.1/Llama3-70B Pretraining
- Added tutorial for HuggingFace Llama3-8B Direct Preference Optimization (DPO) based Fine-tuning. See hf_llama3_8B_DPO

Transformers NeuronX

- Updated Transformers NeuronX (transformers-neuronx) Developer Guide and PyTorch NeuronX Tracing API for Inference with support for CPU compilation
- Updated Transformers NeuronX (transformers-neuronx) Developer Guide to enable skipping the first Allgather introduced by flash decoding at the cost of duplicate Q weights
- Updated Transformers NeuronX (transformers-neuronx) Developer Guide with support for EAGLE speculation

Neuron Tools

- Added Neuron Profiler 2.0 Beta User Guide with support for system profiles, integration with Perfetto, distributed workload support, etc. See Neuron Profiler 2.0 (Beta) User Guide
- Updated nccom-test user guide to include support for Trn2. See NCCOM-TEST User Guide
- Updated neuron-ls user guide to include support for Trn2. See Neuron LS User Guide
- Updated neuron-monitor user guide to include support for Trn2. See Neuron Monitor User Guide
- Updated neuron-top user guide to include support for Trn2. See Neuron Top User Guide
- Added Ask Q Developer documentation for general Neuron guidance and jumpstarting NKI kernel development. See Ask Q Developer

PyTorch NeuronX

- Added troubleshooting note for eager debug mode errors. See PyTorch Neuron (torch-neuronx) for Training Troubleshooting Guide
- Added torch-neuronx cxx11 ABI documentation. See Install with support for C++11 ABI
- Added Migration Guide from XLA_USE_BF16/XLA_DOWNCAST_BF16. See Migration From XLA_USE_BF16/XLA_DOWNCAST_BF16
- Updated BERT tutorial to not use XLA_DOWNCAST_BF16 and updated BERT-Large pretraining phase to BFloat16 BERT-Large pretraining with AdamW and stochastic rounding. See Hugging Face BERT Pretraining Tutorial (Data-Parallel)
- Added Application Note for PyTorch 2.5 support. See Introducing PyTorch 2.5 Support
- Updated PyTorch NeuronX Environment Variables document with support for PyTorch 2.5. See PyTorch NeuronX Environment Variables

Misc

- Added a third-party developer flow solutions page. See Third-party solutions
- Added a third-party libraries page. See Third-party libraries

End of support announcements

- Announcing end of support for Neuron DET tool starting next release
- Announcing migration of NxD Core examples from NxD Core repository to NxD Inference repository in next release
- Announcing end of support for Python 3.8 in future releases
- Announcing end of support for PyTorch 1.13 starting next release
- Announcing end of support for PyTorch 2.1 starting next release
- Neuron no longer includes support for Ubuntu20 DLCs and DLAMIs starting this release
- Announcing maintenance mode for torch-neuron 1.9 and 1.10 versions

Neuron 2.20.0

Date: 09/16/2024

Neuron Compiler

NeuronX Distributed Training (NxDT)

NeuronX Distributed Core (NxD Core)

JAX Neuron

PyTorch NeuronX

Transformers NeuronX

Neuron Runtime

Containers

Neuron Tools

Software Maintenance and Misc

Neuron 2.19.0

Date: 07/03/2024

Neuron 2.18.0

Date: 04/01/2024

Neuron 2.16.0

Date: 12/21/2023

Neuron 2.15.0

Date: 10/26/2023

Known Issues and Limitations

The following tutorials are currently not working. They will be updated once a fix is available.

Neuron 2.14.0

Date: 09/15/2023

Neuron 2.13.0

Date: 08/28/2023

Neuron 2.12.0

Date: 07/19/2023

Neuron 2.11.0

Date: 06/14/2023

This document is relevant for: Inf1, Inf2, Trn1, Trn2