
This document is relevant for: Trn1

Training Samples/Tutorials (Trn1/Trn1n)


Encoders

- bert-base-cased (torch-neuronx)
  - Fine-tune a “bert-base-cased” PyTorch model for Text Classification
  - How to fine-tune a “bert-base-cased” PyTorch model with AWS Trainium (Trn1 instances) for Sentiment Analysis
- bert-base-uncased (torch-neuronx)
  - Fine-tune a “bert-base-uncased” PyTorch model
  - Fine-tuning the BERT base model from Hugging Face on Amazon SageMaker
- bert-large-cased (torch-neuronx): Fine-tune a “bert-large-cased” PyTorch model
- bert-large-uncased (torch-neuronx)
  - Hugging Face BERT Pretraining Tutorial (Data-Parallel)
  - Launch a BERT Large Phase 1 pretraining job on ParallelCluster
  - Launch a Multi-Node PyTorch Neuron Training Job on Trainium Using TorchX and EKS
  - PyTorch Neuron for Trainium Hugging Face BERT MRPC task finetuning using Hugging Face Trainer API
  - Fine-tune a “bert-large-uncased” PyTorch model
- roberta-base (torch-neuronx): Fine-tune a “roberta-base” PyTorch model
- roberta-large (torch-neuronx): Fine-tune a “roberta-large” PyTorch model
- xlm-roberta-base (torch-neuronx): Fine-tune an “xlm-roberta-base” PyTorch model
- albert-base-v2 (torch-neuronx): Fine-tune an “albert-base-v2” PyTorch model
- distilbert-base-uncased (torch-neuronx): Fine-tune a “distilbert-base-uncased” PyTorch model
- camembert-base (torch-neuronx): Fine-tune a “camembert-base” PyTorch model
- cl-tohoku/bert-base-japanese-whole-word-masking (torch-neuronx): Fine-tuning & Deployment of the Hugging Face BERT Japanese model
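Several of the tutorials above (for example, the data-parallel BERT pretraining tutorial) use data-parallel training: each worker computes gradients on its own shard of the batch, and the gradients are averaged across workers before every optimizer step, so all replicas stay in sync. A minimal, framework-free sketch of that averaging step (all names here are hypothetical, not Neuron APIs):

```python
def allreduce_mean(per_worker_grads):
    """Average gradients elementwise across workers -- the collective at the
    heart of data-parallel training. Each inner list is one worker's gradient
    for the same parameter; every worker then applies the identical average."""
    n_workers = len(per_worker_grads)
    return [sum(vals) / n_workers for vals in zip(*per_worker_grads)]

# Two workers, each holding a gradient for the same 3-element parameter.
grads = [[1.0, 2.0, 3.0],
         [3.0, 4.0, 5.0]]
avg = allreduce_mean(grads)  # -> [2.0, 3.0, 4.0], applied on every worker
```

On Trn1, this averaging is performed by the Neuron runtime's collective communication over the NeuronLink/EFA interconnect rather than in Python, but the semantics are the same.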

Decoders

- gpt-2 (nxd-training): Megatron GPT Pretraining
- gpt-2 (torch-neuronx)
  - How to run training jobs for the “gpt2” PyTorch model with AWS Trainium
  - ZeRO-1 Tutorial
- gpt-3 (neuronx-nemo-megatron)
  - Launch a GPT-3 23B pretraining job using neuronx-nemo-megatron
  - Launch a GPT-3 46B pretraining job using neuronx-nemo-megatron
  - Launch a GPT-3 175B pretraining job using neuronx-nemo-megatron
- GPT-NeoX-20B (neuronx-distributed)
  - Training GPT-NeoX 20B with Tensor Parallelism and ZeRO-1 Optimizer
  - Training GPT-NeoX 20B model using neuronx-distributed
  - Pre-train GPT-NeoX 20B on the Wikicorpus dataset using the NeuronX Distributed library
- GPT-NeoX-6.9B (neuronx-distributed)
  - Training GPT-NeoX 6.9B with Tensor Parallelism and ZeRO-1 Optimizer
  - Training GPT-NeoX 6.9B model using neuronx-distributed
  - Pre-train GPT-NeoX 6.9B on the Wikicorpus dataset using the NeuronX Distributed library
- meta-llama/Llama-3.1-70b (neuronx-distributed): Training Llama-3.1-70B, Llama-3-70B or Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism
- meta-llama/Llama-3.1-8b (neuronx-distributed): Training Llama3.1-8B, Llama3-8B and Llama2-7B with Tensor Parallelism and ZeRO-1 Optimizer
- meta-llama/Llama-3-70b (neuronx-distributed): Training Llama-3.1-70B, Llama-3-70B or Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism
- meta-llama/Llama-3-8b (nxd-training)
  - HuggingFace Llama3.1/Llama3-8B Pretraining
  - HuggingFace Llama3.1/Llama3-8B Supervised Fine-tuning
- meta-llama/Llama-3-8b (neuronx-distributed)
  - Training Llama3 8B Model with Tensor Parallelism and ZeRO-1 Optimizer
  - Tutorial for Fine-tuning Llama3 8B with tensor parallelism and LoRA using Neuron PyTorch-Lightning with NeuronX Distributed
- meta-llama/Llama-2-7b (neuronx-distributed)
  - Training Llama3.1-8B, Llama3-8B and Llama2-7B with Tensor Parallelism and ZeRO-1 Optimizer
  - Training Llama2 7B Model with AWS Batch and Trainium
  - Fine-tuning Llama2 7B with tensor parallelism and ZeRO-1 optimizer using Neuron PyTorch-Lightning
  - Pre-train Llama2-7B on the Wikicorpus dataset using the NeuronX Distributed library
- meta-llama/Llama-2-13b (neuronx-distributed): Training Llama-3.1-70B, Llama-3-70B or Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism
- meta-llama/Llama-2-70b (neuronx-distributed): Training Llama-3.1-70B, Llama-3-70B or Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism
- codegen25-7b-mono (neuronx-distributed): Training CodeGen2.5 7B with Tensor Parallelism and ZeRO-1 Optimizer
- meta-llama/Llama-2 (neuronx-nemo-megatron)
  - Launch a Llama-2-7B pretraining job using neuronx-nemo-megatron
  - Launch a Llama-2-13B pretraining job using neuronx-nemo-megatron
  - Launch a Llama-2-70B pretraining job using neuronx-nemo-megatron
- Mistral-7B (neuronx-nemo-megatron): Training Mistral-7B
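Most of the decoder tutorials above shard large models with tensor parallelism: a linear layer's weight matrix is split column-wise across devices, each device computes a partial output against the same input, and the shards are concatenated to recover the full activation. A minimal pure-Python sketch of that idea (no Neuron APIs; all function names are hypothetical):

```python
def matmul(x, w):
    """Row-major matrix multiply of x (m x k) and w (k x n) as nested lists."""
    return [[sum(x[i][t] * w[t][j] for t in range(len(w)))
             for j in range(len(w[0]))] for i in range(len(x))]

def split_columns(w, parts):
    """Column-parallel sharding: each simulated 'device' holds a contiguous
    slice of w's columns, as in a Megatron-style column-parallel linear."""
    n = len(w[0]) // parts
    return [[row[p * n:(p + 1) * n] for row in w] for p in range(parts)]

x = [[1.0, 2.0]]                        # one input row, replicated on all devices
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]              # full (unsharded) weight matrix
shards = split_columns(w, parts=2)      # shard the columns across two "devices"
partials = [matmul(x, ws) for ws in shards]  # each device's partial output
# Concatenating the partial outputs reproduces the unsharded result.
parallel_out = [sum((p[0] for p in partials), [])]
assert parallel_out == matmul(x, w)
```

In neuronx-distributed, this sharding and the subsequent gather are handled by the library's parallel layers; the sketch only illustrates why the sharded computation is mathematically equivalent to the unsharded one.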

Encoder-Decoders

- t5-small (torch-neuronx, optimum-neuron): Fine-tune a T5 model on Trn1
- facebook/bart-large (torch-neuronx): How to fine-tune a “Bart-Large” PyTorch model with AWS Trainium (Trn1 instances)

Vision Transformers

- google/vit-base-patch16-224-in21k (torch-neuronx): Fine-tune a pretrained Hugging Face vision transformer PyTorch model
- openai/clip-vit-base-patch32 (torch-neuronx): Fine-tune a pretrained Hugging Face CLIP-base PyTorch model with AWS Trainium
- openai/clip-vit-large-patch14 (torch-neuronx): Fine-tune a pretrained Hugging Face CLIP-large PyTorch model with AWS Trainium

Stable Diffusion

- stabilityai/stable-diffusion-2-1-base (torch-neuronx): [Beta] Train stabilityai/stable-diffusion-2-1-base with AWS Trainium (Trn1 instances)
- runwayml/stable-diffusion-v1-5 (torch-neuronx): [Beta] Train runwayml/stable-diffusion-v1-5 with AWS Trainium (Trn1 instances)

Multi-Modal

- language-perceiver (torch-neuronx): How to fine-tune a “language perceiver” PyTorch model with AWS Trainium (Trn1 instances)
- vision-perceiver-conv (torch-neuronx): How to fine-tune a pretrained Hugging Face Vision Perceiver Conv model

Convolutional Neural Networks (CNN)

- resnet50 (torch-neuronx): How to fine-tune a pretrained ResNet-50 PyTorch model with AWS Trainium (Trn1 instances) using the Neuron SDK
- milesial/Pytorch-UNet (torch-neuronx): How to fine-tune a pretrained UNet PyTorch model with AWS Trainium (Trn1 instances) using the Neuron SDK
