Amazon SageMaker AI model parallelism library v1 examples

This page provides a list of blogs and Jupyter notebooks that present practical examples of implementing the SageMaker model parallelism (SMP) library v1 to run distributed training jobs on SageMaker AI.

Blogs and Case Studies

The following blogs discuss case studies about using SMP v1.

Example notebooks

Example notebooks are provided in the SageMaker AI examples GitHub repository. To download the examples, run the following commands to clone the repository and change to the training/distributed_training/pytorch/model_parallel directory.

Note

Clone and run the example notebooks in the following SageMaker AI ML IDEs.

git clone https://github.com/aws/amazon-sagemaker-examples.git
cd amazon-sagemaker-examples/training/distributed_training/pytorch/model_parallel
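
After cloning, it can help to see which notebooks are actually present in the directory. The following is a minimal sketch, not part of the SageMaker AI examples repository: `list_notebooks` is a hypothetical helper name, and it assumes a POSIX shell with `find` and `sort` available.

```shell
# Hypothetical helper (not part of the repository): list every Jupyter
# notebook under a directory, skipping Jupyter checkpoint copies.
list_notebooks() {
  find "${1:-.}" -type f -name '*.ipynb' ! -path '*/.ipynb_checkpoints/*' | sort
}

# Run from inside the model_parallel directory after the clone and cd above.
list_notebooks .
```

The helper defaults to the current directory, so it also accepts another path, for example `list_notebooks ~/amazon-sagemaker-examples/training`.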

SMP v1 example notebooks for PyTorch

SMP v1 example notebooks for TensorFlow
