PyTorch - sagemaker 2.199.0 documentation

PyTorch Estimator

class sagemaker.pytorch.estimator.PyTorch(entry_point, framework_version=None, py_version=None, source_dir=None, hyperparameters=None, image_uri=None, distribution=None, compiler_config=None, **kwargs)

Bases: sagemaker.estimator.Framework

Handle end-to-end training and deployment of custom PyTorch code.

This Estimator executes a PyTorch script in a managed PyTorch execution environment.

The managed PyTorch environment is an Amazon-built Docker container that executes functions defined in the supplied entry_point Python script within a SageMaker Training Job.

Training is started by calling fit() on this Estimator. After training is complete, calling deploy() creates a hosted SageMaker endpoint and returns a PyTorchPredictor instance that can be used to perform inference against the hosted model.

Technical documentation on preparing PyTorch scripts for SageMaker training and using the PyTorch Estimator is available on the project home-page: https://github.com/aws/sagemaker-python-sdk
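
The fit() and deploy() sequence described above can be sketched as follows. This is a minimal illustration, not the SDK's canonical example: the script name train.py, the framework and Python versions, the instance types, and the "training" channel name are all assumptions, and train_and_deploy() needs the sagemaker package plus valid AWS credentials to run.

```python
def estimator_kwargs(role_arn, entry_point="train.py"):
    """Collect constructor arguments for sagemaker.pytorch.PyTorch.

    All values below are illustrative assumptions, not SDK defaults.
    """
    return {
        "entry_point": entry_point,        # hypothetical training script
        "role": role_arn,                  # IAM role the training job assumes
        "framework_version": "2.0.1",      # assumed PyTorch version
        "py_version": "py310",             # assumed matching Python version
        "instance_count": 1,
        "instance_type": "ml.g4dn.xlarge",  # assumed training instance type
    }


def train_and_deploy(role_arn, train_s3_uri):
    """Train on SageMaker, then host the trained model on an endpoint."""
    # Imported lazily so this module loads even without the SDK installed.
    from sagemaker.pytorch import PyTorch

    estimator = PyTorch(**estimator_kwargs(role_arn))
    # "training" is an assumed channel name for the input data.
    estimator.fit({"training": train_s3_uri})
    # deploy() returns a PyTorchPredictor bound to the new endpoint.
    return estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
    )
```

The keyword arguments are kept in a separate function so they can be inspected or adjusted without contacting AWS.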

Parameters

        "enabled": True,
        "parameters": {
            "partitions": 2,
            "microbatches": 4,
            "placement_strategy": "spread",
            "pipeline": "interleaved",
            "optimize": "speed",
            "ddp": True,
        }
    },
    "mpi": {
        "enabled": True,
        "processes_per_host": 8,
    }
}

Note

The SageMaker distributed model parallel library internally uses MPI. In order to use model parallelism, MPI also must be enabled.
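
A complete distribution dictionary for model parallelism might look like the sketch below. The outer "smdistributed" and "modelparallel" keys are an assumption here, since the opening of the dictionary is not shown in this section; require_mpi_for_model_parallel() is a hypothetical helper, not part of the SDK, that checks the rule stated in the note.

```python
# Assumption: the outer "smdistributed" -> "modelparallel" keys are not
# visible in this section and are filled in for illustration only.
distribution = {
    "smdistributed": {
        "modelparallel": {
            "enabled": True,
            "parameters": {
                "partitions": 2,
                "microbatches": 4,
                "placement_strategy": "spread",
                "pipeline": "interleaved",
                "optimize": "speed",
                "ddp": True,
            },
        }
    },
    "mpi": {
        "enabled": True,
        "processes_per_host": 8,
    },
}


def require_mpi_for_model_parallel(dist):
    """Raise ValueError if model parallelism is enabled without MPI.

    Hypothetical local check; returns the dictionary unchanged when valid.
    """
    smd = dist.get("smdistributed", {})
    model_parallel = smd.get("modelparallel", {}).get("enabled", False)
    mpi = dist.get("mpi", {}).get("enabled", False)
    if model_parallel and not mpi:
        raise ValueError("model parallelism requires 'mpi' to be enabled")
    return dist
```

The validated dictionary would then be passed as the distribution argument of the PyTorch estimator.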
To enable PyTorch DDP:

To enable Torch Distributed:
This option is available for general distributed training on GPU instances with PyTorch v1.13.1 and later.

{
    "torch_distributed": {
        "enabled": True
    }
}

This option also supports distributed training on Trn1. To learn more, see Distributed PyTorch Training on Trainium.
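
The GPU version requirement above can be checked before building the configuration. The sketch below is an illustration only: both function names are hypothetical, the check covers only the GPU statement (not Trn1), and it assumes a purely numeric version string such as "1.13.1".

```python
def supports_torch_distributed(framework_version):
    """Return True if the version is 1.13.1 or later.

    Assumes a dotted numeric string; missing components count as 0,
    so "1.13" is treated as 1.13.0.
    """
    parts = [int(p) for p in framework_version.split(".")[:3]]
    parts += [0] * (3 - len(parts))
    return tuple(parts) >= (1, 13, 1)


def torch_distributed_config(framework_version):
    """Build the distribution dictionary, rejecting older versions."""
    if not supports_torch_distributed(framework_version):
        raise ValueError(
            "torch_distributed needs PyTorch 1.13.1 or later, got "
            + framework_version
        )
    return {"torch_distributed": {"enabled": True}}
```

The returned dictionary is the same one shown above and would be passed as the distribution argument.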
To enable MPI:

To enable parameter server:

To enable distributed training with SageMaker Training Compiler:
{
    "pytorchxla": {
        "enabled": True
    }
}

To learn more, see SageMaker Training Compiler in the Amazon SageMaker Developer Guide.

Note

When you use this PyTorch XLA option as the distributed training strategy, you must also pass the compiler_config parameter and activate SageMaker Training Compiler.
compiler_config (TrainingCompilerConfig): Configures SageMaker Training Compiler to accelerate training.
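
The pairing of the pytorchxla option with compiler_config can be sketched as below. xla_kwargs() and make_compiled_estimator() are hypothetical helpers; the sketch assumes TrainingCompilerConfig can be imported from sagemaker.pytorch and constructed with its defaults, and base_kwargs stands for whatever other estimator arguments you already use.

```python
def xla_kwargs(base_kwargs, compiler_config):
    """Add the PyTorch XLA strategy and its required compiler_config.

    Returns a new dictionary; base_kwargs is left unmodified.
    """
    if compiler_config is None:
        raise ValueError("pytorchxla requires a compiler_config")
    kwargs = dict(base_kwargs)
    kwargs["distribution"] = {"pytorchxla": {"enabled": True}}
    kwargs["compiler_config"] = compiler_config
    return kwargs


def make_compiled_estimator(base_kwargs):
    """Build a PyTorch estimator with SageMaker Training Compiler enabled."""
    # Imported lazily so this module loads even without the SDK installed.
    from sagemaker.pytorch import PyTorch, TrainingCompilerConfig

    return PyTorch(**xla_kwargs(base_kwargs, TrainingCompilerConfig()))
```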

LAUNCH_PYTORCH_DDP_ENV_NAME = 'sagemaker_pytorch_ddp_enabled'

LAUNCH_TORCH_DISTRIBUTED_ENV_NAME = 'sagemaker_torch_distributed_enabled'

INSTANCE_TYPE_ENV_NAME = 'sagemaker_instance_type'

hyperparameters()

Return hyperparameters used by your custom PyTorch code during model training.

create_model(model_server_workers=None, role=None, vpc_config_override='VPC_CONFIG_DEFAULT', entry_point=None, source_dir=None, dependencies=None, **kwargs)

Create a SageMaker PyTorchModel object that can be deployed to an Endpoint.

Parameters

Returns

A SageMaker PyTorchModel object. See PyTorchModel() for full details.

Return type

sagemaker.pytorch.model.PyTorchModel
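
As an illustration of where create_model() fits, the sketch below turns a fitted estimator into a model and deploys it. deploy_from_estimator() is a hypothetical helper, and the worker count and instance type are assumed values.

```python
def deploy_from_estimator(estimator, instance_type="ml.m5.xlarge", workers=2):
    """Create a PyTorchModel from a fitted estimator and host it.

    `estimator` is expected to be a PyTorch estimator whose fit() has
    already completed; the defaults here are illustrative assumptions.
    """
    # model_server_workers sets how many worker processes the model
    # server runs per instance.
    model = estimator.create_model(model_server_workers=workers)
    # Deploying the model returns a predictor for the new endpoint.
    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
    )
```

Going through create_model() rather than calling deploy() on the estimator directly gives a chance to override settings such as entry_point or source_dir for inference.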

PyTorch Model

class sagemaker.pytorch.model.PyTorchModel(model_data, role=None, entry_point=None, framework_version='1.3', py_version=None, image_uri=None, predictor_cls=<class 'sagemaker.pytorch.model.PyTorchPredictor'>, model_server_workers=None, **kwargs)

Bases: sagemaker.model.FrameworkModel

A PyTorch SageMaker Model that can be deployed to a SageMaker Endpoint.

Initialize a PyTorchModel.

Parameters

Tip

You can find additional parameters for initializing this class at FrameworkModel and Model.

register(content_types=None, response_types=None, inference_instances=None, transform_instances=None, model_package_name=None, model_package_group_name=None, image_uri=None, model_metrics=None, metadata_properties=None, marketplace_cert=False, approval_status=None, description=None, drift_check_baselines=None, customer_metadata_properties=None, domain=None, sample_payload_url=None, task=None, framework=None, framework_version=None, nearest_model_name=None, data_input_configuration=None, skip_model_validation=None)

Creates a model package for creating SageMaker models or listing on Marketplace.

Parameters

Returns

A sagemaker.model.ModelPackage instance.
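
A call to register() might look like the sketch below. register_model() is a hypothetical wrapper; the content types, instance types, and approval status are assumed values chosen for illustration, and only parameters from the signature above are used.

```python
def register_model(model, group_name):
    """Register a PyTorchModel as a versioned model package.

    `model` is expected to be a PyTorchModel; every value below is an
    illustrative assumption rather than a required setting.
    """
    return model.register(
        content_types=["application/x-npy"],    # assumed request MIME type
        response_types=["application/x-npy"],   # assumed response MIME type
        inference_instances=["ml.m5.xlarge"],   # assumed hosting instances
        transform_instances=["ml.m5.xlarge"],   # assumed batch instances
        model_package_group_name=group_name,
        approval_status="PendingManualApproval",
    )
```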

prepare_container_def(instance_type=None, accelerator_type=None, serverless_inference_config=None)

Return a container definition with the framework configuration set in the model's environment variables.

Parameters

Returns

A container definition object usable with the CreateModel API.

Return type

dict[str, str]

serving_image_uri(region_name, instance_type, accelerator_type=None, serverless_inference_config=None)

Create a URI for the serving image.

Parameters

Returns

The appropriate image URI based on the given parameters.

Return type

str

PyTorch Predictor

class sagemaker.pytorch.model.PyTorchPredictor(endpoint_name, sagemaker_session=None, serializer=<sagemaker.base_serializers.NumpySerializer object>, deserializer=<sagemaker.base_deserializers.NumpyDeserializer object>, component_name=None)

Bases: sagemaker.base_predictor.Predictor

A Predictor for inference against PyTorch Endpoints.

This is able to serialize Python lists, dictionaries, and numpy arrays to multidimensional tensors for PyTorch inference.
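
A minimal sketch of using the predictor with plain Python lists follows. connect() and predict_rows() are hypothetical helpers, the endpoint name is supplied by the caller, and connect() needs the sagemaker package and AWS credentials.

```python
def connect(endpoint_name):
    """Attach a predictor to an endpoint that already exists."""
    # Imported lazily so this module loads even without the SDK installed.
    from sagemaker.pytorch.model import PyTorchPredictor

    return PyTorchPredictor(endpoint_name=endpoint_name)


def predict_rows(predictor, rows):
    """Send a batch of rows and return the result as a plain list.

    `rows` may be a nested Python list; the predictor's serializer turns
    it into a tensor-shaped payload. Array results are converted with
    tolist(); anything else is returned unchanged.
    """
    result = predictor.predict(rows)
    return result.tolist() if hasattr(result, "tolist") else result
```

For example, predict_rows(connect("my-endpoint"), [[0.1, 0.2], [0.3, 0.4]]) would send a 2x2 input, where "my-endpoint" is a placeholder name.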

Initialize a PyTorchPredictor.

Parameters

PyTorch Processor

class sagemaker.pytorch.processing.PyTorchProcessor(framework_version, role=None, instance_count=None, instance_type=None, py_version='py3', image_uri=None, command=None, volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, code_location=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)

Bases: sagemaker.processing.FrameworkProcessor

Handles Amazon SageMaker processing tasks for jobs using PyTorch containers.

This processor executes a Python script in a PyTorch execution environment.

Unless image_uri is specified, the PyTorch environment is an Amazon-built Docker container that executes functions defined in the supplied code Python script.

The arguments have the exact same meaning as in FrameworkProcessor.
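
Running a processing job might be sketched as below. This is an illustration under assumptions: the script name preprocess.py, the scripts directory, the versions, the instance type, and the container paths are all placeholders, and run_preprocessing() needs the sagemaker package and AWS credentials.

```python
def processor_kwargs(role_arn, framework_version="2.0.1", py_version="py310"):
    """Collect constructor arguments for PyTorchProcessor.

    The version pair and instance type are illustrative assumptions.
    """
    return {
        "framework_version": framework_version,
        "py_version": py_version,
        "role": role_arn,
        "instance_count": 1,
        "instance_type": "ml.m5.xlarge",
    }


def run_preprocessing(role_arn, input_s3_uri, output_s3_uri):
    """Run a hypothetical preprocess.py script as a processing job."""
    # Imported lazily so this module loads even without the SDK installed.
    from sagemaker.processing import ProcessingInput, ProcessingOutput
    from sagemaker.pytorch.processing import PyTorchProcessor

    processor = PyTorchProcessor(**processor_kwargs(role_arn))
    processor.run(
        code="preprocess.py",      # hypothetical script name
        source_dir="scripts",      # hypothetical local directory
        inputs=[
            ProcessingInput(
                source=input_s3_uri,
                destination="/opt/ml/processing/input",
            )
        ],
        outputs=[
            ProcessingOutput(
                source="/opt/ml/processing/output",
                destination=output_s3_uri,
            )
        ],
    )
    return processor
```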

Parameters

estimator_cls

alias of sagemaker.pytorch.estimator.PyTorch