Inputs — sagemaker 2.247.0 documentation

Amazon SageMaker channel configurations for S3 data sources and file system data sources

class sagemaker.inputs.TrainingInput(s3_data, distribution=None, compression=None, content_type=None, record_wrapping=None, s3_data_type='S3Prefix', instance_groups=None, input_mode=None, attribute_names=None, target_attribute_name=None, shuffle_config=None, hub_access_config=None, model_access_config=None)

Bases: object

Amazon SageMaker channel configurations for S3 data sources.

config

A SageMaker DataSource referencing a SageMaker S3DataSource.

Type:

dict[str, dict]

Create a definition for input data used by a SageMaker training job.

See AWS documentation on the CreateTrainingJob API for more details on the parameters.

Parameters:

add_hub_access_config(hub_access_config=None)

Add Hub Access Config to the channel’s configuration.

Parameters:

add_model_access_config(model_access_config=None)

Add Model Access Config to the channel’s configuration.

Parameters:

model_access_config (dict) – Whether model terms of use have been accepted.
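As a minimal sketch of how a channel might be defined, the snippet below collects keyword arguments named after the TrainingInput signature above and builds the object on demand. The bucket name, the prefix, and the helper build_train_channel are hypothetical, and the "File" input mode is an assumed value from the CreateTrainingJob API. The import is deferred so the argument dictionary can be inspected without the SDK installed.

```python
# Keyword arguments named after the TrainingInput signature above.
# "example-bucket" and the prefix are hypothetical placeholders.
train_kwargs = {
    "s3_data": "s3://example-bucket/train/",
    "content_type": "text/csv",
    "s3_data_type": "S3Prefix",  # the documented default
    "input_mode": "File",        # assumed value from the CreateTrainingJob API
}


def build_train_channel(kwargs=None):
    """Build a TrainingInput (hypothetical helper; requires the sagemaker package)."""
    from sagemaker.inputs import TrainingInput  # deferred import

    return TrainingInput(**(kwargs or train_kwargs))
```

The returned object's config attribute holds the channel dictionary described above. Such an object is typically handed to an estimator, for example estimator.fit({"train": build_train_channel()}).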

class sagemaker.inputs.ShuffleConfig(seed)

Bases: object

For configuring channel shuffling using a seed.

For more detail, see the AWS documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/API_ShuffleConfig.html

Create a ShuffleConfig.

Parameters:

seed (long) – the long value used to seed the shuffled sequence.
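To illustrate why the seed matters, the sketch below uses Python's random module as a stand-in (it is not SageMaker's shuffling algorithm) to show that a fixed seed gives a reproducible order. It then shows how a ShuffleConfig might be attached to a channel through the shuffle_config parameter from the TrainingInput signature. The file names and both helper functions are hypothetical.

```python
import random


def seeded_order(items, seed):
    """Return a shuffled copy of items; a stand-in for seeded channel shuffling."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def build_shuffled_channel(s3_uri, seed):
    """Attach a ShuffleConfig to a channel (requires the sagemaker package)."""
    from sagemaker.inputs import ShuffleConfig, TrainingInput  # deferred import

    return TrainingInput(s3_uri, shuffle_config=ShuffleConfig(seed))


files = ["part-0.csv", "part-1.csv", "part-2.csv", "part-3.csv"]
# The same seed always produces the same order.
assert seeded_order(files, seed=1234) == seeded_order(files, seed=1234)
```

Reusing one seed across runs keeps the order repeatable, which can help when comparing training runs.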

class sagemaker.inputs.CreateModelInput(instance_type=None, accelerator_type=None)

Bases: object

A class containing parameters that can be used to create a SageMaker Model.

Parameters:

Method generated by attrs for class CreateModelInput.

instance_type: str

accelerator_type: str

class sagemaker.inputs.TransformInput(data, data_type='S3Prefix', content_type=None, compression_type=None, split_type=None, input_filter=None, output_filter=None, join_source=None, model_client_config=None, batch_data_capture_config=None)

Bases: object

Creates a class containing parameters for configuring input data for a batch transform job.

It can be used when calling sagemaker.transformer.Transformer.transform().

Parameters:

Method generated by attrs for class TransformInput.

data: str

data_type: str

content_type: str

compression_type: str

split_type: str

input_filter: str

output_filter: str

join_source: str

model_client_config: dict

batch_data_capture_config: dict
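A minimal sketch of a batch transform input follows, using keyword arguments named after the TransformInput signature above. The S3 URI and the helper build_transform_input are hypothetical. The values for split_type, input_filter, and join_source are assumptions based on the batch transform data-processing options, not values stated on this page.

```python
# Keyword arguments named after the TransformInput signature above.
# The S3 URI is a hypothetical placeholder.
transform_kwargs = {
    "data": "s3://example-bucket/batch-input/",
    "data_type": "S3Prefix",     # the documented default
    "content_type": "text/csv",
    "split_type": "Line",        # assumed: treat each line as one record
    "input_filter": "$[1:]",     # assumed JSONPath: skip the first column
    "join_source": "Input",      # assumed: join predictions onto the input
}


def build_transform_input(kwargs=None):
    """Build a TransformInput (hypothetical helper; requires the sagemaker package)."""
    from sagemaker.inputs import TransformInput  # deferred import

    return TransformInput(**(kwargs or transform_kwargs))
```

Because the class is generated by attrs, each argument is stored as an attribute of the same name, for example build_transform_input().data.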

class sagemaker.inputs.FileSystemInput(file_system_id, file_system_type, directory_path, file_system_access_mode='ro', content_type=None)

Bases: object

Amazon SageMaker channel configurations for file system data sources.

config

A SageMaker File System DataSource.

Type:

dict[str, dict]

Create a new file system input used by a SageMaker training job.

Parameters:
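As a rough sketch, a file system channel could be described with the keyword arguments below, which are named after the FileSystemInput signature above. The file system ID, the directory path, and the helper build_fs_channel are hypothetical. "EFS" is an assumed file system type taken from the CreateTrainingJob API.

```python
# Keyword arguments named after the FileSystemInput signature above.
fs_kwargs = {
    "file_system_id": "fs-0123456789abcdef0",  # hypothetical ID
    "file_system_type": "EFS",                 # assumed API value
    "directory_path": "/training-data",        # hypothetical path
    "file_system_access_mode": "ro",           # the documented default
}


def build_fs_channel(kwargs=None):
    """Build a FileSystemInput (hypothetical helper; requires the sagemaker package)."""
    from sagemaker.inputs import FileSystemInput  # deferred import

    return FileSystemInput(**(kwargs or fs_kwargs))
```

The returned object's config attribute holds the file system DataSource dictionary described above.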

class sagemaker.inputs.BatchDataCaptureConfig(destination_s3_uri, kms_key_id=None, generate_inference_id=None)

Bases: object

Configuration object passed in when creating a batch transform job.

Specifies configuration related to batch transform job data capture for use with Amazon SageMaker Model Monitoring.

Create a new BatchDataCaptureConfig.

Parameters:
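A minimal sketch of a capture configuration follows, with keyword arguments named after the BatchDataCaptureConfig signature above. The S3 URI and the helper build_capture_config are hypothetical.

```python
# Keyword arguments named after the BatchDataCaptureConfig signature above.
capture_kwargs = {
    "destination_s3_uri": "s3://example-bucket/batch-capture/",  # hypothetical
    "generate_inference_id": True,
}


def build_capture_config(kwargs=None):
    """Build a BatchDataCaptureConfig (hypothetical helper; requires sagemaker)."""
    from sagemaker.inputs import BatchDataCaptureConfig  # deferred import

    return BatchDataCaptureConfig(**(kwargs or capture_kwargs))
```

The resulting object would be supplied through a batch_data_capture_config parameter, such as the one in the TransformInput signature on this page.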

The input configs for DatasetDefinition.

DatasetDefinition supports data sources, such as S3, that can be queried via Athena and Redshift. It gives customers a mechanism to generate datasets from Athena or Redshift queries and to retrieve the data using Processing jobs, making it available to downstream processes.

class sagemaker.dataset_definition.inputs.RedshiftDatasetDefinition(cluster_id=None, database=None, db_user=None, query_string=None, cluster_role_arn=None, output_s3_uri=None, kms_key_id=None, output_format=None, output_compression=None)

Bases: ApiObject

DatasetDefinition for Redshift.

With this input, SQL queries will be executed using Redshift to generate datasets to S3.

Initialize RedshiftDatasetDefinition.

Parameters:
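As a sketch under stated assumptions, a Redshift dataset definition might be assembled from the keyword arguments below, which are named after the RedshiftDatasetDefinition signature above. Every value (cluster, database, user, query, role ARN, bucket) is a hypothetical placeholder. "PARQUET" is an assumed output format, and build_redshift_definition is a hypothetical helper.

```python
# Keyword arguments named after the RedshiftDatasetDefinition signature above.
# All values are hypothetical placeholders.
redshift_kwargs = {
    "cluster_id": "example-cluster",
    "database": "dev",
    "db_user": "awsuser",
    "query_string": "SELECT * FROM sales LIMIT 100",
    "cluster_role_arn": "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    "output_s3_uri": "s3://example-bucket/redshift-output/",
    "output_format": "PARQUET",  # assumed value
}


def build_redshift_definition(kwargs=None):
    """Build a RedshiftDatasetDefinition (requires the sagemaker package)."""
    from sagemaker.dataset_definition.inputs import (  # deferred import
        RedshiftDatasetDefinition,
    )

    return RedshiftDatasetDefinition(**(kwargs or redshift_kwargs))
```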

class sagemaker.dataset_definition.inputs.AthenaDatasetDefinition(catalog=None, database=None, query_string=None, output_s3_uri=None, work_group=None, kms_key_id=None, output_format=None, output_compression=None)

Bases: ApiObject

DatasetDefinition for Athena.

With this input, SQL queries will be executed using Athena to generate datasets to S3.

Initialize AthenaDatasetDefinition.

Parameters:

class sagemaker.dataset_definition.inputs.DatasetDefinition(data_distribution_type='ShardedByS3Key', input_mode='File', local_path=None, redshift_dataset_definition=None, athena_dataset_definition=None)

Bases: ApiObject

DatasetDefinition input.

Initialize DatasetDefinition.

Parameters:
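The sketch below shows one way a DatasetDefinition might wrap an Athena query, using parameter names from the AthenaDatasetDefinition and DatasetDefinition signatures above. The catalog, database, query, bucket, and local path are hypothetical placeholders. "PARQUET" is an assumed output format, and build_dataset_definition is a hypothetical helper.

```python
# Keyword arguments named after the AthenaDatasetDefinition signature above.
# All values are hypothetical placeholders.
athena_kwargs = {
    "catalog": "AwsDataCatalog",
    "database": "example_db",
    "query_string": "SELECT * FROM events LIMIT 100",
    "output_s3_uri": "s3://example-bucket/athena-output/",
    "output_format": "PARQUET",  # assumed value
}


def build_dataset_definition(local_path="/opt/ml/processing/input/athena"):
    """Wrap an Athena query in a DatasetDefinition (requires the sagemaker package)."""
    from sagemaker.dataset_definition.inputs import (  # deferred import
        AthenaDatasetDefinition,
        DatasetDefinition,
    )

    return DatasetDefinition(
        local_path=local_path,  # where the query results land in the container
        athena_dataset_definition=AthenaDatasetDefinition(**athena_kwargs),
    )
```

A Redshift query could be wrapped the same way through the redshift_dataset_definition parameter.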

class sagemaker.dataset_definition.inputs.S3Input(s3_uri=None, local_path=None, s3_data_type='S3Prefix', s3_input_mode='File', s3_data_distribution_type='FullyReplicated', s3_compression_type=None)

Bases: ApiObject

Metadata of data objects stored in S3.

Two options are provided: specifying an S3 prefix, or explicitly listing the files in a manifest file and referencing the manifest file’s S3 path. Note: Strong consistency is not guaranteed if S3Prefix is provided here. S3 list operations are not strongly consistent. Use ManifestFile if strong consistency is required.

Initialize S3Input.

Parameters:
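As a minimal sketch of the manifest option described above, the keyword arguments below are named after the S3Input signature and reference a manifest file rather than a prefix. The manifest URI, the local path, and the helper build_s3_input are hypothetical.

```python
# Keyword arguments named after the S3Input signature above.
s3_input_kwargs = {
    "s3_uri": "s3://example-bucket/manifests/train.manifest",  # hypothetical
    "local_path": "/opt/ml/processing/input/data",             # hypothetical
    "s3_data_type": "ManifestFile",  # list files explicitly instead of a prefix
    "s3_input_mode": "File",                         # the documented default
    "s3_data_distribution_type": "FullyReplicated",  # the documented default
}


def build_s3_input(kwargs=None):
    """Build an S3Input (hypothetical helper; requires the sagemaker package)."""
    from sagemaker.dataset_definition.inputs import S3Input  # deferred import

    return S3Input(**(kwargs or s3_input_kwargs))
```

Switching s3_data_type back to "S3Prefix" and pointing s3_uri at a prefix gives the other option.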