Machine learning - Amazon Redshift

Amazon Redshift machine learning (Amazon Redshift ML) is a robust, cloud-based service that makes it easier for analysts and data scientists of all skill levels to use machine learning technology. Amazon Redshift ML uses a model to generate results. You can use models in the following ways:

Note

Opting out of using your data for service improvement

If you are using Amazon Bedrock models, and you don't want AWS to process your data for service improvement purposes, you must enable the Opt-Out policy for Amazon Bedrock.

Note

LLMs can generate inaccurate or incomplete information. We recommend verifying the information that LLMs produce to ensure that it is accurate and complete.

How Amazon Redshift ML works with Amazon SageMaker

Amazon Redshift works with Amazon SageMaker Autopilot to automatically obtain the best model and make the prediction function available in Amazon Redshift.
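As a rough illustration of what this looks like from the user's side, the sketch below assembles a CREATE MODEL statement and a follow-up query that calls the resulting prediction function. It only builds SQL strings in Python and doesn't connect to a cluster. Every name in it (the model, table, columns, function, IAM role ARN, and S3 bucket) is a hypothetical placeholder, and the clause layout assumes the basic form of CREATE MODEL; check the CREATE MODEL reference for the options that apply to your setup.

```python
def build_create_model_sql(model_name, training_query, target_column,
                           function_name, iam_role_arn, s3_bucket):
    """Assemble a basic CREATE MODEL statement as text (nothing is executed)."""
    return "\n".join([
        f"CREATE MODEL {model_name}",
        f"FROM ({training_query})",          # training data selected in Redshift
        f"TARGET {target_column}",           # column the model learns to predict
        f"FUNCTION {function_name}",         # SQL function registered after training
        f"IAM_ROLE '{iam_role_arn}'",
        f"SETTINGS (S3_BUCKET '{s3_bucket}');",
    ])


def build_prediction_sql(function_name, feature_columns, table_name):
    """Assemble a query that calls the registered prediction function."""
    args = ", ".join(feature_columns)
    return f"SELECT {function_name}({args}) AS prediction FROM {table_name};"


# All values below are hypothetical placeholders.
FEATURES = ["age", "postal_code", "monthly_charges"]
feature_list = ", ".join(FEATURES)

create_sql = build_create_model_sql(
    model_name="customer_churn_model",
    training_query=f"SELECT {feature_list}, churned FROM customer_activity",
    target_column="churned",
    function_name="predict_customer_churn",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftMLRole",
    s3_bucket="example-redshift-ml-bucket",
)
predict_sql = build_prediction_sql("predict_customer_churn", FEATURES, "new_customers")

if __name__ == "__main__":
    print(create_sql)
    print(predict_sql)
```

The generated text could then be run with whatever SQL client you already use against your cluster; the prediction query is only usable once training has finished and the function has been registered.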

The following diagram illustrates how Amazon Redshift ML works.

Figure: Workflow for Amazon Redshift ML integrating with Amazon SageMaker Autopilot.

The general workflow is as follows:

  1. Amazon Redshift exports the training data into Amazon S3.
  2. Amazon SageMaker Autopilot preprocesses the training data. Preprocessing performs important functions, such as imputing missing values. It also recognizes that certain columns are categorical (such as the postal code), formats them properly for training, and performs numerous other tasks. Choosing the best preprocessors to apply to the training dataset is a problem in itself, and Amazon SageMaker Autopilot automates its solution.
  3. Amazon SageMaker Autopilot finds the algorithm and algorithm hyperparameters that deliver the model with the most accurate predictions.
  4. Amazon Redshift registers the prediction function as a SQL function in your Amazon Redshift cluster.
  5. When you run CREATE MODEL statements, Amazon Redshift uses Amazon SageMaker for training. Therefore, there is an associated cost for training your model. This cost appears as a separate line item for Amazon SageMaker in your AWS bill. You also pay for the Amazon S3 storage used to store your training data. Inference isn't charged for models created with CREATE MODEL that can be compiled and run on your Redshift cluster. There are no additional Amazon Redshift charges for using Amazon Redshift ML.
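To make the preprocessing in step 2 more concrete, here is a minimal sketch of two of the tasks it mentions: imputing missing values and encoding a categorical column. This is not how Amazon SageMaker Autopilot is implemented. Mean imputation and one-hot encoding are assumptions chosen as simple, common examples, and the column names and sample values are made up.

```python
from statistics import mean


def impute_missing(values):
    """Replace each None with the mean of the values that are present."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Return the sorted category list and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical sample columns. Postal codes are kept as strings because they
# are labels, not quantities, so they are treated as categorical.
ages = [34, None, 52, 46]
postal_codes = ["98101", "10001", "98101", "60601"]

imputed_ages = impute_missing(ages)
categories, encoded_postal = one_hot(postal_codes)

if __name__ == "__main__":
    print(imputed_ages)
    print(categories)
    print(encoded_postal)
```

Even in this small case there are choices to make (mean versus median imputation, one-hot versus another encoding), which is the kind of decision the workflow above says Autopilot automates.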