Built-in algorithms and pretrained models in Amazon SageMaker

Amazon SageMaker provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning practitioners quickly get started with training and deploying machine learning models. If you are new to SageMaker, choosing the right algorithm for your use case can be challenging. The following table is a quick cheat sheet: start from an example problem or use case and find a built-in algorithm offered by SageMaker that suits that problem type. The sections after the table give additional guidance organized by learning paradigm (supervised and unsupervised) and by important data domain (text and images).

Table: Mapping use cases to built-in algorithms

| Example problems and use cases | Learning paradigm or domain | Problem types | Data input format | Built-in algorithms |
| --- | --- | --- | --- | --- |
| Here are a few examples out of the 15 problem types that can be addressed by the pre-trained models and pre-built solution templates provided by SageMaker JumpStart. Question answering: a chatbot that outputs an answer for a given question. Text analysis: analyze texts using models specific to an industry domain such as finance. | Pre-trained models and pre-built solution templates | Image Classification, Tabular Classification, Tabular Regression, Text Classification, Object Detection, Text Embedding, Question Answering, Sentence Pair Classification, Image Embedding, Named Entity Recognition, Instance Segmentation, Text Generation, Text Summarization, Semantic Segmentation, Machine Translation | Image, Text, Tabular | Popular models, including MobileNet, YOLO, Faster R-CNN, BERT, LightGBM, and CatBoost. For a list of pre-trained models available, see JumpStart Models. For a list of pre-built solution templates available, see JumpStart Solutions. |
| Predict if an item belongs to a category: an email spam filter | Supervised learning | Binary/multi-class classification | Tabular | AutoGluon-Tabular, CatBoost, Factorization Machines, K-Nearest Neighbors (k-NN), LightGBM, Linear Learner, TabTransformer, XGBoost |
| Predict a numeric/continuous value: estimate the value of a house | Supervised learning | Regression | Tabular | AutoGluon-Tabular, CatBoost, Factorization Machines, K-Nearest Neighbors (k-NN), LightGBM, Linear Learner, TabTransformer, XGBoost |
| Based on historical data for a behavior, predict future behavior: predict sales of a new product based on previous sales data | Supervised learning | Time-series forecasting | Tabular | DeepAR forecasting |
| Improve the data embeddings of high-dimensional objects: identify duplicate support tickets or find the correct routing based on similarity of text in the tickets | Supervised learning | Embeddings: convert high-dimensional objects into low-dimensional space | Tabular | Object2Vec |
| Drop those columns from a dataset that have a weak relation with the label/target variable: the color of a car when predicting its mileage | Unsupervised learning | Feature engineering: dimensionality reduction | Tabular | Principal Component Analysis (PCA) |
| Detect abnormal behavior in an application: spot when an IoT sensor is sending abnormal readings | Unsupervised learning | Anomaly detection | Tabular | Random Cut Forest (RCF) |
| Protect your application from suspicious users: detect if an IP address accessing a service might be from a bad actor | Unsupervised learning | IP anomaly detection | Tabular | IP Insights |
| Group similar objects/data together: find high-, medium-, and low-spending customers from their transaction histories | Unsupervised learning | Clustering or grouping | Tabular | K-Means |
| Organize a set of documents into topics (not known in advance): tag a document as belonging to a medical category based on the terms used in the document | Unsupervised learning | Topic modeling | Text | Latent Dirichlet Allocation (LDA), Neural Topic Model (NTM) |
| Assign pre-defined categories to documents in a corpus: categorize books in a library into academic disciplines | Textual analysis | Text classification | Text | BlazingText, Text Classification - TensorFlow |
| Convert text from one language to another: Spanish to English | Textual analysis | Machine translation | Text | Sequence-to-Sequence |
| Summarize a long text corpus: an abstract for a research paper | Textual analysis | Text summarization | Text | Sequence-to-Sequence |
| Convert audio files to text: transcribe call center conversations for further analysis | Textual analysis | Speech-to-text | Text | Sequence-to-Sequence |
| Label/tag an image based on the content of the image: alerts about adult content in an image | Image processing | Image and multi-label classification | Image | Image Classification - MXNet |
| Classify something in an image using transfer learning | Image processing | Image classification | Image | Image Classification - TensorFlow |
| Detect people and objects in an image: police review a large photo gallery for a missing person | Image processing | Object detection and classification | Image | Object Detection - MXNet, Object Detection - TensorFlow |
| Tag every pixel of an image individually with a category: self-driving cars prepare to identify objects in their way | Image processing | Computer vision | Image | Semantic Segmentation |
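The tabular rows of the table above can also be read as a simple lookup from problem type to candidate algorithms. The following is a minimal sketch of that idea in plain Python. The names `ALGORITHMS_BY_PROBLEM_TYPE` and `suggest_algorithms` are hypothetical helpers made up for illustration and are not part of any SageMaker API; the entries are transcribed from the table, and only the tabular and text rows are included to keep the sketch short.

```python
# Hypothetical lookup transcribed from the use-case table above.
# Nothing here calls SageMaker; it only encodes the cheat sheet.
GENERAL_TABULAR = [
    "AutoGluon-Tabular", "CatBoost", "Factorization Machines",
    "K-Nearest Neighbors (k-NN)", "LightGBM", "Linear Learner",
    "TabTransformer", "XGBoost",
]

ALGORITHMS_BY_PROBLEM_TYPE = {
    "binary/multi-class classification": GENERAL_TABULAR,
    "regression": GENERAL_TABULAR,
    "time-series forecasting": ["DeepAR forecasting"],
    "embeddings": ["Object2Vec"],
    "dimensionality reduction": ["Principal Component Analysis (PCA)"],
    "anomaly detection": ["Random Cut Forest (RCF)"],
    "ip anomaly detection": ["IP Insights"],
    "clustering": ["K-Means"],
    "topic modeling": [
        "Latent Dirichlet Allocation (LDA)", "Neural Topic Model (NTM)",
    ],
    "text classification": ["BlazingText", "Text Classification - TensorFlow"],
}


def suggest_algorithms(problem_type):
    """Return the built-in algorithms the table lists for a problem type.

    Matching ignores case and surrounding whitespace. An unknown problem
    type raises KeyError with the list of known types.
    """
    key = problem_type.strip().lower()
    if key not in ALGORITHMS_BY_PROBLEM_TYPE:
        known = ", ".join(sorted(ALGORITHMS_BY_PROBLEM_TYPE))
        raise KeyError(
            f"Unknown problem type {problem_type!r}; known types: {known}"
        )
    # Return a copy so callers cannot mutate the shared table.
    return list(ALGORITHMS_BY_PROBLEM_TYPE[key])


if __name__ == "__main__":
    print(suggest_algorithms("Anomaly detection"))
```

A lookup like this only narrows the candidates. Choosing among, say, the eight general-purpose tabular algorithms still depends on the data and on the guidance in the sections that follow.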

For important information about items that are common to all of the built-in algorithms provided by SageMaker AI, see Parameters for Built-in Algorithms.

The following sections provide additional guidance for the Amazon SageMaker AI built-in algorithms grouped by the supervised and unsupervised learning paradigms to which they belong. For descriptions of these learning paradigms and their associated problem types, see Types of Algorithms. Sections are also provided for the SageMaker AI built-in algorithms available to address two important machine learning domains: textual analysis and image processing.

Pre-trained models and solution templates

SageMaker JumpStart provides a wide range of pre-trained models, pre-built solution templates, and examples for popular problem types. These can be used with the SageMaker SDK as well as in Studio Classic. For more information about these models, solutions, and the example notebooks provided by SageMaker JumpStart, see SageMaker JumpStart pretrained models.

Supervised learning

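Amazon SageMaker AI provides several built-in general-purpose algorithms that can be used for either classification or regression problems. As listed in the table above, these are AutoGluon-Tabular, CatBoost, Factorization Machines, K-Nearest Neighbors (k-NN), LightGBM, Linear Learner, TabTransformer, and XGBoost.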

Amazon SageMaker AI also provides several built-in supervised learning algorithms for more specialized tasks in feature engineering and time-series forecasting, such as Object2Vec for embeddings and DeepAR for forecasting.

Unsupervised learning

Amazon SageMaker AI provides several built-in algorithms that can be used for a variety of unsupervised learning tasks. These tasks include clustering, dimension reduction, pattern recognition, and anomaly detection.
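To make the clustering task concrete, the sketch below groups customers into low, medium, and high spenders, echoing the example in the table. It is a plain-Python illustration of the textbook Lloyd's algorithm on one-dimensional data, not the SageMaker AI K-Means implementation. The function name `kmeans_1d` and the spending figures are invented for illustration.

```python
def kmeans_1d(values, k, iterations=20):
    """Run Lloyd's algorithm on scalar values; return (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialization: pick k starting centroids spread
    # evenly across the sorted data instead of sampling at random.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[c]
            )
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, labels


# Made-up total spend per customer, for illustration only.
spend = [12.0, 15.0, 9.0, 110.0, 95.0, 120.0, 480.0, 510.0, 530.0]
centroids, labels = kmeans_1d(spend, k=3)

if __name__ == "__main__":
    for amount, label in zip(spend, labels):
        print(f"spend={amount:6.1f} -> cluster {label}")
```

No labels are supplied at any point: the groups emerge from the data alone, which is what distinguishes this task from the supervised classification problems described earlier.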

Textual analysis

SageMaker AI provides algorithms that are tailored to the analysis of textual documents. This includes text used in natural language processing, document classification or summarization, topic modeling or classification, and language transcription or translation.

Image processing

SageMaker AI also provides image processing algorithms that are used for image classification, object detection, and computer vision.
