trainNetwork - (Not recommended) Train neural network - MATLAB

(Not recommended) Train neural network

Syntax

Description

[net](#bu6sn4c-trainedNet) = trainNetwork([images](#mw%5F199f8be9-53f8-4bac-acdf-2846778f5904),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains the neural network specified by layers for image classification and regression tasks using the images and responses specified by images and the training options defined by options.

[net](#bu6sn4c-trainedNet) = trainNetwork([images](#mw%5F199f8be9-53f8-4bac-acdf-2846778f5904),[responses](#mw%5Fd0b3a2e4-09a0-42f9-a273-2bb25956fe66),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains using the images specified by images and responses specified by responses.

[net](#bu6sn4c-trainedNet) = trainNetwork([sequences](#mw%5F36a68d96-8505-4b8d-b338-44e1efa9cc5e),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains a neural network for sequence or time-series classification and regression tasks (for example, an LSTM or GRU neural network) using the sequences and responses specified by sequences.

[net](#bu6sn4c-trainedNet) = trainNetwork([sequences](#mw%5F36a68d96-8505-4b8d-b338-44e1efa9cc5e),[responses](#mw%5Fd0b3a2e4-09a0-42f9-a273-2bb25956fe66),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains using the sequences specified by sequences and responses specified by responses.

[net](#bu6sn4c-trainedNet) = trainNetwork([features](#mw%5F9d0e11fb-fed4-441b-a0c6-05d8274571e1),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains a neural network for feature classification or regression tasks (for example, a multilayer perceptron (MLP) neural network) using the feature data and responses specified by features.

[net](#bu6sn4c-trainedNet) = trainNetwork([features](#mw%5F9d0e11fb-fed4-441b-a0c6-05d8274571e1),[responses](#mw%5Fd0b3a2e4-09a0-42f9-a273-2bb25956fe66),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains using the feature data specified by features and responses specified by responses.

[net](#bu6sn4c-trainedNet) = trainNetwork([mixed](#mw%5F9717db9c-a40f-41b1-aceb-29019274a2d0),[layers](#bu6sn4c%5Fsep%5Fmw%5Fa4be7938-74e9-410c-8d2d-98ae715a53af),[options](#bu6sn4c-options)) trains a neural network with multiple inputs with mixed data types with the data and responses specified by mixed.

[[net](#bu6sn4c-trainedNet),[info](#bu6sn4c-traininfo)] = trainNetwork(___) also returns information on the training using any of the previous syntaxes.

Examples

Train Network for Sequence Classification

Train a deep learning LSTM network for sequence-to-label classification.

Load the example data from WaveformData.mat. The data is a numObservations-by-1 cell array of sequences, where numObservations is the number of sequences. Each sequence is a numChannels-by-numTimeSteps numeric array, where numChannels is the number of channels of the sequence and numTimeSteps is the number of time steps of the sequence.

Visualize some of the sequences in a plot.

% Load the example data (defines the variables data and labels).
load WaveformData

numChannels = size(data{1},1);

idx = [3 4 5 12];
figure
tiledlayout(2,2)
for i = 1:4
    nexttile
    stackedplot(data{idx(i)}', ...
        DisplayLabels="Channel " + string(1:numChannels))

    xlabel("Time Step")
    title("Class: " + string(labels(idx(i))))
end

Set aside data for testing. Partition the data into a training set containing 90% of the data and a test set containing the remaining 10% of the data. To partition the data, use the trainingPartitions function, attached to this example as a supporting file. To access this file, open the example as a live script.

numObservations = numel(data);
[idxTrain,idxTest] = trainingPartitions(numObservations, [0.9 0.1]);
XTrain = data(idxTrain);
TTrain = labels(idxTrain);

XTest = data(idxTest);
TTest = labels(idxTest);
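
The trainingPartitions supporting file is not reproduced on this page. As a rough sketch only, assuming the helper returns disjoint, randomly ordered index sets whose sizes follow the requested proportions, a comparable split could be written with randperm. The observation count and variable names below are illustrative and are not taken from the supporting file.

```matlab
% Hypothetical stand-in for the trainingPartitions supporting file.
% Assumption: the helper returns disjoint, randomly shuffled index sets.
numObservations = 1000;                 % example value, not from WaveformData
proportions = [0.9 0.1];                % training and test fractions

idxShuffled = randperm(numObservations);
numTrain = floor(proportions(1)*numObservations);

idxTrain = idxShuffled(1:numTrain);     % first 90% of the shuffled indices
idxTest  = idxShuffled(numTrain+1:end); % remaining 10%
```

Shuffling before splitting avoids a test set that contains only the last observations in the file, which matters when the data is stored in class order.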

Define the LSTM network architecture. Specify the input size as the number of channels of the input data. Specify an LSTM layer with 120 hidden units that outputs the last element of the sequence. Finally, include a fully connected layer with an output size that matches the number of classes, followed by a softmax layer and a classification layer.

numHiddenUnits = 120;
numClasses = numel(categories(TTrain));

layers = [ ...
    sequenceInputLayer(numChannels)
    lstmLayer(numHiddenUnits,OutputMode="last")
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]

layers = 5×1 Layer array with layers:

 1   ''   Sequence Input          Sequence input with 3 dimensions
 2   ''   LSTM                    LSTM with 120 hidden units
 3   ''   Fully Connected         4 fully connected layer
 4   ''   Softmax                 softmax
 5   ''   Classification Output   crossentropyex

Specify the training options. Train using the Adam solver with a learn rate of 0.01 and a gradient threshold of 1. Set the maximum number of epochs to 150 and shuffle every epoch. The software, by default, trains on a GPU if one is available. Using a GPU requires Parallel Computing Toolbox™ and a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox).

options = trainingOptions("adam", ...
    MaxEpochs=150, ...
    InitialLearnRate=0.01, ...
    Shuffle="every-epoch", ...
    GradientThreshold=1, ...
    Verbose=false, ...
    Plots="training-progress");

Train the LSTM network with the specified training options.

net = trainNetwork(XTrain,TTrain,layers,options);

Classify the test data. The call below does not set a mini-batch size, so classify uses its default, which matches the default mini-batch size used for training.

YTest = classify(net,XTest);

Calculate the classification accuracy of the predictions.

acc = mean(YTest == TTest)

Display the classification results in a confusion chart.

figure
confusionchart(TTest,YTest)

Input Arguments

`images` — Image data

datastore | numeric array | table

Image data, specified as one of the following:

Data Type	Description	Example Usage
Datastore	ImageDatastore	Datastore of images saved on disk.	Train image classification neural network with images saved on disk, where the images are the same size. When the images are different sizes, use an AugmentedImageDatastore object. ImageDatastore objects support image classification tasks only. To use image datastores for regression neural networks, create a transformed or combined datastore that contains the images and responses using the transform and combine functions, respectively.
AugmentedImageDatastore	Datastore that applies random affine geometric transformations, including resizing, rotation, reflection, shear, and translation.	Train image classification neural network with images saved on disk, where the images are different sizes. Train image classification neural network and generate new data using augmentations.
TransformedDatastore	Datastore that transforms batches of data read from an underlying datastore using a custom transformation function.	Train image regression neural network. Train neural networks with multiple inputs. Transform datastores with outputs not supported by trainNetwork. Apply custom transformations to datastore output.
CombinedDatastore	Datastore that reads from two or more underlying datastores.	Train image regression neural network. Train neural networks with multiple inputs. Combine predictors and responses from different data sources.
PixelLabelImageDatastore (Computer Vision Toolbox)	Datastore that applies identical affine geometric transformations to images and corresponding pixel labels.	Train neural network for semantic segmentation.
RandomPatchExtractionDatastore (Image Processing Toolbox)	Datastore that extracts pairs of random patches from images or pixel label images and optionally applies identical random affine geometric transformations to the pairs.	Train neural network for object detection.
DenoisingImageDatastore (Image Processing Toolbox)	Datastore that applies randomly generated Gaussian noise.	Train neural network for image denoising.
Custom mini-batch datastore	Custom datastore that returns mini-batches of data.	Train neural network using data in a format that other datastores do not support. For details, see Develop Custom Mini-Batch Datastore.
Numeric array	Images specified as numeric array. If you specify images as a numeric array, then you must also specify the responses argument.	Train neural network using data that fits in memory and does not require additional processing like augmentation.
Table	Images specified as a table. If you specify images as a table, then you can also specify which columns contain the responses using the responses argument.	Train neural network using data stored in a table.

For neural networks with multiple inputs, the datastore must be a TransformedDatastore or CombinedDatastore object.

Tip

For sequences of images, for example video data, use the sequences input argument.

Datastore

Datastores read mini-batches of images and responses. Datastores are best suited when you have data that does not fit in memory or when you want to apply augmentations or transformations to the data.

The following datastores are directly compatible with trainNetwork for image data:

ImageDatastore
AugmentedImageDatastore
CombinedDatastore
TransformedDatastore
PixelLabelImageDatastore (Computer Vision Toolbox)
RandomPatchExtractionDatastore (Image Processing Toolbox)
DenoisingImageDatastore (Image Processing Toolbox)
Custom mini-batch datastore. For details, see Develop Custom Mini-Batch Datastore.

For example, you can create an image datastore using the imageDatastore function and use the names of the folders containing the images as labels by setting the 'LabelSource' option to 'foldernames'. Alternatively, you can specify the labels manually using the Labels property of the image datastore.
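
As a minimal sketch of the folder-name labeling described above, the following code writes a few random images into a temporary folder and then builds the datastore. The class names "cat" and "dog", the image size, and the file names are placeholders chosen only to make the example self-contained.

```matlab
% Build a small throwaway image folder so the example is self-contained.
% The class names and file names are illustrative only.
rootFolder = tempname;
classNames = ["cat" "dog"];
for k = 1:numel(classNames)
    classFolder = fullfile(rootFolder,classNames(k));
    mkdir(classFolder);
    for n = 1:3
        img = uint8(randi(255,[28 28 3]));      % random RGB stand-in image
        imwrite(img,fullfile(classFolder,"img" + n + ".png"));
    end
end

% Label each image with the name of the folder that contains it.
imds = imageDatastore(rootFolder, ...
    IncludeSubfolders=true, ...
    LabelSource="foldernames");

% Summarize how many images carry each label.
labelCounts = countEachLabel(imds);
```

The Labels property of imds then holds one categorical label per file, so the datastore can be passed to trainNetwork for classification without a separate responses argument.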

Tip

Use augmentedImageDatastore for efficient preprocessing of images for deep learning, including image resizing. Do not use the ReadFcn option of ImageDatastore objects.

ImageDatastore allows batch reading of JPG or PNG image files using prefetching. If you set the ReadFcn option to a custom function, then ImageDatastore does not prefetch and is usually significantly slower.

For neural networks with multiple inputs, the datastore must be a TransformedDatastore or CombinedDatastore object.

The required format of the datastore output depends on the neural network architecture.

Neural Network Architecture	Datastore Output
Single input layer	Table or cell array with two columns. The first and second columns specify the predictors and targets, respectively. Table elements must be scalars, row vectors, or 1-by-1 cell arrays containing a numeric array. Custom mini-batch datastores must output tables.
Multiple input layers	Cell array with (numInputs + 1) columns, where numInputs is the number of neural network inputs. The first numInputs columns specify the predictors for each input and the last column specifies the targets. The order of inputs is given by the InputNames property of the layer graph layers.

Example output: table for a neural network with one input and one output:

data = read(ds)

data = 4×2 table
        Predictors        Response
    __________________    ________
    {224×224×3 double}       2
    {224×224×3 double}       7
    {224×224×3 double}       9
    {224×224×3 double}       9

Example output: cell array for a neural network with one input and one output:

data = read(ds)

data = 4×2 cell array
    {224×224×3 double}    {[2]}
    {224×224×3 double}    {[7]}
    {224×224×3 double}    {[9]}
    {224×224×3 double}    {[9]}

Example output: cell array for a neural network with two inputs and one output:

data = read(ds)

data = 4×3 cell array
    {224×224×3 double}    {128×128×3 double}    {[2]}
    {224×224×3 double}    {128×128×3 double}    {[2]}
    {224×224×3 double}    {128×128×3 double}    {[9]}
    {224×224×3 double}    {128×128×3 double}    {[9]}

The format of the predictors depends on the type of data.

Data	Format
2-D images	_h_-by-_w_-by-_c_ numeric array, where _h_, _w_, and _c_ are the height, width, and number of channels of the images, respectively.
3-D images	_h_-by-_w_-by-_d_-by-_c_ numeric array, where _h_, _w_, _d_, and _c_ are the height, width, depth, and number of channels of the images, respectively.

For predictors returned in tables, the elements must contain a numeric scalar, a numeric row vector, or a 1-by-1 cell array containing the numeric array.

The format of the responses depends on the type of task.

Task	Response Format
Image classification	Categorical scalar
Image regression	One of the following: numeric scalar; numeric vector; 3-D numeric array representing a 2-D image; 4-D numeric array representing a 3-D image

For responses returned in tables, the elements must be a categorical scalar, a numeric scalar, a numeric row vector, or a 1-by-1 cell array containing a numeric array.

For more information, see Datastores for Deep Learning.

Numeric Array

For data that fits in memory and does not require additional processing like augmentation, you can specify a data set of images as a numeric array. If you specify images as a numeric array, then you must also specify the responses argument.

The size and shape of the numeric array depends on the type of image data.

Data	Format
2-D images	_h_-by-_w_-by-_c_-by-_N_ numeric array, where _h_, _w_, and _c_ are the height, width, and number of channels of the images, respectively, and _N_ is the number of images.
3-D images	_h_-by-_w_-by-_d_-by-_c_-by-_N_ numeric array, where _h_, _w_, _d_, and _c_ are the height, width, depth, and number of channels of the images, respectively, and _N_ is the number of images.
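
To make the 2-D layout concrete, the sketch below stacks a handful of random grayscale images along the fourth dimension and pairs them with one categorical label per image. The sizes, the random pixel values, and the label names are illustrative assumptions.

```matlab
% Stack N grayscale images into an h-by-w-by-c-by-N array.
h = 28; w = 28; c = 1; N = 5;           % illustrative sizes
imageList = cell(N,1);
for n = 1:N
    imageList{n} = rand(h,w,c);         % stand-in for real image data
end
X = cat(4,imageList{:});                % observations along dimension 4

% One categorical label per image, as an N-by-1 vector.
T = categorical(["a";"b";"a";"b";"a"]);

sz = size(X);                           % [28 28 1 5]
```

With arrays shaped like this, the call takes the form trainNetwork(X,T,layers,options), where layers and options are defined separately.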

Table

As an alternative to datastores or numeric arrays, you can also specify images and responses in a table. If you specify images as a table, then you can also specify which columns contain the responses using the responses argument.

When specifying images and responses in a table, each row in the table corresponds to an observation.

For image input, the predictors must be in the first column of the table, specified as one of the following:

Absolute or relative file path to an image, specified as a character vector
1-by-1 cell array containing an _h_-by-_w_-by-_c_ numeric array representing a 2-D image, where _h_, _w_, and _c_ correspond to the height, width, and number of channels of the image, respectively.

The format of the responses depends on the type of task.

Task	Response Format
Image classification	Categorical scalar
Image regression	One of the following: numeric scalar; two or more columns of scalar values; 1-by-1 cell array containing an _h_-by-_w_-by-_c_ numeric array representing a 2-D image; 1-by-1 cell array containing an _h_-by-_w_-by-_d_-by-_c_ numeric array representing a 3-D image

For neural networks with image input, if you do not specify responses, then the function, by default, uses the first column of the table for the predictors and the subsequent columns as responses.
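
A table for image classification might be assembled as in the following sketch. The file paths are placeholders that are never read here, and the variable names imageFile and label are arbitrary.

```matlab
% Each row is one observation: an image file path and its class label.
% The paths are placeholders; no files are opened in this sketch.
imageFile = {'images/img001.png'; 'images/img002.png'; 'images/img003.png'};
label = categorical({'cat'; 'dog'; 'cat'});

% Predictors go in the first column, responses in the columns after it.
tbl = table(imageFile,label);
```

Because the predictors are in the first column and the labels follow, the responses argument can be omitted for a table like this, or given explicitly as "label".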

Tip

If the predictors or the responses contain NaNs, then they are propagated through the neural network during training. In these cases, the training usually fails to converge.
For regression tasks, normalizing the responses often helps to stabilize and speed up training. For more information, see Train Convolutional Neural Network for Regression.
This argument supports complex-valued predictors. To train a network with complex-valued predictors using the trainNetwork function, the SplitComplexInputs option of the input layer must be 1 (true).

`sequences` — Sequence or time series data

datastore | cell array of numeric arrays | numeric array

Sequence or time series data, specified as one of the following:

Data Type	Description	Example Usage
Datastore	TransformedDatastore	Datastore that transforms batches of data read from an underlying datastore using a custom transformation function.	Transform datastores with outputs not supported by trainNetwork. Apply custom transformations to datastore output.
CombinedDatastore	Datastore that reads from two or more underlying datastores.	Combine predictors and responses from different data sources.
Custom mini-batch datastore	Custom datastore that returns mini-batches of data.	Train neural network using data in a format that other datastores do not support. For details, see Develop Custom Mini-Batch Datastore.
Numeric or cell array	A single sequence specified as a numeric array or a data set of sequences specified as cell array of numeric arrays. If you specify sequences as a numeric or cell array, then you must also specify the responses argument.	Train neural network using data that fits in memory and does not require additional processing like custom transformations.

Datastore

Datastores read mini-batches of sequences and responses. Datastores are best suited when you have data that does not fit in memory or when you want to apply transformations to the data.

The following datastores are directly compatible with trainNetwork for sequence data:

CombinedDatastore
TransformedDatastore
Custom mini-batch datastore. For details, see Develop Custom Mini-Batch Datastore.

You can use other built-in datastores for training deep learning neural networks by using the transform and combine functions. These functions can convert the data read from datastores to the table or cell array format required by trainNetwork. For example, you can transform and combine data read from in-memory arrays and CSV files using ArrayDatastore and TabularTextDatastore objects, respectively.

The datastore must return data in a table or cell array. Custom mini-batch datastores must output tables.

Example output of a datastore that returns a table:

data = read(ds)

data = 4×2 table
      Predictors       Response
    _______________    ________
    {12×50 double}        2
    {12×50 double}        7
    {12×50 double}        9
    {12×50 double}        9

Example output of a datastore that returns a cell array:

data = read(ds)

data = 4×2 cell array
    {12×50 double}    {[2]}
    {12×50 double}    {[7]}
    {12×50 double}    {[9]}
    {12×50 double}    {[9]}
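
The combine approach mentioned above can be sketched for in-memory sequence data as follows. The sequence sizes, the label values, and the variable names are illustrative assumptions; the point is only that each read returns one row of predictors and response in a cell array.

```matlab
% Illustrative in-memory data: 4 sequences with 12 features and 50 time steps.
numObservations = 4;
sequences = cell(numObservations,1);
for n = 1:numObservations
    sequences{n} = rand(12,50);
end
labels = categorical([2; 7; 9; 9]);

% OutputType="same" returns each sequence in a 1-by-1 cell, not a nested cell.
dsX = arrayDatastore(sequences,OutputType="same");
dsT = arrayDatastore(labels);

% Each read of the combined datastore returns one row: {predictors, response}.
ds = combine(dsX,dsT);
data = read(ds);
```

Here data is a 1-by-2 cell array holding a 12-by-50 numeric array and a categorical scalar, which matches the two-column cell array format shown above.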

The format of the predictors depends on the type of data.

Data	Format of Predictors
Vector sequence	_c_-by-_s_ matrix, where _c_ is the number of features of the sequence and _s_ is the sequence length.
1-D image sequence	_h_-by-_c_-by-_s_ array, where _h_ and _c_ correspond to the height and number of channels of the image, respectively, and _s_ is the sequence length. Each sequence in the mini-batch must have the same sequence length.
2-D image sequence	_h_-by-_w_-by-_c_-by-_s_ array, where _h_, _w_, and _c_ correspond to the height, width, and number of channels of the image, respectively, and _s_ is the sequence length. Each sequence in the mini-batch must have the same sequence length.
3-D image sequence	_h_-by-_w_-by-_d_-by-_c_-by-_s_ array, where _h_, _w_, _d_, and _c_ correspond to the height, width, depth, and number of channels of the image, respectively, and _s_ is the sequence length. Each sequence in the mini-batch must have the same sequence length.

For predictors returned in tables, the elements must contain a numeric scalar, a numeric row vector, or a 1-by-1 cell array containing a numeric array.

The format of the responses depends on the type of task.

Task	Format of Responses
Sequence-to-label classification	Categorical scalar
Sequence-to-one regression	Scalar
Sequence-to-vector regression	Numeric row vector
Sequence-to-sequence classification	One of the following: (a) 1-by-_s_ sequence of categorical labels, where _s_ is the sequence length of the corresponding predictor sequence. (b) _h_-by-_w_-by-_s_ sequence of categorical labels, where _h_, _w_, and _s_ are the height, width, and sequence length of the corresponding predictor sequence, respectively. (c) _h_-by-_w_-by-_d_-by-_s_ sequence of categorical labels, where _h_, _w_, _d_, and _s_ are the height, width, depth, and sequence length of the corresponding predictor sequence, respectively. Each sequence in the mini-batch must have the same sequence length.
Sequence-to-sequence regression	One of the following: (a) _R_-by-_s_ matrix, where _R_ is the number of responses and _s_ is the sequence length of the corresponding predictor sequence. (b) _h_-by-_w_-by-_R_-by-_s_ sequence of numeric responses, where _R_ is the number of responses, and _h_, _w_, and _s_ are the height, width, and sequence length of the corresponding predictor sequence, respectively. (c) _h_-by-_w_-by-_d_-by-_R_-by-_s_ sequence of numeric responses, where _R_ is the number of responses, and _h_, _w_, _d_, and _s_ are the height, width, depth, and sequence length of the corresponding predictor sequence, respectively. Each sequence in the mini-batch must have the same sequence length.

For responses returned in tables, the elements must be a categorical scalar, a numeric scalar, a numeric row vector, or a 1-by-1 cell array containing a numeric array.

For more information, see Datastores for Deep Learning.

Numeric or Cell Array

For data that fits in memory and does not require additional processing like custom transformations, you can specify a single sequence as a numeric array or a data set of sequences as a cell array of numeric arrays. If you specify sequences as a cell or numeric array, then you must also specify the responses argument.

For cell array input, the cell array must be an _N_-by-1 cell array of numeric arrays, where _N_ is the number of observations. The size and shape of the numeric array representing a sequence depends on the type of sequence data.

Input	Description
Vector sequences	_c_-by-_s_ matrices, where _c_ is the number of features of the sequences and _s_ is the sequence length.
1-D image sequences	_h_-by-_c_-by-_s_ arrays, where _h_ and _c_ correspond to the height and number of channels of the images, respectively, and _s_ is the sequence length.
2-D image sequences	_h_-by-_w_-by-_c_-by-_s_ arrays, where _h_, _w_, and _c_ correspond to the height, width, and number of channels of the images, respectively, and _s_ is the sequence length.
3-D image sequences	_h_-by-_w_-by-_d_-by-_c_-by-_s_ arrays, where _h_, _w_, _d_, and _c_ correspond to the height, width, depth, and number of channels of the 3-D images, respectively, and _s_ is the sequence length.
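
For vector sequences, the cell array layout can be sketched as below. The channel count, sequence lengths, and labels are made-up values; the lengths differ between observations, as they do in the WaveformData example earlier on this page.

```matlab
% N-by-1 cell array of vector sequences, each a c-by-s matrix.
numChannels = 3;
sequenceLengths = [40 55 62];           % illustrative values
N = numel(sequenceLengths);

XTrain = cell(N,1);
for n = 1:N
    XTrain{n} = rand(numChannels,sequenceLengths(n));
end

% One categorical label per sequence for sequence-to-label classification.
TTrain = categorical(["up";"down";"up"]);
```

The first dimension of every sequence equals numChannels, which is the value to pass to sequenceInputLayer.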

The trainNetwork function supports neural networks with at most one sequence input layer.

Tip

If the predictors or the responses contain NaNs, then they are propagated through the neural network during training. In these cases, the training usually fails to converge.
For regression tasks, normalizing the responses often helps to stabilize and speed up training. For more information, see Train Convolutional Neural Network for Regression.
This argument supports complex-valued predictors. To train a network with complex-valued predictors using the trainNetwork function, the SplitComplexInputs option of the input layer must be 1 (true).

`features` — Feature data

datastore | numeric array | table

Feature data, specified as one of the following:

Data Type	Description	Example Usage
Datastore	TransformedDatastore	Datastore that transforms batches of data read from an underlying datastore using a custom transformation function.	Train neural networks with multiple inputs. Transform datastores with outputs not supported by trainNetwork. Apply custom transformations to datastore output.
CombinedDatastore	Datastore that reads from two or more underlying datastores.	Train neural networks with multiple inputs. Combine predictors and responses from different data sources.
Custom mini-batch datastore	Custom datastore that returns mini-batches of data.	Train neural network using data in a format that other datastores do not support. For details, see Develop Custom Mini-Batch Datastore.
Table	Feature data specified as a table. If you specify features as a table, then you can also specify which columns contain the responses using the responses argument.	Train neural network using data stored in a table.
Numeric array	Feature data specified as numeric array. If you specify features as a numeric array, then you must also specify the responses argument.	Train neural network using data that fits in memory and does not require additional processing like custom transformations.

Datastore

Datastores read mini-batches of feature data and responses. Datastores are best suited when you have data that does not fit in memory or when you want to apply transformations to the data.

The following datastores are directly compatible with trainNetwork for feature data:

CombinedDatastore
TransformedDatastore
Custom mini-batch datastore. For details, see Develop Custom Mini-Batch Datastore.

You can use other built-in datastores for training deep learning neural networks by using the transform and combine functions. These functions can convert the data read from datastores to the table or cell array format required by trainNetwork. For more information, see Datastores for Deep Learning.

For neural networks with multiple inputs, the datastore must be a TransformedDatastore or CombinedDatastore object.

The datastore must return data in a table or a cell array. Custom mini-batch datastores must output tables. The format of the datastore output depends on the neural network architecture.

Neural Network Architecture	Datastore Output
Single input layer	Table or cell array with two columns. The first and second columns specify the predictors and responses, respectively. Table elements must be scalars, row vectors, or 1-by-1 cell arrays containing a numeric array. Custom mini-batch datastores must output tables.
Multiple input layers	Cell array with (numInputs + 1) columns, where numInputs is the number of neural network inputs. The first numInputs columns specify the predictors for each input and the last column specifies the responses. The order of inputs is given by the InputNames property of the layer graph layers.

Example output: table for a neural network with one input and one output:

data = read(ds)

data = 4×2 table
     Predictors       Response
    ______________    ________
    {24×1 double}        2
    {24×1 double}        7
    {24×1 double}        9
    {24×1 double}        9

Example output: cell array for a neural network with one input and one output:

data = read(ds)

data = 4×2 cell array
    {24×1 double}    {[2]}
    {24×1 double}    {[7]}
    {24×1 double}    {[9]}
    {24×1 double}    {[9]}

Example output: cell array for a neural network with two inputs and one output:

data = read(ds)

data = 4×3 cell array
    {24×1 double}    {28×1 double}    {[2]}
    {24×1 double}    {28×1 double}    {[2]}
    {24×1 double}    {28×1 double}    {[9]}
    {24×1 double}    {28×1 double}    {[9]}

The predictors must be _c_-by-1 column vectors, where _c_ is the number of features.

The format of the responses depends on the type of task.

Task	Format of Responses
Classification	Categorical scalar
Regression	Scalar or numeric vector
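
One way the transform function can produce this format is sketched below, assuming the raw data is a matrix whose rows hold three features followed by one numeric response. The matrix contents and the helper name toPair are illustrative, not part of trainNetwork.

```matlab
% Illustrative matrix: 3 feature columns followed by 1 response column.
raw = [rand(5,3) (1:5)'];

% With OutputType="same", each read returns one 1-by-4 numeric row.
dsRaw = arrayDatastore(raw,OutputType="same");

% Convert each row into {c-by-1 predictors, scalar response}.
toPair = @(row) {row(1:3)', row(4)};
ds = transform(dsRaw,toPair);

data = read(ds);
```

Each read now yields a 1-by-2 cell array whose first element is a 3-by-1 column vector of predictors and whose second element is the scalar response.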

For more information, see Datastores for Deep Learning.

Table

For feature data that fits in memory and does not require additional processing like custom transformations, you can specify feature data and responses as a table.

Each row in the table corresponds to an observation. The arrangement of predictors and responses in the table columns depends on the type of task.

Task	Predictors	Responses
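Feature classification	Features specified in one or more columns as scalars. If you do not specify the responses argument, then the predictors must be in the first numFeatures columns of the table, where numFeatures is the number of features of the input data.	Categorical label
Feature regression	Features specified in one or more columns as scalars. If you do not specify the responses argument, then the predictors must be in the first numFeatures columns of the table, where numFeatures is the number of features of the input data.	One or more columns of scalar values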

For classification neural networks with feature input, if you do not specify the responses argument, then the function, by default, uses the first (numColumns - 1) columns of the table for the predictors and the last column for the labels, where numColumns is the number of columns in the table.

For regression neural networks with feature input, if you do not specify the responses argument, then the function, by default, uses the first numFeatures columns for the predictors and the subsequent columns for the responses, where numFeatures is the number of features in the input data.
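
The default column convention for classification can be illustrated with a small table. The column names f1, f2, f3, and label, and the random values, are assumptions made for this sketch.

```matlab
% Three numeric feature columns followed by a categorical label column.
numObservations = 6;
f1 = rand(numObservations,1);
f2 = rand(numObservations,1);
f3 = rand(numObservations,1);
label = categorical(["a";"b";"a";"b";"a";"b"]);
tbl = table(f1,f2,f3,label);

% With the label last, the first (numColumns - 1) columns act as predictors.
numFeatures = width(tbl) - 1;
```

The resulting numFeatures value is the input size to give featureInputLayer when the label column is left in the last position.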

Numeric Array

For feature data that fits in memory and does not require additional processing like custom transformations, you can specify feature data as a numeric array. If you specify feature data as a numeric array, then you must also specify the responses argument.

The numeric array must be an _N_-by-numFeatures numeric array, where _N_ is the number of observations and numFeatures is the number of features of the input data.

Tip

Normalizing the responses often helps to stabilize and speed up training of neural networks for regression. For more information, see Train Convolutional Neural Network for Regression.
Responses must not contain NaNs. If the predictor data contains NaNs, then they are propagated through the training. However, in most cases, the training fails to converge.
This argument supports complex-valued predictors. To train a network with complex-valued predictors using the trainNetwork function, the SplitComplexInputs option of the input layer must be 1 (true).

`mixed` — Mixed data

datastore

Mixed data and responses, specified as one of the following:

Data Type	Description	Example Usage
TransformedDatastore	Datastore that transforms batches of data read from an underlying datastore using a custom transformation function.	Train neural networks with multiple inputs. Transform outputs of datastores not supported by trainNetwork to have the required format. Apply custom transformations to datastore output.
CombinedDatastore	Datastore that reads from two or more underlying datastores.	Train neural networks with multiple inputs. Combine predictors and responses from different data sources.
Custom mini-batch datastore	Custom datastore that returns mini-batches of data.	Train neural network using data in a format that other datastores do not support. For details, see Develop Custom Mini-Batch Datastore.

You can use other built-in datastores for training deep learning neural networks by using the transform and combine functions. These functions can convert the data read from datastores to the table or cell array format required by trainNetwork. For more information, see Datastores for Deep Learning.

The datastore must return data in a table or a cell array. Custom mini-batch datastores must output tables. The format of the datastore output depends on the neural network architecture.

Datastore Output
Cell array with (numInputs + 1) columns, where numInputs is the number of neural network inputs. The first numInputs columns specify the predictors for each input and the last column specifies the responses. The order of inputs is given by the InputNames property of the layer graph layers.

Example output:

data = read(ds)

data = 4×3 cell array
    {24×1 double}    {28×1 double}    {[2]}
    {24×1 double}    {28×1 double}    {[2]}
    {24×1 double}    {28×1 double}    {[9]}
    {24×1 double}    {28×1 double}    {[9]}
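
A datastore with this three-column layout could be assembled from in-memory arrays roughly as follows. The two feature inputs of sizes 24 and 28 mirror the example output above; the random values and variable names are illustrative.

```matlab
numObservations = 4;
X1 = rand(24,numObservations);          % first input: 24 features per observation
X2 = rand(28,numObservations);          % second input: 28 features per observation
T  = [2; 2; 9; 9];                      % numeric responses

% Iterate over columns so that each read returns one 24-by-1 or 28-by-1 vector.
dsX1 = arrayDatastore(X1,IterationDimension=2);
dsX2 = arrayDatastore(X2,IterationDimension=2);
dsT  = arrayDatastore(T);

% Column order is input 1, input 2, then the response.
ds = combine(dsX1,dsX2,dsT);
data = read(ds);
```

The order in which the predictor datastores are passed to combine has to agree with the order of the network inputs given by the InputNames property.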

For image, sequence, and feature predictor input, the format of the predictors must match the formats described in the images, sequences, or features argument descriptions, respectively. Similarly, the format of the responses must match the formats described in the images, sequences, or features argument descriptions that correspond to the type of task.

The trainNetwork function supports neural networks with at most one sequence input layer.

For an example showing how to train a neural network with multiple inputs, see Train Network on Image and Feature Data.

Tip

To convert a numeric array to a datastore, use ArrayDatastore.
When combining layers in a neural network with mixed types of data, you may need to reformat the data before passing it to a combination layer (such as a concatenation or an addition layer). To reformat the data, you can use a flatten layer to flatten the spatial dimensions into the channel dimension, or create a FunctionLayer object or custom layer that reformats and reshapes.
This argument supports complex-valued predictors. To train a network with complex-valued predictors using the trainNetwork function, the SplitComplexInputs option of the input layer must be 1 (true).

`responses` — Responses

Responses.

When the input data is a numeric array or a cell array, specify the responses as one of the following.

categorical vector of labels
numeric array of numeric responses
cell array of categorical or numeric sequences

When the input data is a table, you can optionally specify which columns of the table contain the responses as one of the following:

character vector
cell array of character vectors
string array

When the input data is a numeric array or a cell array, then the format of the responses depends on the type of task.

Task	Subtask	Format
Classification	Image classification	_N_-by-1 categorical vector of labels, where N is the number of observations.
Classification	Feature classification	_N_-by-1 categorical vector of labels, where N is the number of observations.
Classification	Sequence-to-label classification	_N_-by-1 categorical vector of labels, where N is the number of observations.
Classification	Sequence-to-sequence classification	_N_-by-1 cell array of categorical sequences of labels, where N is the number of observations. Each sequence must have the same number of time steps as the corresponding predictor sequence. For sequence-to-sequence classification tasks with one observation, sequences can also be a vector. In this case, responses must be a categorical row vector of labels.
Regression	2-D image regression	One of the following: (1) _N_-by-R matrix, where N is the number of images and R is the number of responses. (2) _h_-by-_w_-by-_c_-by-N numeric array, where h, w, and c are the height, width, and number of channels of the images, respectively, and N is the number of images.
Regression	3-D image regression	One of the following: (1) _N_-by-R matrix, where N is the number of images and R is the number of responses. (2) _h_-by-_w_-by-_d_-by-_c_-by-N numeric array, where h, w, d, and c are the height, width, depth, and number of channels of the images, respectively, and N is the number of images.
Regression	Feature regression	_N_-by-R matrix, where N is the number of observations and R is the number of responses.
Regression	Sequence-to-one regression	_N_-by-R matrix, where N is the number of sequences and R is the number of responses.
Regression	Sequence-to-sequence regression	_N_-by-1 cell array of numeric sequences, where N is the number of sequences, with sequences given by one of the following: (1) _R_-by-s matrix, where R is the number of responses and s is the sequence length of the corresponding predictor sequence. (2) _h_-by-_w_-by-_R_-by-s array, where h and w are the height and width of the output, respectively, R is the number of responses, and s is the sequence length of the corresponding predictor sequence. (3) _h_-by-_w_-by-_d_-by-_R_-by-s array, where h, w, and d are the height, width, and depth of the output, respectively, R is the number of responses, and s is the sequence length of the corresponding predictor sequence. For sequence-to-sequence regression tasks with one observation, sequences can be a numeric array. In this case, responses must be a numeric array of responses.
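The shapes in the table can be illustrated with a few made-up response variables. This is only a sketch: the sizes, class names, and variable names below are assumptions chosen for the example.

```matlab
N = 5;                                   % hypothetical number of observations

% Image, feature, or sequence-to-label classification: N-by-1 categorical vector.
TClass = categorical(["cat"; "dog"; "cat"; "bird"; "dog"]);

% Feature or sequence-to-one regression: N-by-R matrix (here R = 3).
R = 3;
TReg = rand(N,R);

% Sequence-to-sequence regression: N-by-1 cell array of R-by-s matrices,
% where s matches the length of the corresponding predictor sequence.
seqLengths = [10 12 8 15 9];
TSeq = cell(N,1);
for i = 1:N
    TSeq{i} = rand(R,seqLengths(i));
end
```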

Tip

Responses must not contain NaNs. If the predictor data contains NaNs, then they are propagated through the training. However, in most cases, the training fails to converge.
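One way to guard against this, sketched here for in-memory feature data, is to drop any observation whose predictors or responses contain NaN before training. The data below is synthetic and the variable names are assumptions.

```matlab
% Hypothetical feature data with one corrupted predictor value.
X = rand(100,4);
T = rand(100,1);
X(3,2) = NaN;

% Keep only observations with no NaN in the predictors or the responses.
rowsOk = ~any(isnan(X),2) & ~any(isnan(T),2);
XClean = X(rowsOk,:);
TClean = T(rowsOk,:);

fprintf('Removed %d of %d observations.\n',sum(~rowsOk),numel(rowsOk));
```

Whether to remove or impute such observations depends on the data set. Removal is just the simplest option to show.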

`layers` — Neural network layers

Layer array | LayerGraph object

Neural network layers, specified as a Layer array or a LayerGraph object.

To create a neural network with all layers connected sequentially, you can use a Layer array as the input argument. In this case, the returned neural network is a SeriesNetwork object.

A directed acyclic graph (DAG) neural network has a complex structure in which layers can have multiple inputs and outputs. To create a DAG neural network, specify the neural network architecture as a LayerGraph object and then use that layer graph as the input argument to trainNetwork.

The trainNetwork function supports neural networks with at most one sequence input layer.

For a list of built-in layers, see List of Deep Learning Layers.
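To make the two forms concrete, here is a small sketch of a sequential Layer array and of a LayerGraph with a skip connection. The layer sizes and names are arbitrary choices for illustration.

```matlab
% Sequential network: a Layer array (training it returns a SeriesNetwork).
layers = [
    featureInputLayer(4)
    fullyConnectedLayer(3)
    softmaxLayer
    classificationLayer];

% DAG network: start from a layer array, then add a skip connection.
lgraph = layerGraph([
    imageInputLayer([28 28 1],'Name','in')
    convolution2dLayer(3,8,'Padding','same','Name','conv1')
    reluLayer('Name','relu1')
    convolution2dLayer(3,8,'Padding','same','Name','conv2')
    additionLayer(2,'Name','add')
    fullyConnectedLayer(10,'Name','fc')
    softmaxLayer('Name','sm')
    classificationLayer('Name','out')]);

% conv2 already feeds add/in1; route relu1 to the second addition input.
lgraph = connectLayers(lgraph,'relu1','add/in2');
```

Passing layers to trainNetwork would give a SeriesNetwork object, and passing lgraph would give a DAGNetwork object, as described for the net output argument.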

`options` — Training options

TrainingOptionsSGDM | TrainingOptionsRMSProp | TrainingOptionsADAM

Training options, specified as a TrainingOptionsSGDM, TrainingOptionsRMSProp, or TrainingOptionsADAM object returned by the trainingOptions function.
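For instance, a TrainingOptionsADAM object could be created as follows. The specific values are placeholders, not recommendations.

```matlab
% Illustrative settings only; tune these for your own data.
options = trainingOptions('adam', ...
    'InitialLearnRate',1e-3, ...
    'MaxEpochs',5, ...
    'MiniBatchSize',32, ...
    'Shuffle','every-epoch', ...
    'Verbose',false);
```

The first argument selects the solver ('sgdm', 'rmsprop', or 'adam') and therefore the class of the returned object.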

Output Arguments

`net` — Trained neural network

SeriesNetwork object | DAGNetwork object

Trained neural network, returned as a SeriesNetwork object or a DAGNetwork object.

If you train the neural network using a Layer array, then net is a SeriesNetwork object. If you train the neural network using a LayerGraph object, then net is a DAGNetwork object.

`info` — Training information

structure

Training information, returned as a structure, where each field is a scalar or a numeric vector with one element per training iteration.

For classification tasks, info contains the following fields:

TrainingLoss — Loss function values
TrainingAccuracy — Training accuracies
ValidationLoss — Loss function values
ValidationAccuracy — Validation accuracies
BaseLearnRate — Learning rates
FinalValidationLoss — Validation loss of returned neural network
FinalValidationAccuracy — Validation accuracy of returned neural network
OutputNetworkIteration — Iteration number of returned neural network

For regression tasks, info contains the following fields:

TrainingLoss — Loss function values
TrainingRMSE — Training RMSE values
ValidationLoss — Loss function values
ValidationRMSE — Validation RMSE values
BaseLearnRate — Learning rates
FinalValidationLoss — Validation loss of returned neural network
FinalValidationRMSE — Validation RMSE of returned neural network
OutputNetworkIteration — Iteration number of returned neural network

The structure contains the fields ValidationLoss, ValidationAccuracy, ValidationRMSE, FinalValidationLoss, FinalValidationAccuracy, and FinalValidationRMSE only when options specifies validation data. The ValidationFrequency training option determines the iterations at which the software calculates validation metrics. The final validation metrics are scalar. The other fields of the structure are row vectors, where each element corresponds to a training iteration. For iterations when the software does not calculate validation metrics, the corresponding values in the structure are NaN.

For neural networks containing batch normalization layers, if the BatchNormalizationStatistics training option is 'population' then the final validation metrics are often different from the validation metrics evaluated during training. This is because batch normalization layers in the final neural network perform different operations than during training. For more information, see batchNormalizationLayer.
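The following sketch shows one way to read the info structure after a short classification run. The toy data, network, and option values are assumptions made up for this example. The only point is how to pick out the iterations at which validation metrics exist.

```matlab
% Hypothetical toy data: 4 features, 3 classes.
XTrain = rand(64,4);
TTrain = categorical(mod((1:64)',3));
XVal   = rand(15,4);
TVal   = categorical(mod((1:15)',3));

layers = [
    featureInputLayer(4)
    fullyConnectedLayer(3)
    softmaxLayer
    classificationLayer];

options = trainingOptions('sgdm', ...
    'MaxEpochs',2, ...
    'MiniBatchSize',16, ...
    'ValidationData',{XVal,TVal}, ...
    'ValidationFrequency',2, ...
    'Verbose',false);

[net,info] = trainNetwork(XTrain,TTrain,layers,options);

% Validation metrics are NaN at iterations where validation did not run.
validated     = ~isnan(info.ValidationLoss);
valIterations = find(validated);
valLoss       = info.ValidationLoss(validated);
```

With 64 observations and a mini-batch size of 16, each epoch has 4 iterations, so the per-iteration fields have 8 elements after 2 epochs.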

More About

Save Checkpoint Neural Networks and Resume Training

Deep Learning Toolbox™ enables you to save neural networks as .mat files during training. This periodic saving is especially useful when you have a large neural network or a large data set, and training takes a long time. If the training is interrupted for some reason, you can resume training from the last saved checkpoint neural network. If you want the trainNetwork function to save checkpoint neural networks, then you must specify the name of the path by using the CheckpointPath option of trainingOptions. If the path that you specify does not exist, then trainingOptions returns an error.

The software automatically assigns unique names to checkpoint neural network files. In the example name, net_checkpoint__351__2018_04_12__18_09_52.mat, 351 is the iteration number, 2018_04_12 is the date, and 18_09_52 is the time at which the software saves the neural network. You can load a checkpoint neural network file by double-clicking it or using the load command at the command line. For example:

load net_checkpoint__351__2018_04_12__18_09_52.mat

You can then resume training by using the layers of the neural network as an input argument to trainNetwork. For example:

trainNetwork(XTrain,TTrain,net.Layers,options)

You must manually specify the training options and the input data, because the checkpoint neural network does not contain this information.
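Put together, the checkpoint workflow might look like the sketch below. The toy data, the temporary folder, and the choice to pick the most recently modified checkpoint file are all assumptions made for the example.

```matlab
% Hypothetical toy data and network.
XTrain = rand(64,4);
TTrain = categorical(mod((1:64)',3));
layers = [
    featureInputLayer(4)
    fullyConnectedLayer(3)
    softmaxLayer
    classificationLayer];

% The checkpoint folder must exist before calling trainingOptions.
checkpointDir = tempname;
mkdir(checkpointDir);

options = trainingOptions('sgdm', ...
    'MaxEpochs',2, ...
    'MiniBatchSize',16, ...
    'CheckpointPath',checkpointDir, ...
    'Verbose',false);

net = trainNetwork(XTrain,TTrain,layers,options);

% Load the most recently modified checkpoint file from the folder.
files = dir(fullfile(checkpointDir,'net_checkpoint__*.mat'));
[~,idx] = max([files.datenum]);
S = load(fullfile(checkpointDir,files(idx).name));

% Resume from the checkpoint layers, supplying data and options again.
netResumed = trainNetwork(XTrain,TTrain,S.net.Layers,options);
```

Loading into a structure (S) instead of the workspace avoids overwriting an existing variable named net.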

Floating-Point Arithmetic

When you train a neural network using the trainnet or trainNetwork functions, or when you use prediction or validation functions with DAGNetwork and SeriesNetwork objects, the software performs these computations using single-precision, floating-point arithmetic. Functions for prediction and validation include predict, classify, and activations. The software uses single-precision arithmetic when you train neural networks using both CPUs and GPUs.

Reproducibility

To provide the best performance, deep learning using a GPU in MATLAB® is not guaranteed to be deterministic. Depending on your network architecture, under some conditions you might get different results when using a GPU to train two identical networks or make two predictions using the same network and data.

Extended Capabilities

Automatic Parallel Support

Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.

To run computation in parallel, set the ExecutionEnvironment training option to "multi-gpu" or "parallel".

Use trainingOptions to set the ExecutionEnvironment training option and supply the options to trainNetwork. If you do not set ExecutionEnvironment, then trainNetwork runs on a GPU if available.

For details, see Scale Up Deep Learning in Parallel, on GPUs, and in the Cloud.

GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

To prevent out-of-memory errors, recommended practice is not to move large sets of training data onto the GPU. Instead, train your neural network on a GPU by using trainingOptions to set the ExecutionEnvironment to "auto" or "gpu" and supply the options to trainNetwork.
The ExecutionEnvironment option must be "auto" or "gpu" when the input data is:
- A gpuArray
- A cell array containing gpuArray objects
- A table containing gpuArray objects
- A datastore that outputs cell arrays containing gpuArray objects
- A datastore that outputs tables containing gpuArray objects

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
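As a small sketch, the execution environment can be chosen explicitly from whether a supported GPU is available. The use of canUseGPU here is one possible approach, not something this page prescribes. Leaving the option at "auto" has a similar effect.

```matlab
% Pick the execution environment explicitly; "auto" would do this implicitly.
if canUseGPU
    env = "gpu";
else
    env = "cpu";
end

options = trainingOptions("sgdm", ...
    "ExecutionEnvironment",env, ...
    "Verbose",false);
```

The training data itself stays in host memory. trainNetwork moves mini-batches to the GPU as needed.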

Version History

Introduced in R2016a

R2024a: Not recommended

Starting in R2024a, the trainNetwork function is not recommended. Use the trainnet function instead.

There are no plans to remove support for the trainNetwork function. However, the trainnet function has these advantages and is recommended instead:

trainnet supports dlnetwork objects, which support a wider range of network architectures that you can create or import from external platforms.
trainnet enables you to easily specify loss functions. You can select from built-in loss functions or specify a custom loss function.
trainnet outputs a dlnetwork object, which is a unified data type that supports network building, prediction, built-in training, visualization, compression, verification, and custom training loops.
trainnet is typically faster than trainNetwork.

This table shows some typical usages of the trainNetwork function and how to update your code to use the trainnet function instead.

Not Recommended	Recommended
net = trainNetwork(data,layers,options);	net = trainnet(data,layers,lossFcn,options);
net = trainNetwork(X,T,layers,options);	net = trainnet(X,T,layers,lossFcn,options);

Instead of using an output layer, specify a loss function using lossFcn.
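For a classification network, the update could look like this sketch. The toy data and layer sizes are assumptions. The substantive changes are that the classification output layer is dropped and the loss is named in the call.

```matlab
% Hypothetical toy data: 4 features, 3 classes.
XTrain = rand(64,4);
TTrain = categorical(mod((1:64)',3));

% No classificationLayer at the end: trainnet takes the loss as an argument.
layers = [
    featureInputLayer(4)
    fullyConnectedLayer(3)
    softmaxLayer];

options = trainingOptions("sgdm", ...
    "MaxEpochs",2, ...
    "MiniBatchSize",16, ...
    "Verbose",false);

% "crossentropy" takes the place of the classification output layer.
net = trainnet(XTrain,TTrain,layers,"crossentropy",options);
```

For a regression network, a loss such as "mse" would take the place of the regression output layer in the same way.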

R2022b: `trainNetwork` pads mini-batches to length of longest sequence before splitting when you specify `SequenceLength` training option as an integer

Starting in R2022b, when you train a neural network with sequence data using the trainNetwork function and the SequenceLength option is an integer, the software pads sequences to the length of the longest sequence in each mini-batch and then splits the sequences into mini-batches with the specified sequence length. If SequenceLength does not evenly divide the sequence length of the mini-batch, then the last split mini-batch has a length shorter than SequenceLength. This behavior prevents the neural network from training on time steps that contain only padding values.

In previous releases, the software pads mini-batches of sequences to have a length matching the nearest multiple of SequenceLength that is greater than or equal to the mini-batch length and then splits the data. To reproduce this behavior, use a custom training loop and implement this behavior when you preprocess mini-batches of data.
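A worked example of the arithmetic, using made-up numbers: suppose the longest sequence in a mini-batch has 250 time steps and SequenceLength is 100.

```matlab
longest = 250;   % hypothetical longest sequence in the mini-batch
seqLen  = 100;   % hypothetical SequenceLength value

% R2022b and later: pad to the longest sequence, then split.
fullSplits = floor(longest/seqLen);
remainder  = mod(longest,seqLen);
newLengths = repmat(seqLen,1,fullSplits);
if remainder > 0
    newLengths(end+1) = remainder;     % last split is shorter
end

% Earlier releases: pad up to the next multiple of SequenceLength, then split.
oldPaddedLength = seqLen*ceil(longest/seqLen);
oldLengths      = repmat(seqLen,1,oldPaddedLength/seqLen);

% Time steps in the old scheme that hold padding for every sequence.
paddingOnlySteps = oldPaddedLength - longest;
```

Here newLengths is [100 100 50] and oldLengths is [100 100 100], so the old scheme trained on 50 time steps that contained only padding.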

R2021b: `trainNetwork` automatically stops training when loss is `NaN`

When you train a neural network using the trainNetwork function, training automatically stops when the loss is NaN. Usually, a loss value of NaN introduces NaN values to the neural network learnable parameters, which in turn can cause the neural network to fail to train or to make valid predictions. This change helps identify issues with the neural network before training completes.

In previous releases, the neural network continues to train when the loss is NaN.

R2021a: Support for specifying tables of MAT file paths will be removed

When specifying sequence data for the trainNetwork function, support for specifying tables of MAT file paths will be removed in a future release.

To train neural networks with sequences that do not fit in memory, use a datastore. You can use any datastore to read your data and then use the transform function to transform the datastore output to the format the trainNetwork function requires. For example, you can read data using a FileDatastore or TabularTextDatastore object and then transform the output using the transform function.
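A minimal sketch of this approach, assuming each MAT file stores one predictor sequence X and one label T. The file layout, the variable names, and the temporary folder are hypothetical.

```matlab
% Write three small hypothetical MAT files, one sequence per file.
dataDir = tempname;
mkdir(dataDir);
for i = 1:3
    X = rand(2,10 + i);                   % 2 features, varying sequence length
    T = categorical(mod(i,2),[0 1]);      % one label per sequence
    save(fullfile(dataDir,sprintf('seq%d.mat',i)),'X','T');
end

% Read each file with load, which returns a structure with fields X and T.
fds = fileDatastore(dataDir,'ReadFcn',@load,'FileExtensions','.mat');

% Convert each structure to a 1-by-2 cell array: {predictors, response}.
tds = transform(fds,@(s) {s.X, s.T});

data = read(tds);
```

The transformed datastore can then be passed to trainNetwork in place of in-memory sequences, provided the network matches the data.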
