Define Custom Classification Output Layer
Tip
Custom output layers are not recommended. Instead, use the trainnet function and specify a custom loss function. To specify a custom backward function for the loss function, use a deep.DifferentiableFunction object. For more information, see Define Custom Deep Learning Operations.
To train a neural network using cross-entropy loss for k mutually exclusive classes, use the trainnet function and specify the loss function "crossentropy".
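For example, a minimal sketch of that recommended workflow, assuming you already have a dlnetwork object net and training images XTrain with categorical labels TTrain:

options = trainingOptions("adam");
netTrained = trainnet(XTrain,TTrain,net,"crossentropy",options);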
If you want to use a different loss function for your classification problems when you use the trainNetwork
function, then you can define a custom classification output layer using this example as a guide. This example shows how to define a custom classification output layer with the sum of squares error (SSE) loss and use it in a convolutional neural network.
To define a custom classification output layer, you can use the template provided in this example, which takes you through the following steps:
- Name the layer – Give the layer a name so it can be used in MATLAB®.
- Declare the layer properties – Specify the properties of the layer.
- Create a constructor function (optional) – Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then the software initializes the properties with '' at creation.
- Create a forward loss function – Specify the loss between the predictions and the training targets.
- Create a backward loss function (optional) – Specify the derivative of the loss with respect to the predictions. If you do not specify a backward loss function, then the forward loss function must support dlarray objects.
A classification SSE layer computes the sum of squares error loss for classification problems. SSE is an error measure between two continuous random variables. For predictions Y and training targets T, the SSE loss between Y and T is given by

$$\text{loss} = \frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K}\left(Y_{ni} - T_{ni}\right)^{2},$$

where N is the number of observations and K is the number of classes.
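As a quick numeric check of this formula (with arbitrary values), take K = 2 classes and N = 2 observations:

Y = [0.8 0.3; 0.2 0.7];                 % predictions, K-by-N
T = [1 0; 0 1];                         % one-hot targets, K-by-N
loss = sum((Y-T).^2,"all")/size(Y,2)    % (0.04 + 0.04 + 0.09 + 0.09)/2 = 0.13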
Classification Output Layer Template
Copy the classification output layer template into a new file in MATLAB. This template outlines the structure of a classification output layer and includes the functions that define the layer behavior.
classdef myClassificationLayer < nnet.layer.ClassificationLayer % ...
        % & nnet.layer.Acceleratable % (Optional)
properties
% (Optional) Layer properties.
% Layer properties go here.
end
methods
function layer = myClassificationLayer()
% (Optional) Create a myClassificationLayer.
% Layer constructor function goes here.
end
function loss = forwardLoss(layer,Y,T)
% Return the loss between the predictions Y and the training
% targets T.
%
% Inputs:
% layer - Output layer
% Y – Predictions made by network
% T – Training targets
%
% Output:
% loss - Loss between Y and T
% Layer forward loss function goes here.
end
function dLdY = backwardLoss(layer,Y,T)
% (Optional) Backward propagate the derivative of the loss
% function.
%
% Inputs:
% layer - Output layer
% Y – Predictions made by network
% T – Training targets
%
% Output:
% dLdY - Derivative of the loss with respect to the
% predictions Y
% Layer backward loss function goes here.
end
end
end
Name the Layer and Specify Superclasses
First, give the layer a name. In the first line of the class file, replace the existing name myClassificationLayer with sseClassificationLayer. Because the layer supports acceleration, also include the nnet.layer.Acceleratable class. For more information about custom layer acceleration, see Custom Layer Function Acceleration.
classdef sseClassificationLayer < nnet.layer.ClassificationLayer ...
        & nnet.layer.Acceleratable
...
end
Next, rename the myClassificationLayer
constructor function (the first function in the methods
section) so that it has the same name as the layer.
methods
function layer = sseClassificationLayer()
...
end
...
end
Save the Layer
Save the layer class file in a new file named sseClassificationLayer.m. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.
Declare Layer Properties
Declare the layer properties in the properties
section.
By default, custom output layers have the following properties:
- Name — Layer name, specified as a character vector or a string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to unnamed layers.
- Description — One-line description of the layer, specified as a character vector or a string scalar. This description appears when the layer is displayed in a Layer array. If you do not specify a layer description, then the software displays "Classification Output" or "Regression Output".
- Type — Type of the layer, specified as a character vector or a string scalar. The value of Type appears when the layer is displayed in a Layer array. If you do not specify a layer type, then the software displays the layer class name.
Custom classification layers also have the following property:
- Classes — Classes of the output layer, specified as a categorical vector, string array, cell array of character vectors, or "auto". If Classes is "auto", then the software automatically sets the classes at training time. If you specify the string array or cell array of character vectors str, then the software sets the classes of the output layer to categorical(str,str).
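For example, this sketch (with an arbitrary class list) shows the categorical vector that results from passing a string array str:

str = ["cat" "dog" "bird"];      % arbitrary example classes
C = categorical(str,str)         % categories keep the order given in str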
Custom regression layers also have the following property:
- ResponseNames — Names of the responses, specified as a cell array of character vectors or a string array. At training time, the software automatically sets the response names according to the training data. The default is {}.
If the layer has no other properties, then you can omit the properties
section.
In this example, the layer does not require any additional properties, so you can remove the properties
section.
Create Constructor Function
Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.
Specify the input argument name to assign to the Name property at creation. Add a comment to the top of the function that explains the syntax of the function.
function layer = sseClassificationLayer(name)
% layer = sseClassificationLayer(name) creates a sum of squares
% error classification layer and specifies the layer name.
...
end
Initialize Layer Properties
Replace the comment % Layer constructor function goes here
with code that initializes the layer properties.
Give the layer a one-line description by setting the Description property of the layer. Set the Name property to the input argument name.
function layer = sseClassificationLayer(name)
% layer = sseClassificationLayer(name) creates a sum of squares
% error classification layer and specifies the layer name.
% Set layer name.
layer.Name = name;
% Set layer description.
layer.Description = 'Sum of squares error';
end
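At this point you can already construct the layer and inspect its properties (illustrative):

layer = sseClassificationLayer('sse')   % displays the name 'sse' and the description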
Create Forward Loss Function
Create a function named forwardLoss that returns the SSE loss between the predictions made by the network and the training targets. The syntax for forwardLoss is loss = forwardLoss(layer, Y, T), where Y is the output of the previous layer and T represents the training targets.
For classification problems, the dimensions of T
depend on the type of problem.
| Classification Task | Shape | Data Format |
|---|---|---|
| 2-D image classification | 1-by-1-by-K-by-N, where K is the number of classes and N is the number of observations | "SSCB" |
| 3-D image classification | 1-by-1-by-1-by-K-by-N, where K is the number of classes and N is the number of observations | "SSSCB" |
| Sequence-to-label classification | K-by-N, where K is the number of classes and N is the number of observations | "CB" |
| Sequence-to-sequence classification | K-by-N-by-S, where K is the number of classes, N is the number of observations, and S is the sequence length | "CBT" |
The size of Y depends on the output of the previous layer. To ensure that Y is the same size as T, you must include a layer that outputs the correct size before the output layer. For example, to ensure that Y is a 4-D array of prediction scores for K classes, you can include a fully connected layer of size K followed by a softmax layer before the output layer.
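For example, a sketch of such a tail for an assumed K = 10 classes, ending with the custom layer defined in this example:

K = 10;                              % assumed number of classes
finalLayers = [
    fullyConnectedLayer(K)           % produces K scores per observation
    softmaxLayer                     % normalizes the scores across classes
    sseClassificationLayer('sse')];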
A classification SSE layer computes the sum of squares error loss for classification problems. SSE is an error measure between two continuous random variables. For predictions Y and training targets T, the SSE loss between Y and T is given by

$$\text{loss} = \frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K}\left(Y_{ni} - T_{ni}\right)^{2},$$

where N is the number of observations and K is the number of classes.
The inputs Y and T correspond to Y and T in the equation, respectively. The output loss corresponds to loss in the equation. Add a comment to the top of the function that explains the syntax of the function.
function loss = forwardLoss(layer, Y, T)
% loss = forwardLoss(layer, Y, T) returns the SSE loss between
% the predictions Y and the training targets T.
% Calculate sum of squares.
sumSquares = sum((Y-T).^2);
% Take mean over mini-batch.
N = size(Y,4);
loss = sum(sumSquares)/N;
end
Because the forwardLoss
function only uses functions that support dlarray
objects, defining the backwardLoss
function is optional. For a list of functions that support dlarray
objects, see List of Functions with dlarray Support.
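You can also sanity-check forwardLoss directly on dummy data; this sketch uses arbitrary sizes for a 2-D image classification task (K = 10 classes, N = 8 observations):

layer = sseClassificationLayer('sse');
Y = rand(1,1,10,8);             % dummy prediction scores
T = zeros(1,1,10,8);
T(1,1,1,:) = 1;                 % arbitrary one-hot targets (all observations in class 1)
loss = forwardLoss(layer,Y,T)   % returns a nonnegative scalar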
Completed Layer
View the completed classification output layer class file.
classdef sseClassificationLayer < nnet.layer.ClassificationLayer ...
        & nnet.layer.Acceleratable
    % Example custom classification layer with sum of squares error loss.
methods
function layer = sseClassificationLayer(name)
% layer = sseClassificationLayer(name) creates a sum of squares
% error classification layer and specifies the layer name.
% Set layer name.
layer.Name = name;
% Set layer description.
layer.Description = 'Sum of squares error';
end
function loss = forwardLoss(layer, Y, T)
% loss = forwardLoss(layer, Y, T) returns the SSE loss between
% the predictions Y and the training targets T.
% Calculate sum of squares.
sumSquares = sum((Y-T).^2);
% Take mean over mini-batch.
N = size(Y,4);
loss = sum(sumSquares)/N;
end
end
end
GPU Compatibility
If the layer forward functions fully support dlarray
objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).
Many MATLAB built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray
input arguments. For a list of functions that support dlarray
objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).
The MATLAB functions used in forwardLoss all support dlarray objects, so the layer is GPU compatible.
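If you have a supported GPU, a quick illustrative check (requires Parallel Computing Toolbox) is to pass gpuArray inputs through the loss:

layer = sseClassificationLayer('sse');
Y = gpuArray(rand(1,1,10,8));             % dummy predictions on the GPU
T = zeros(1,1,10,8);
T(1,1,1,:) = 1;                           % arbitrary one-hot targets
loss = forwardLoss(layer,Y,gpuArray(T))   % the loss is returned as a gpuArray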
Check Output Layer Validity
Check the layer validity of the custom classification output layer sseClassificationLayer
.
Create an instance of the layer sseClassificationLayer.
layer = sseClassificationLayer('sse');
Check that the layer is valid using checkLayer. Specify the valid input size to be the size of a single observation of typical input to the layer. The layer expects 1-by-1-by-K-by-N array inputs, where K is the number of classes and N is the number of observations in the mini-batch.
validInputSize = [1 1 10];
checkLayer(layer,validInputSize,'ObservationDimension',4);
Skipping GPU tests. No compatible GPU device found.
Skipping code generation compatibility tests. To check validity of the layer for code generation, specify the CheckCodegenCompatibility and ObservationDimension options.
Running nnet.checklayer.TestOutputLayerWithoutBackward
........
Done nnet.checklayer.TestOutputLayerWithoutBackward

Test Summary:
 8 Passed, 0 Failed, 0 Incomplete, 2 Skipped.
 Time elapsed: 0.57643 seconds.
The test summary reports the number of passed, failed, incomplete, and skipped tests.
Include Custom Classification Output Layer in Network
Include a custom classification output layer in a network.
You can use a custom output layer in the same way as any other output layer in Deep Learning Toolbox. This section shows how to create and train a network for classification using the custom classification output layer that you created earlier.
Load the example training data.
[XTrain,YTrain] = digitTrain4DArrayData;
Create a layer array that includes the custom classification output layer sseClassificationLayer, attached to this example as a supporting file.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    sseClassificationLayer('sse')]
layers = 7x1 Layer array with layers:
1 '' Image Input 28x28x1 images with 'zerocenter' normalization
2 '' 2-D Convolution 20 5x5 convolutions with stride [1 1] and padding [0 0 0 0]
3 '' Batch Normalization Batch normalization
4 '' ReLU ReLU
5 '' Fully Connected 10 fully connected layer
6 '' Softmax softmax
7 'sse' Classification Output Sum of squares error
Set the training options and train the network.
options = trainingOptions('sgdm');
net = trainNetwork(XTrain,YTrain,layers,options);
Training on single CPU.
Initializing input data normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:00 |        9.38% |       0.9944 |          0.0100 |
|       2 |          50 |       00:00:04 |       75.00% |       0.3541 |          0.0100 |
|       3 |         100 |       00:00:08 |       92.97% |       0.1288 |          0.0100 |
|       4 |         150 |       00:00:12 |       96.09% |       0.0970 |          0.0100 |
|       6 |         200 |       00:00:16 |       95.31% |       0.0753 |          0.0100 |
|       7 |         250 |       00:00:21 |       97.66% |       0.0447 |          0.0100 |
|       8 |         300 |       00:00:25 |       99.22% |       0.0211 |          0.0100 |
|       9 |         350 |       00:00:30 |       99.22% |       0.0261 |          0.0100 |
|      11 |         400 |       00:00:34 |      100.00% |       0.0071 |          0.0100 |
|      12 |         450 |       00:00:38 |      100.00% |       0.0054 |          0.0100 |
|      13 |         500 |       00:00:43 |      100.00% |       0.0092 |          0.0100 |
|      15 |         550 |       00:00:47 |      100.00% |       0.0061 |          0.0100 |
|      16 |         600 |       00:00:52 |      100.00% |       0.0019 |          0.0100 |
|      17 |         650 |       00:00:56 |      100.00% |       0.0039 |          0.0100 |
|      18 |         700 |       00:01:00 |      100.00% |       0.0023 |          0.0100 |
|      20 |         750 |       00:01:05 |      100.00% |       0.0023 |          0.0100 |
|      21 |         800 |       00:01:10 |      100.00% |       0.0019 |          0.0100 |
|      22 |         850 |       00:01:14 |      100.00% |       0.0017 |          0.0100 |
|      24 |         900 |       00:01:18 |      100.00% |       0.0020 |          0.0100 |
|      25 |         950 |       00:01:23 |      100.00% |       0.0012 |          0.0100 |
|      26 |        1000 |       00:01:28 |      100.00% |       0.0011 |          0.0100 |
|      27 |        1050 |       00:01:33 |       99.22% |       0.0103 |          0.0100 |
|      29 |        1100 |       00:01:38 |      100.00% |       0.0013 |          0.0100 |
|      30 |        1150 |       00:01:43 |      100.00% |       0.0011 |          0.0100 |
|      30 |        1170 |       00:01:45 |       99.22% |       0.0070 |          0.0100 |
|========================================================================================|
Training finished: Max epochs completed.
Evaluate the network performance by making predictions on new data and calculating the accuracy.
[XTest,YTest] = digitTest4DArrayData;
YPred = classify(net,XTest);
accuracy = mean(YTest == YPred)
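Optionally, you can visualize the per-class results with a confusion chart:

figure
confusionchart(YTest,YPred)   % rows are true classes, columns are predicted classes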
See Also
trainnet | trainingOptions | dlnetwork | checkLayer | findPlaceholderLayers | replaceLayer | PlaceholderLayer