Define Custom Deep Learning Layer with Learnable Parameters - MATLAB & Simulink

If Deep Learning Toolbox™ does not provide the layer you require for your task, then you can define your own custom layer using this example as a guide. For a list of built-in layers, see List of Deep Learning Layers.

To define a custom deep learning layer, you can use the template provided in this example, which takes you through these steps:

  1. Name the layer — Give the layer a name so that you can use it in MATLAB®.
  2. Declare the layer properties — Specify the properties of the layer, including learnable parameters and state parameters.
  3. Create the constructor function (optional) — Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then at creation, the software initializes the Name, Description, and Type properties with [] and sets the number of layer inputs and outputs to 1.
  4. Create initialize function (optional) — Specify how to initialize the learnable and state parameters when the software initializes the network. If you do not specify an initialize function, then the software does not initialize parameters when it initializes the network.
  5. Create forward functions — Specify how data passes forward through the layer (forward propagation) at prediction time and at training time.
  6. Create reset state function (optional) — Specify how to reset state parameters.
  7. Create a backward function (optional) — Specify the derivatives of the loss with respect to the input data and the learnable parameters (backward propagation). If you do not specify a backward function, then the forward functions must support dlarray objects.

When you define the layer functions, you can use dlarray objects. Using dlarray objects makes working with high-dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.
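As a minimal sketch of the labeling described above, the following code creates a formatted dlarray, queries its labels, and then passes an unformatted dlarray to a function using the DataFormat option. The array sizes are arbitrary and chosen only for illustration.

```matlab
% Illustrative data: 28-by-28 images with 3 channels and 16 observations.
% These sizes are assumptions, not values taken from this example.
data = rand(28,28,3,16);

% Format the dlarray directly with dimension labels.
X = dlarray(data,"SSCB");
fmt = dims(X);                 % dimension labels of X
idxC = finddim(X,"C");         % index of the channel dimension
numChannels = size(X,idxC);    % size of the channel dimension

% Alternatively, keep the dlarray unformatted and pass the labels
% through the DataFormat option of a function that operates over
% labeled dimensions, such as softmax.
Xu = dlarray(data);
Z = softmax(Xu,DataFormat="SSCB");
```

Here softmax normalizes over the dimension labeled "C", so the labels determine which dimension the operation acts on.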

Using formatted dlarray objects in custom layers also allows you to define layers where the inputs and outputs have different formats, such as layers that permute, add, or remove dimensions. For example, you can define a layer that takes as input a mini-batch of images with the format "SSCB" (spatial, spatial, channel, batch) and output a mini-batch of sequences with the format "CBT" (channel, batch, time). Using formatted dlarray objects also allows you to define layers that can operate on data with different input formats, for example, layers that support inputs with the formats "SSCB" (spatial, spatial, channel, batch) and "CBT" (channel, batch, time).

dlarray objects also enable support for automatic differentiation. Consequently, if your forward functions fully support dlarray objects, then defining the backward function is optional.
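To illustrate what automatic differentiation provides, the following sketch differentiates a simple function of a dlarray using dlfeval and dlgradient. The function (a sum of squares) and the input values are assumptions chosen for illustration; they are not part of the layer defined in this example.

```matlab
% Illustrative input values.
x = dlarray([1 2 3]);

% Function that evaluates sum(x.^2) and returns its gradient with
% respect to x. dlgradient must run inside a call to dlfeval.
gradFcn = @(x) dlgradient(sum(x.^2,"all"),x);

% Evaluate the gradient. The analytic result is 2*x.
g = dlfeval(gradFcn,x);
gradValues = extractdata(g);
```

Because the software can trace operations on dlarray objects in this way, a custom layer whose forward functions use only dlarray-compatible operations does not need a hand-written backward function.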

To enable support for using formatted dlarray objects in custom layer forward functions, also inherit from the nnet.layer.Formattable class when defining the custom layer. For an example, see Define Custom Deep Learning Layer with Formatted Inputs.

This example shows how to create a SReLU layer, which is a layer with four learnable parameters, and use it in a neural network. A SReLU layer performs a thresholding operation, where for each channel, the layer scales values outside an interval. The interval thresholds and scaling factors are learnable parameters [1].

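The SReLU operation is given by

$$
f(x_i) =
\begin{cases}
t_i^l + a_i^l \left(x_i - t_i^l\right) & \text{if } x_i \le t_i^l \\
x_i & \text{if } t_i^l < x_i < t_i^r \\
t_i^r + a_i^r \left(x_i - t_i^r\right) & \text{if } t_i^r \le x_i
\end{cases}
$$

where $x_i$ is the input on channel i, $t_i^l$ and $t_i^r$ are the left and right thresholds on channel i, respectively, and $a_i^l$ and $a_i^r$ are the left and right scaling factors on channel i, respectively. These threshold values and scaling factors are learnable parameters, which the layer learns during training.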
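As a small worked example of the operation, the following sketch applies the piecewise definition to three scalar inputs on a single channel. The threshold and scaling values are arbitrary assumptions chosen so that each branch is exercised once.

```matlab
% Illustrative parameter values for one channel (assumptions).
tl = 0;     % left threshold
tr = 1;     % right threshold
al = 0.1;   % left scaling factor
ar = 2;     % right scaling factor

% One input at or below tl, one inside the interval, one at or above tr.
x = [-2 0.5 3];

% Evaluate each branch and select it with a logical mask.
y = (x <= tl) .* (tl + al.*(x-tl)) ...
    + ((tl < x) & (x < tr)) .* x ...
    + (tr <= x) .* (tr + ar.*(x-tr));
% y is [-0.2 0.5 5]
```

The first input is scaled by the left factor relative to the left threshold, the second passes through unchanged, and the third is scaled by the right factor relative to the right threshold.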

Custom Layer Template

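Copy the custom layer template into a new file in MATLAB. This template gives the structure of a layer class definition. It outlines the optional properties blocks for the layer properties, learnable parameters, and state parameters, the layer constructor function, and the initialize, predict, forward, resetState, and backward functions.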

classdef myLayer < nnet.layer.Layer % ...
        % & nnet.layer.Formattable ... % (Optional)
        % & nnet.layer.Acceleratable % (Optional)

properties
    % (Optional) Layer properties.

    % Declare layer properties here.
end

properties (Learnable)
    % (Optional) Layer learnable parameters.

    % Declare learnable parameters here.
end

properties (State)
    % (Optional) Layer state parameters.

    % Declare state parameters here.
end

properties (Learnable, State)
    % (Optional) Nested dlnetwork objects with both learnable
    % parameters and state parameters.

    % Declare nested networks with learnable and state parameters here.
end

methods
    function layer = myLayer()
        % (Optional) Create a myLayer.
        % This function must have the same name as the class.

        % Define layer constructor function here.
    end

    function layer = initialize(layer,layout)
        % (Optional) Initialize layer learnable and state parameters.
        %
        % Inputs:
        %         layer  - Layer to initialize
        %         layout - Data layout, specified as a networkDataLayout
        %                  object
        %
        % Outputs:
        %         layer - Initialized layer
        %
        %  - For layers with multiple inputs, replace layout with 
        %    layout1,...,layoutN, where N is the number of inputs.
        
        % Define layer initialization function here.
    end
    

    function [Y,state] = predict(layer,X)
        % Forward input data through the layer at prediction time and
        % output the result and updated state.
        %
        % Inputs:
        %         layer - Layer to forward propagate through 
        %         X     - Input data
        % Outputs:
        %         Y     - Output of layer forward function
        %         state - (Optional) Updated layer state
        %
        %  - For layers with multiple inputs, replace X with X1,...,XN, 
        %    where N is the number of inputs.
        %  - For layers with multiple outputs, replace Y with 
        %    Y1,...,YM, where M is the number of outputs.
        %  - For layers with multiple state parameters, replace state 
        %    with state1,...,stateK, where K is the number of state 
        %    parameters.

        % Define layer predict function here.
    end

    function [Y,state,memory] = forward(layer,X)
        % (Optional) Forward input data through the layer at training
        % time and output the result, the updated state, and a memory
        % value.
        %
        % Inputs:
        %         layer - Layer to forward propagate through 
        %         X     - Layer input data
        % Outputs:
        %         Y      - Output of layer forward function 
        %         state  - (Optional) Updated layer state 
        %         memory - (Optional) Memory value for custom backward
        %                  function
        %
        %  - For layers with multiple inputs, replace X with X1,...,XN, 
        %    where N is the number of inputs.
        %  - For layers with multiple outputs, replace Y with 
        %    Y1,...,YM, where M is the number of outputs.
        %  - For layers with multiple state parameters, replace state 
        %    with state1,...,stateK, where K is the number of state 
        %    parameters.

        % Define layer forward function here.
    end

    function layer = resetState(layer)
        % (Optional) Reset layer state.

        % Define reset state function here.
    end

    function [dLdX,dLdW,dLdSin] = backward(layer,X,Y,dLdY,dLdSout,memory)
        % (Optional) Backward propagate the derivative of the loss
        % function through the layer.
        %
        % Inputs:
        %         layer   - Layer to backward propagate through 
        %         X       - Layer input data 
        %         Y       - Layer output data 
        %         dLdY    - Derivative of loss with respect to layer 
        %                   output
        %         dLdSout - (Optional) Derivative of loss with respect 
        %                   to state output
        %         memory  - Memory value from forward function
        % Outputs:
        %         dLdX   - Derivative of loss with respect to layer input
        %         dLdW   - (Optional) Derivative of loss with respect to
        %                  learnable parameter 
        %         dLdSin - (Optional) Derivative of loss with respect to 
        %                  state input
        %
        %  - For layers with state parameters, the backward syntax must
        %    include both dLdSout and dLdSin, or neither.
        %  - For layers with multiple inputs, replace X and dLdX with
        %    X1,...,XN and dLdX1,...,dLdXN, respectively, where N is
        %    the number of inputs.
        %  - For layers with multiple outputs, replace Y and dLdY with
        %    Y1,...,YM and dLdY1,...,dLdYM, respectively, where M is the
        %    number of outputs.
        %  - For layers with multiple learnable parameters, replace 
        %    dLdW with dLdW1,...,dLdWP, where P is the number of 
        %    learnable parameters.
        %  - For layers with multiple state parameters, replace dLdSin
        %    and dLdSout with dLdSin1,...,dLdSinK and 
        %    dLdSout1,...,dLdSoutK, respectively, where K is the number
        %    of state parameters.

        % Define layer backward function here.
    end
end

end

Name Layer and Specify Superclasses

First, give the layer a name. In the first line of the class file, replace the existing name myLayer with sreluLayer.

classdef sreluLayer < nnet.layer.Layer % ...
        % & nnet.layer.Formattable ... % (Optional)
        % & nnet.layer.Acceleratable % (Optional)
    ...
end

If you do not specify a backward function, then the layer functions, by default, receive unformatted dlarray objects as input. To specify that the layer receives formatted dlarray objects as input and also outputs formatted dlarray objects, also inherit from the nnet.layer.Formattable class when defining the custom layer.

The layer functions support acceleration, so also inherit from nnet.layer.Acceleratable. For more information about accelerating custom layer functions, see Custom Layer Function Acceleration. The layer does not require formattable inputs, so remove the optional nnet.layer.Formattable superclass.

classdef sreluLayer < nnet.layer.Layer ...
        & nnet.layer.Acceleratable
    ...
end

Next, rename the myLayer constructor function (the first function in the methods section) so that it has the same name as the layer.

methods
    function layer = sreluLayer()           
        ...
    end

    ...
end

Save the Layer

Save the layer class file in a new file named sreluLayer.m. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.

Declare Properties and Learnable Parameters

Declare the layer properties in the properties section and declare learnable parameters by listing them in the properties (Learnable) section.

By default, custom layers have these properties. Do not declare these properties in the properties section.

| Property | Description |
| --- | --- |
| Name | Layer name, specified as a character vector or string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to layers with the name "". |
| Description | One-line description of the layer, specified as a string scalar or a character vector. This description appears when the layer is displayed in a Layer array. If you do not specify a layer description, then the software displays the layer class name. |
| Type | Type of the layer, specified as a character vector or a string scalar. The value of Type appears when the layer is displayed in a Layer array. If you do not specify a layer type, then the software displays the layer class name. |
| NumInputs | Number of inputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets NumInputs to the number of names in InputNames. The default value is 1. |
| InputNames | Input names of the layer, specified as a cell array of character vectors. If you do not specify this value and NumInputs is greater than 1, then the software automatically sets InputNames to {'in1',...,'inN'}, where N is equal to NumInputs. The default value is {'in'}. |
| NumOutputs | Number of outputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets NumOutputs to the number of names in OutputNames. The default value is 1. |
| OutputNames | Output names of the layer, specified as a cell array of character vectors. If you do not specify this value and NumOutputs is greater than 1, then the software automatically sets OutputNames to {'out1',...,'outM'}, where M is equal to NumOutputs. The default value is {'out'}. |

If the layer has no other properties, then you can omit the properties section.

Tip

If you are creating a layer with multiple inputs, then you must set either the NumInputs or InputNames properties in the layer constructor. If you are creating a layer with multiple outputs, then you must set either the NumOutputs or OutputNames properties in the layer constructor. For an example, see Define Custom Deep Learning Layer with Multiple Inputs.

A SReLU layer does not require any additional properties, so you can remove the properties section.

A SReLU layer has four learnable parameters: the left and right scaling factors and the left and right thresholds. Declare these learnable parameters in the properties (Learnable) section and name them LeftSlope, RightSlope, LeftThreshold, and RightThreshold, respectively.

properties (Learnable)
    % Layer learnable parameters

    LeftSlope
    RightSlope
    LeftThreshold
    RightThreshold
end

Create Constructor Function

Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.

The SReLU layer constructor function accepts one optional argument (the layer name). Specify one input argument named args in the sreluLayer function that corresponds to the optional name-value argument. Add a comment to the top of the function that explains the syntax of the function.

    function layer = sreluLayer(args)
        % layer = sreluLayer creates a SReLU layer.
        %
        % layer = sreluLayer(Name=name) also specifies the
        % layer name.

        ...
    end

Initialize Layer Properties

Initialize the layer properties in the constructor function. Replace the comment % Define layer constructor function here. with code that initializes the layer properties. Do not initialize learnable or state parameters in the constructor function. Initialize them in the initialize function instead.

Parse the input arguments using an arguments block and set the Name property.

        arguments
            args.Name = "";
        end

        % Set layer name.
        layer.Name = args.Name;

Give the layer a one-line description by setting the Description property of the layer. Set the description to describe the type of layer.

        % Set layer description.
        layer.Description = "SReLU";

View the completed constructor function.

    function layer = sreluLayer(args) 
        % layer = sreluLayer creates a SReLU layer.
        %
        % layer = sreluLayer(Name=name) also specifies the
        % layer name.

        arguments
            args.Name = "";
        end

        % Set layer name.
        layer.Name = args.Name;

        % Set layer description.
        layer.Description = "SReLU";
    end

With this constructor function, the command sreluLayer(Name="srelu") creates a SReLU layer with the name "srelu".
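As a quick check of the constructor, the following sketch creates the layer and inspects the properties that the constructor sets. It assumes that sreluLayer.m, as defined in this example, is saved in the current folder or on the MATLAB path.

```matlab
% Create the layer with a name (assumes sreluLayer.m is on the path).
layer = sreluLayer(Name="srelu");

% The constructor sets only Name and Description.
layerName = layer.Name;
layerDescription = layer.Description;

% The learnable parameters remain empty until the initialize
% function runs.
parametersAreEmpty = isempty(layer.LeftSlope) && isempty(layer.RightSlope) ...
    && isempty(layer.LeftThreshold) && isempty(layer.RightThreshold);
```

Leaving the learnable parameters empty in the constructor is what allows the initialize function to size them once the input layout is known.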

Create Initialize Function

Create the function that initializes the layer learnable and state parameters when the software initializes the network. Ensure that the function initializes a learnable or state parameter only when the property is empty. Otherwise, the software can overwrite the parameter values when you load the network from a MAT file.

To initialize the learnable parameters, generate random arrays with the same number of channels as the input data.

Because the size of the input data is unknown until the network is ready to use, you must create an initialize function that initializes the learnable and state parameters using networkDataLayout objects that the software provides to the function. Network data layout objects contain information about the sizes and formats of expected input data. Create an initialize function that uses the size and format information to initialize learnable and state parameters such that they have the correct size.

The learnable parameters have the same number of dimensions as the input observations, where the channel dimension has the same size as the channel dimension of the input data, and the remaining dimensions are singleton. Create an initialize function that extracts the size and format information from the input networkDataLayout object and initializes the learnable parameters with the same number of channels.

    function layer = initialize(layer,layout)
        % layer = initialize(layer,layout) initializes the layer
        % learnable parameters using the specified input layout.

        % Find number of channels.
        idx = finddim(layout,"C");
        numChannels = layout.Size(idx);

        % Initialize empty learnable parameters.
        sz = ones(1,numel(layout.Size));
        sz(idx) = numChannels;
        
        if isempty(layer.LeftSlope)
            layer.LeftSlope = rand(sz);
        end
        
        if isempty(layer.RightSlope)
            layer.RightSlope = rand(sz);
        end
        
        if isempty(layer.LeftThreshold)
            layer.LeftThreshold = rand(sz);
        end
        
        if isempty(layer.RightThreshold)
            layer.RightThreshold = rand(sz);
        end
    end
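To see the effect of the initialize function, the following sketch calls it directly with a networkDataLayout object. It assumes that sreluLayer.m is on the MATLAB path; the input size [24 24 20 128] is an illustrative assumption.

```matlab
% Illustrative layout: 24-by-24 inputs, 20 channels, 128 observations.
layout = networkDataLayout([24 24 20 128],"SSCB");

% Initialize the learnable parameters of a new layer.
layer = sreluLayer;
layer = initialize(layer,layout);

% The parameters are singleton in all but the channel dimension.
% MATLAB drops the trailing singleton dimension, so the size is [1 1 20].
szLeftSlope = size(layer.LeftSlope);

% Initializing again leaves the nonempty parameters unchanged because
% of the isempty checks.
layer2 = initialize(layer,layout);
unchanged = isequal(layer2.LeftSlope,layer.LeftSlope);
```

The second call shows why the isempty checks matter: parameters that already hold values, such as those of a trained network loaded from a MAT file, are preserved.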

Create Forward Functions

Create the layer forward functions to use at prediction time and training time.

Create a function named predict that propagates the data forward through the layer at prediction time and outputs the result.

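The predict function syntax depends on the type of layer. For a layer with a single input and a single output, use Y = predict(layer,X). For a layer that also updates a single state parameter, use [Y,state] = predict(layer,X).

You can adjust the syntaxes for layers with multiple inputs, multiple outputs, or multiple state parameters: replace X with X1,…,XN, where N is the number of inputs, replace Y with Y1,…,YM, where M is the number of outputs, and replace state with state1,…,stateK, where K is the number of state parameters.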

Tip

If the number of inputs to the layer can vary, then use varargin instead of X1,…,XN. In this case, varargin is a cell array of the inputs, where varargin{i} corresponds to Xi.

If the number of outputs can vary, then use varargout instead of Y1,…,YM. In this case, varargout is a cell array of the outputs, where varargout{j} corresponds to Yj.
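As a sketch of the varargin pattern in the tip above, the following hypothetical layer sums a variable number of inputs. The class name sumInputsLayer and its constructor argument are assumptions introduced for illustration and are not part of this example. To try it, save the code in a file named sumInputsLayer.m.

```matlab
classdef sumInputsLayer < nnet.layer.Layer
    % Hypothetical layer that adds all of its inputs element-wise.

    methods
        function layer = sumInputsLayer(numInputs)
            % layer = sumInputsLayer(numInputs) creates a layer with
            % the specified number of inputs.

            % Layers with multiple inputs must set NumInputs or
            % InputNames in the constructor.
            layer.NumInputs = numInputs;
            layer.Description = "Sum of " + numInputs + " inputs";
        end

        function Y = predict(layer,varargin)
            % varargin{i} corresponds to the input Xi.
            Y = varargin{1};
            for i = 2:layer.NumInputs
                Y = Y + varargin{i};
            end
        end
    end
end
```

For example, for layer = sumInputsLayer(3), the call predict(layer,1,2,3) adds the three inputs.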

Tip

If the custom layer has a dlnetwork object for a learnable parameter, then in the predict function of the custom layer, use the predict function for the dlnetwork. When you do so, the dlnetwork object predict function uses the appropriate layer operations for prediction. If the dlnetwork has state parameters, then also return the network state.

Because a SReLU layer has only one input and one output, the syntax for predict for a SReLU layer is Y = predict(layer,X).

By default, the layer uses predict as the forward function at training time. To use a different forward function at training time, or retain a value required for a custom backward function, you must also create a function named forward.

The dimensions of the inputs depend on the type of data and the output of the connected layers:

| Layer Input | Example Shape | Data Format |
| --- | --- | --- |
| 2-D images | h-by-w-by-c-by-N numeric array, where h, w, c, and N are the height, width, number of channels of the images, and number of observations, respectively. | "SSCB" |
| 3-D images | h-by-w-by-d-by-c-by-N numeric array, where h, w, d, c, and N are the height, width, depth, number of channels of the images, and number of image observations, respectively. | "SSSCB" |
| Vector sequences | c-by-N-by-s array, where c is the number of features of the sequence, N is the number of sequence observations, and s is the sequence length. | "CBT" |
| 2-D image sequences | h-by-w-by-c-by-N-by-s array, where h, w, and c correspond to the height, width, and number of channels of the image, respectively, N is the number of image sequence observations, and s is the sequence length. | "SSCBT" |
| 3-D image sequences | h-by-w-by-d-by-c-by-N-by-s array, where h, w, d, and c correspond to the height, width, depth, and number of channels of the image, respectively, N is the number of image sequence observations, and s is the sequence length. | "SSSCBT" |
| Features | c-by-N array, where c is the number of features, and N is the number of observations. | "CB" |

For layers that output sequences, the layers can output sequences of any length or output data with no time dimension.

The forward function propagates the data forward through the layer at training time and also outputs a memory value.

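The forward function syntax depends on the type of layer. For a layer with a single input and a single output, use Y = forward(layer,X). For a layer that also updates a single state parameter, use [Y,state] = forward(layer,X). With either syntax, you can add a memory output as the last output argument to retain a value for a custom backward function.

You can adjust the syntaxes for layers with multiple inputs, multiple outputs, or multiple state parameters: replace X with X1,…,XN, where N is the number of inputs, replace Y with Y1,…,YM, where M is the number of outputs, and replace state with state1,…,stateK, where K is the number of state parameters.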

Tip

If the number of inputs to the layer can vary, then use varargin instead of X1,…,XN. In this case, varargin is a cell array of the inputs, where varargin{i} corresponds to Xi.

If the number of outputs can vary, then use varargout instead of Y1,…,YM. In this case, varargout is a cell array of the outputs, where varargout{j} corresponds to Yj.

Tip

If the custom layer has a dlnetwork object for a learnable parameter, then in the forward function of the custom layer, use the forward function of the dlnetwork object. When you do so, the dlnetwork object forward function uses the appropriate layer operations for training.

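The SReLU operation is given by

$$
f(x_i) =
\begin{cases}
t_i^l + a_i^l \left(x_i - t_i^l\right) & \text{if } x_i \le t_i^l \\
x_i & \text{if } t_i^l < x_i < t_i^r \\
t_i^r + a_i^r \left(x_i - t_i^r\right) & \text{if } t_i^r \le x_i
\end{cases}
$$

where $x_i$ is the input on channel i, $t_i^l$ and $t_i^r$ are the left and right thresholds on channel i, respectively, and $a_i^l$ and $a_i^r$ are the left and right scaling factors on channel i, respectively. These threshold values and scaling factors are learnable parameters, which the layer learns during training.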

Implement this operation in predict. The SReLU layer does not require memory or a different forward function for training, so you can remove the forward function from the class file. Add a comment to the top of the function that explains the syntaxes of the function.

Tip

If you preallocate arrays using functions such as zeros, then you must ensure that the data types of these arrays are consistent with the layer function inputs. To create an array of zeros of the same data type as another array, use the "like" option of zeros. For example, to initialize an array of zeros of size sz with the same data type as the array X, use Y = zeros(sz,"like",X).
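The following sketch shows the effect of the "like" option for a single-precision array and for a dlarray. The array size is an arbitrary assumption.

```matlab
% Illustrative single-precision input.
X = single(rand(2,3));

% Without "like", zeros returns a double array.
Ydouble = zeros(size(X));

% With "like", the result matches the data type of X.
Y = zeros(size(X),"like",X);

% The same pattern applies when the layer input is a dlarray.
Xdl = dlarray(X);
Ydl = zeros(size(Xdl),"like",Xdl);
```

In a layer function, using "like" keeps preallocated arrays consistent with the input type, for example single precision or gpuArray data, without hard-coding the type.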

    function Y = predict(layer, X)
        % Y = predict(layer, X) forwards the input data X through the
        % layer and outputs the result Y.
        
        tl = layer.LeftThreshold;
        al = layer.LeftSlope;
        tr = layer.RightThreshold;
        ar = layer.RightSlope;
        
        Y = (X <= tl) .* (tl + al.*(X-tl)) ...
            + ((tl < X) & (X < tr)) .* X ...
            + (tr <= X) .* (tr + ar.*(X-tr));
    end

Because the predict function uses only functions that support dlarray objects, defining the backward function is optional. For a list of functions that support dlarray objects, see List of Functions with dlarray Support.
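As a quick sanity check of predict, the following sketch sets the learnable parameters to known scalar values and passes a small dlarray through the layer. It assumes that sreluLayer.m is on the MATLAB path. The parameter and input values are assumptions chosen so that each branch of the operation is exercised.

```matlab
% Create the layer and set the learnable parameters by hand
% (illustrative values, not trained parameters).
layer = sreluLayer;
layer.LeftThreshold = 0;
layer.RightThreshold = 1;
layer.LeftSlope = 0.1;
layer.RightSlope = 2;

% One input per branch of the piecewise operation.
X = dlarray([-2 0.5 3]);

% Forward the data through the layer.
Y = predict(layer,X);
values = extractdata(Y);   % expected values: [-0.2 0.5 5]
```

The output is a dlarray, so the software can differentiate through predict automatically during training.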

Completed Layer

View the completed layer class file.

classdef sreluLayer < nnet.layer.Layer ...
        & nnet.layer.Acceleratable
    % Example custom SReLU layer.

properties (Learnable)
    % Layer learnable parameters

    LeftSlope
    RightSlope
    LeftThreshold
    RightThreshold
end

methods
    function layer = sreluLayer(args) 
        % layer = sreluLayer creates a SReLU layer.
        %
        % layer = sreluLayer(Name=name) also specifies the
        % layer name.

        arguments
            args.Name = "";
        end

        % Set layer name.
        layer.Name = args.Name;

        % Set layer description.
        layer.Description = "SReLU";
    end

    function layer = initialize(layer,layout)
        % layer = initialize(layer,layout) initializes the layer
        % learnable parameters using the specified input layout.

        % Find number of channels.
        idx = finddim(layout,"C");
        numChannels = layout.Size(idx);

        % Initialize empty learnable parameters.
        sz = ones(1,numel(layout.Size));
        sz(idx) = numChannels;
        
        if isempty(layer.LeftSlope)
            layer.LeftSlope = rand(sz);
        end
        
        if isempty(layer.RightSlope)
            layer.RightSlope = rand(sz);
        end
        
        if isempty(layer.LeftThreshold)
            layer.LeftThreshold = rand(sz);
        end
        
        if isempty(layer.RightThreshold)
            layer.RightThreshold = rand(sz);
        end
    end

    function Y = predict(layer, X)
        % Y = predict(layer, X) forwards the input data X through the
        % layer and outputs the result Y.
        
        tl = layer.LeftThreshold;
        al = layer.LeftSlope;
        tr = layer.RightThreshold;
        ar = layer.RightSlope;
        
        Y = (X <= tl) .* (tl + al.*(X-tl)) ...
            + ((tl < X) & (X < tr)) .* X ...
            + (tr <= X) .* (tr + ar.*(X-tr));
    end
end

end

GPU Compatibility

If the layer forward functions fully support dlarray objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).

Many MATLAB built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray input arguments. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).

In this example, the MATLAB functions used in predict all support dlarray objects, so the layer is GPU compatible.

Check Validity of Custom Layer Using checkLayer

Check the validity of the custom layer sreluLayer.

The custom layer sreluLayer, attached to this example as a supporting file, applies the SReLU operation to the input data. To access this layer, open this example as a live script.

Create an instance of the layer.

Create a networkDataLayout object that specifies the expected input size and format of typical input to the layer. Specify a valid input size of [24 24 20 128], where the dimensions correspond to the height, width, number of channels, and number of observations of the previous layer output. Specify the format as "SSCB" (spatial, spatial, channel, batch).

validInputSize = [24 24 20 128];
layout = networkDataLayout(validInputSize,"SSCB");

Check the layer validity using checkLayer.
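The steps described above can be sketched end to end as follows. This is a minimal sketch that assumes sreluLayer.m is on the MATLAB path. The exact tests that run, and the messages that checkLayer prints, depend on your setup, for example on whether a compatible GPU is available.

```matlab
% Create an instance of the layer.
layer = sreluLayer;

% Describe typical input to the layer.
validInputSize = [24 24 20 128];
layout = networkDataLayout(validInputSize,"SSCB");

% Run the validity checks. The function prints a test summary and
% reports any failures.
checkLayer(layer,layout)
```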

Skipping GPU tests. No compatible GPU device found.

Skipping code generation compatibility tests. To check validity of the layer for code generation, specify the CheckCodegenCompatibility and ObservationDimension options.

Running nnet.checklayer.TestLayerWithoutBackward
.......... ..........
Done nnet.checklayer.TestLayerWithoutBackward

Test Summary:
     20 Passed, 0 Failed, 0 Incomplete, 14 Skipped.
     Time elapsed: 0.22584 seconds.

The function does not detect any issues with the layer.

Include Custom Layer in Network

You can use a custom layer in the same way as any other layer in Deep Learning Toolbox. This section shows how to create and train a network for digit classification using the SReLU layer you created earlier.

Load the example training data.

Create a layer array containing the custom layer sreluLayer, attached to this example as a supporting file. To access this layer, open this example as a live script.

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    batchNormalizationLayer
    sreluLayer
    fullyConnectedLayer(10)
    softmaxLayer];

Set the training options and train the neural network using the trainnet function. For classification, use cross-entropy loss. By default, the trainnet function uses a GPU if one is available. Training on a GPU requires a Parallel Computing Toolbox™ license and a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). Otherwise, the trainnet function uses the CPU. To specify the execution environment, use the ExecutionEnvironment training option.

options = trainingOptions("adam",MaxEpochs=10,Metrics="accuracy");
net = trainnet(XTrain,labelsTrain,layers,"crossentropy",options);

Iteration    Epoch    TimeElapsed    LearnRate    TrainingLoss    TrainingAccuracy
_________    _____    ___________    _________    ____________    ________________
        1        1       00:00:01        0.001          2.6767              10.156
       50        2       00:00:05        0.001         0.68513              74.219
      100        3       00:00:08        0.001         0.46812              86.719
      150        4       00:00:11        0.001         0.24365              91.406
      200        6       00:00:14        0.001        0.095949              99.219
      250        7       00:00:17        0.001         0.04571                 100
      300        8       00:00:20        0.001        0.050645                 100
      350        9       00:00:23        0.001         0.03325                 100
      390       10       00:00:25        0.001        0.032926                 100

Training stopped: Max epochs completed
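The training run above uses the default execution environment. As a sketch of how to choose the environment explicitly, the following code creates training options that request the CPU. Using "cpu" here is an illustrative assumption, not the setting used to produce the results in this example.

```matlab
% Same training options as above, but with an explicit execution
% environment ("cpu" is chosen here only for illustration).
optionsCPU = trainingOptions("adam", ...
    MaxEpochs=10, ...
    Metrics="accuracy", ...
    ExecutionEnvironment="cpu");

% Inspect the selected environment.
environment = optionsCPU.ExecutionEnvironment;
```

Passing optionsCPU to trainnet in place of options would then train on the CPU even if a supported GPU is available.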

Load the test data.

Test the neural network using the testnet function. For single-label classification, evaluate the accuracy. By default, the testnet function uses a GPU if one is available. To select the execution environment manually, use the ExecutionEnvironment argument of the testnet function.

accuracy = testnet(net,XTest,labelsTest,"accuracy")

See Also

trainnet | trainingOptions | dlnetwork | functionLayer | checkLayer | setLearnRateFactor | setL2Factor | getLearnRateFactor | getL2Factor | findPlaceholderLayers | replaceLayer | PlaceholderLayer | networkDataLayout