resnetNetwork - 2-D residual neural network - MATLAB

2-D residual neural network

Since R2024a

Syntax

Description

net = resnetNetwork(inputSize,numClasses) creates a 2-D residual neural network with the specified image input size and number of classes.

To create a 3-D residual network, use resnet3dNetwork.


net = resnetNetwork(inputSize,numClasses,Name=Value) specifies additional options using one or more name-value arguments. For example, BottleneckType="none" returns a 2-D residual neural network without bottleneck components.


Examples


Create Residual Network

Create a residual network with a bottleneck architecture.

imageSize = [224 224 3];
numClasses = 10;

net = resnetNetwork(imageSize,numClasses)

net =
  dlnetwork with properties:

         Layers: [176x1 nnet.cnn.layer.Layer]
    Connections: [191x2 table]
     Learnables: [214x3 table]
          State: [106x3 table]
     InputNames: {'input'}
    OutputNames: {'softmax'}
    Initialized: 1

View summary with summary.

Analyze the network using the analyzeNetwork function. Note that this network is equivalent to a ResNet-50 residual neural network.
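For example, you can open the network in the analyzer directly from the workspace by passing the dlnetwork object:

analyzeNetwork(net)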

Residual Network with Custom Stack Depth

Create a ResNet-101 network using a custom stack depth.

imageSize = [224 224 3];
numClasses = 10;

stackDepth = [3 4 23 3];
numFilters = [64 128 256 512];

net = resnetNetwork(imageSize,numClasses, ...
    StackDepth=stackDepth, ...
    NumFilters=numFilters)

net =
  dlnetwork with properties:

         Layers: [346x1 nnet.cnn.layer.Layer]
    Connections: [378x2 table]
     Learnables: [418x3 table]
          State: [208x3 table]
     InputNames: {'input'}
    OutputNames: {'softmax'}
    Initialized: 1

View summary with summary.

Analyze the network.

Input Arguments


inputSize — Network image input size

vector of positive integers

Network image input size, specified as one of these values:

- 2-element vector in the form [height width].
- 3-element vector in the form [height width depth], where depth is the number of channels. Set depth to 3 for RGB images and to 1 for grayscale images.

The permitted values of inputSize depend on the InitialPoolingLayer argument.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

numClasses — Number of classes for classification tasks

positive integer

Number of classes for classification tasks, specified as a positive integer.

The function returns a neural network for classification tasks with the specified number of classes by setting the output size of the last fully connected layer to numClasses.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
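For example, you can confirm that the classification head matches numClasses. This sketch assumes the layer order shown in Block Layers, where the fully connected layer immediately precedes the final softmax layer:

net = resnetNetwork([224 224 3],10);
fc = net.Layers(end-1);   % assumed: fullyConnectedLayer before the softmax layer
fc.OutputSize             % 10, matching numClasses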

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: net = resnetNetwork(inputSize,numClasses,BottleneckType="none") returns a 2-D residual neural network without bottleneck components.

Initial Layers


InitialFilterSize — Filter size in first convolutional layer

7 (default) | positive integer | vector of positive integers

Filter size in the first convolutional layer, specified as one of these values:

- Positive integer: The filter has equal height and width.
- Vector of two positive integers in the form [height width]: The filter has the specified height and width.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

InitialNumFilters — Number of filters in first convolutional layer

64 (default) | positive integer

Number of filters in the first convolutional layer, specified as a positive integer. The number of initial filters determines the number of channels (feature maps) in the output of the first convolutional layer in the residual network.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

InitialStride — Stride in first convolutional layer

2 (default) | positive integer | vector of positive integers

Stride in the first convolutional layer, specified as one of these values:

- Positive integer: The stride has equal vertical and horizontal step sizes.
- Vector of two positive integers in the form [vertical horizontal]: The stride has the specified vertical and horizontal step sizes.

The stride defines the step size for traversing the input vertically and horizontally.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
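For example, this sketch combines the initial-layer options using illustrative values (it also sets the InitialPoolingLayer option, described next):

net = resnetNetwork([224 224 3],10, ...
    InitialFilterSize=5, ...
    InitialNumFilters=32, ...
    InitialStride=1, ...
    InitialPoolingLayer="none");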

InitialPoolingLayer — First pooling layer

"max" (default) | "average" | "none"

First pooling layer before the initial residual block, specified as one of these values:

- "max": Use a max pooling layer before the initial residual block.
- "average": Use an average pooling layer before the initial residual block.
- "none": Do not use an initial pooling layer.

Network Architecture


ResidualBlockType — Residual block type

"batchnorm-before-add" (default) | "batchnorm-after-add"

Residual block type, specified as one of these values:

- "batchnorm-before-add": Add the batch normalization layer before the addition layer in the residual blocks.
- "batchnorm-after-add": Add the batch normalization layer after the addition layer in the residual blocks.

The ResidualBlockType argument specifies the location of the batch normalization layer in the standard and downsampling residual blocks. For more information, see Residual Network.
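For example, to place the batch normalization layers after the addition layers instead of before them:

net = resnetNetwork([224 224 3],10,ResidualBlockType="batchnorm-after-add");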

BottleneckType — Block bottleneck type

"downsample-first-conv" (default) | "none"

Block bottleneck type, specified as one of these values:

- "downsample-first-conv": Use bottleneck residual blocks that perform downsampling in the first convolutional layer of the downsampling residual blocks.
- "none": Use residual blocks without bottleneck components.

A bottleneck block reduces the number of channels by a factor of four by performing a convolution with filters of size 1 before performing convolution with filters of size 3. Networks with and without bottleneck blocks have a similar level of computational complexity, but the total number of features propagating in the residual connections is four times larger when you use bottleneck units. Therefore, using a bottleneck increases the efficiency of the network [1].

For more information on the layers in each residual block, see Residual Network.
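For example, this sketch of a ResNet-18-style configuration turns off the bottleneck and uses two blocks per stack (the StackDepth and NumFilters arguments are described below):

net = resnetNetwork([224 224 3],10, ...
    BottleneckType="none", ...
    StackDepth=[2 2 2 2], ...
    NumFilters=[64 128 256 512]);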

StackDepth — Number of residual blocks in each stack

[3 4 6 3] (default) | vector of positive integers

Number of residual blocks in each stack, specified as a vector of positive integers.

For example, if the stack depth is [3 4 6 3], the network has four stacks, with three blocks, four blocks, six blocks, and three blocks.

Specify the number of filters in the convolutional layers of each stack using the NumFilters argument. StackDepth must have the same number of elements as NumFilters.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

NumFilters — Number of filters in convolutional layers of each stack

[64 128 256 512] (default) | vector of positive integers

Number of filters in the convolutional layers of each stack, specified as a vector of positive integers.

NumFilters must have the same number of elements asStackDepth.

The NumFilters value determines the layers on the residual connection in the initial residual block. The residual connection has a convolutional layer when you meet one of these conditions:

- BottleneckType is "downsample-first-conv" and InitialNumFilters is not equal to four times the first element of NumFilters.
- BottleneckType is "none" and InitialNumFilters is not equal to the first element of NumFilters.

For more information about the layers in each residual block, see Residual Network.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Normalization — Data normalization to apply

"zerocenter" (default) | "zscore"

Data normalization to apply every time data forward-propagates through the input layer, specified as one of these options:

- "zerocenter": Subtract the mean.
- "zscore": Subtract the mean and divide by the standard deviation.

The trainnet function automatically calculates the mean and standard deviation of the training data.
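For example, to use z-score normalization instead of the default zero-center normalization:

net = resnetNetwork([224 224 3],10,Normalization="zscore");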

Initialize — Flag to initialize learnable parameters

true or 1 (default) | false or 0

Flag to initialize learnable parameters, specified as a logical 1 (true) or 0 (false).
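For example, this sketch creates an uninitialized network and initializes it afterward with the initialize function, which can be useful if you modify the network before training:

net = resnetNetwork([224 224 3],10,Initialize=false);
net = initialize(net);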

Output Arguments


net — Residual neural network

dlnetwork object

Residual neural network, returned as a dlnetwork object.

More About


Residual Network

Residual networks (ResNets) are a type of deep network consisting of building blocks that have residual connections (also known as skip or shortcut connections). These connections allow the input to skip the convolutional units of the main branch, thus providing a simpler path through the network. By allowing the parameter gradients to flow more easily from the final layers to the earlier layers of the network, residual connections mitigate the problem of vanishing gradients during early training.
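As a minimal sketch (not the resnetNetwork implementation; the layer names and sizes are illustrative), you can build a residual connection by connecting a second, skip path into an addition layer:

layers = [
    imageInputLayer([32 32 3],Name="in")
    convolution2dLayer(3,3,Padding="same",Name="conv")   % preserves the channel count
    batchNormalizationLayer(Name="bn")
    reluLayer(Name="relu")
    additionLayer(2,Name="add")];
net = dlnetwork(layers,Initialize=false);
net = connectLayers(net,"in","add/in2");   % the residual (skip) connection
net = initialize(net);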

The structure of a residual network is flexible. The key component is the inclusion of the residual connections within residual blocks. A group of residual blocks is called a stack. A ResNet architecture consists of initial layers, followed by stacks containing residual blocks, and then the final layers. A network has three types of residual blocks: initial, standard, and downsampling. For the layers in each block type, see Block Layers below.

A typical stack has a downsampling residual block, followed by m standard residual blocks, where m is a positive integer. The first stack is the only stack that begins with an initial residual block.

Diagram showing N stacks connected in series.

The initial, standard, and downsampling residual blocks can be bottleneck or nonbottleneck blocks.

A bottleneck block reduces the number of channels by a factor of four by performing a convolution with filters of size 1 before performing convolution with filters of size 3. Networks with and without bottleneck blocks have a similar level of computational complexity, but the total number of features propagating in the residual connections is four times larger when you use bottleneck units. Therefore, using a bottleneck increases the efficiency of the network [1].

The options you set determine the layers inside each block.

Block Layers

Initial Layers

A residual network starts with these layers, in order:

1. imageInputLayer (or image3dInputLayer for 3-D networks)
2. convolution2dLayer (or convolution3dLayer for 3-D networks)
3. batchNormalizationLayer
4. reluLayer
5. (Optional) maxPooling2dLayer or averagePooling2dLayer (or maxPooling3dLayer or averagePooling3dLayer for 3-D networks)

Set the optional pooling layer using the InitialPoolingLayer option.

Initial Residual Block

The main branch of the initial residual block has the same layers as a standard residual block. The InitialNumFilters and NumFilters values determine the layers on the residual connection. The residual connection has a convolutional layer with filters and a stride of size 1 when you meet one of these conditions:

- BottleneckType is "downsample-first-conv" and InitialNumFilters is not equal to four times the first element of NumFilters.
- BottleneckType is "none" and InitialNumFilters is not equal to the first element of NumFilters.

If ResidualBlockType is "batchnorm-before-add", then the residual connection also has a batch normalization layer.

Standard Residual Block (BottleneckType="downsample-first-conv")

The standard residual block with bottleneck units has these layers, in order:

1. convolution2dLayer (or convolution3dLayer for 3-D networks) with filters and a stride of size 1
2. batchNormalizationLayer
3. reluLayer
4. convolution2dLayer (or convolution3dLayer for 3-D networks) with filters of size 3 and a stride of size 1
5. batchNormalizationLayer
6. reluLayer
7. convolution2dLayer (or convolution3dLayer for 3-D networks) with filters and a stride of size 1
8. batchNormalizationLayer
9. additionLayer
10. reluLayer

The standard block has a residual connection from the output of the previous block to the addition layer. Set the position of the addition layer using the ResidualBlockType argument.

Standard Residual Block (BottleneckType="none")

The standard residual block without bottleneck units has these layers, in order:

1. convolution2dLayer (or convolution3dLayer for 3-D networks) with filters of size 3 and a stride of size 1
2. batchNormalizationLayer
3. reluLayer
4. convolution2dLayer (or convolution3dLayer for 3-D networks) with filters of size 3 and a stride of size 1
5. batchNormalizationLayer
6. additionLayer
7. reluLayer

The standard block has a residual connection from the output of the previous block to the addition layer. Set the position of the addition layer using the ResidualBlockType argument.

Downsampling Residual Block

The downsampling residual block is the same as the standard block (either with or without the bottleneck) but with a stride of size 2 in the first convolutional layer and additional layers on the residual connection. The layers on the residual connection depend on the value of ResidualBlockType:

- If ResidualBlockType is "batchnorm-before-add", then the second branch contains a convolution2dLayer (or convolution3dLayer for 3-D networks) with filters of size 1 and a stride of size 2, and a batchNormalizationLayer.
- If ResidualBlockType is "batchnorm-after-add", then the second branch contains a convolution2dLayer (or convolution3dLayer for 3-D networks) with filters of size 1 and a stride of size 2.

The downsampling block halves the height and width of the input, and increases the number of channels.

Final Layers

A residual network ends with these layers, in order:

1. globalAveragePooling2dLayer (or globalAveragePooling3dLayer for 3-D networks)
2. fullyConnectedLayer
3. softmaxLayer

The convolution and fully connected layer weights are initialized using the He weight initialization method [3].
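The same initializer is available when you construct layers manually. For example, this illustrative line sets He initialization on a standalone convolutional layer:

layer = convolution2dLayer(3,64,WeightsInitializer="he");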

Tips

References

[1] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep Residual Learning for Image Recognition.” Preprint, submitted December 10, 2015. https://arxiv.org/abs/1512.03385.

[2] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Identity Mappings in Deep Residual Networks.” Preprint, submitted July 25, 2016. https://arxiv.org/abs/1603.05027.

[3] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." In Proceedings of the 2015 IEEE International Conference on Computer Vision, 1026–34. Washington, DC: IEEE Computer Vision Society, 2015.

Version History

Introduced in R2024a