dlconv - Deep learning convolution - MATLAB

Deep learning convolution

Syntax

Description

The convolution operation applies sliding filters to the input data. Use the dlconv function for deep learning convolution, grouped convolution, and channel-wise separable convolution.

The dlconv function applies the deep learning convolution operation to dlarray data. Using dlarray objects makes working with high-dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

[Y](#mw%5F03ed747a-bbf8-4160-bac0-0222956746ac) = dlconv([X](#mw%5Fa3925361-6389-4fa7-ac9c-019d4725b9e3),[weights](#mw%5Fdae295dd-fe2a-4d76-a205-d5c094e302f5),[bias](#mw%5F3a85528d-dae8-4b8a-badf-58dcfafbc599)) applies the deep learning convolution operation to the formatted dlarray object X. The function uses sliding convolutional filters defined by weights and adds the constant bias. The output Y is a formatted dlarray object with the same format as X.

The function, by default, convolves over up to three dimensions of X labeled "S" (spatial). To convolve over dimensions labeled "T" (time), specify weights with a "T" dimension using a formatted dlarray object or by using the WeightsFormat option.

For unformatted input data, use the DataFormat option.

[Y](#mw%5F03ed747a-bbf8-4160-bac0-0222956746ac) = dlconv([X](#mw%5Fa3925361-6389-4fa7-ac9c-019d4725b9e3),[weights](#mw%5Fdae295dd-fe2a-4d76-a205-d5c094e302f5),[bias](#mw%5F3a85528d-dae8-4b8a-badf-58dcfafbc599),DataFormat=FMT) applies the deep learning convolution operation to the unformatted dlarray object X with format specified by FMT. The output Y is an unformatted dlarray object with dimensions in the same order as X. For example, DataFormat="SSCB" specifies data for 2-D convolution with format "SSCB" (spatial, spatial, channel, batch).
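As a minimal sketch of the two calling styles described above, the following compares a formatted input against the same data passed unformatted with DataFormat. The sizes (16 images, 8 filters) and the variable names are illustrative assumptions, not values taken from this page.

```matlab
% Formatted input: the dimension labels travel with the data.
Xf = dlarray(rand(28,28,3,16),"SSCB");
weights = rand(3,3,3,8);          % 3-by-3 filters, 3 input channels, 8 filters
bias = zeros(1,8);
Yf = dlconv(Xf,weights,bias);

% Unformatted input: describe the layout with DataFormat instead.
Xu = dlarray(extractdata(Xf));
Yu = dlconv(Xu,weights,bias,DataFormat="SSCB");

fmtFormatted = dims(Yf)           % labels of the formatted output
fmtUnformatted = dims(Yu)         % empty, because Yu carries no labels

% Both calls should compute the same values.
maxDiff = max(abs(extractdata(Yf) - extractdata(Yu)),[],"all")
```

Formatting the dlarray once is usually more convenient when the same data passes through several operations; DataFormat suits one-off calls on raw arrays.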

[Y](#mw%5F03ed747a-bbf8-4160-bac0-0222956746ac) = dlconv(___,[Name=Value](#namevaluepairarguments)) specifies options using one or more name-value arguments, in addition to the input arguments in any of the previous syntaxes. For example, WeightsFormat="TCU" specifies weights for 1-D convolution with format "TCU" (time, channel, unspecified).

Examples

Perform 2-D Convolution

Create a formatted dlarray object containing a batch of 128 28-by-28 images with 3 channels. Specify the format "SSCB" (spatial, spatial, channel, batch).

miniBatchSize = 128; inputSize = [28 28]; numChannels = 3; X = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize); X = dlarray(X,"SSCB");

View the size and format of the input data.
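The commands for this step do not appear on this page. A minimal sketch, assuming the standard size and dims functions for dlarray objects, is:

```matlab
% Recreate the input so that this snippet runs on its own.
X = dlarray(rand(28,28,3,128),"SSCB");

sz = size(X)     % 28 28 3 128
fmt = dims(X)    % 'SSCB'
```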

Initialize the weights and bias for 2-D convolution. For the weights, specify 64 3-by-3 filters. For the bias, specify a vector of zeros.

filterSize = [3 3]; numFilters = 64; weights = rand(filterSize(1),filterSize(2),numChannels,numFilters); bias = zeros(1,numFilters);

Apply 2-D convolution using the dlconv function.

Y = dlconv(X,weights,bias);

View the size and format of the output.
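The commands for this step do not appear on this page. The sketch below repeats the setup so that it runs on its own, then inspects the result. With no padding and a stride of 1, each spatial dimension shrinks from 28 to 28 - 3 + 1 = 26, and the channel dimension becomes the number of filters.

```matlab
X = dlarray(rand(28,28,3,128),"SSCB");
weights = rand(3,3,3,64);      % 64 filters of size 3-by-3 over 3 channels
bias = zeros(1,64);

Y = dlconv(X,weights,bias);

sz = size(Y)     % 26 26 64 128
fmt = dims(Y)    % 'SSCB'
```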

Perform Grouped Convolution

Convolve the input data in three groups of two channels each. Apply four filters per group.

Create the input data as 10 observations of size 100-by-100 with six channels.

height = 100; width = 100; channels = 6; numObservations = 10;

X = rand(height,width,channels,numObservations); X = dlarray(X,"SSCB");

Initialize the convolutional filters. Specify three groups of convolutions that each apply four convolution filters to two channels of the input data.

filterHeight = 8; filterWidth = 8; numChannelsPerGroup = 2; numFiltersPerGroup = 4; numGroups = 3;

weights = rand(filterHeight,filterWidth,numChannelsPerGroup,numFiltersPerGroup,numGroups);

Initialize the bias term.

bias = rand(numFiltersPerGroup*numGroups,1);

Perform the convolution.

Y = dlconv(X,weights,bias); size(Y)

The 12 channels of the convolution output represent the three groups of convolutions with four filters per group.
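As a self-contained check of this result, the sketch below recomputes the grouped convolution and compares the output size with the expected values. The names expectedSpatial and expectedChannels are introduced here for illustration only.

```matlab
X = dlarray(rand(100,100,6,10),"SSCB");
weights = rand(8,8,2,4,3);     % 8-by-8 filters, 2 channels/group, 4 filters/group, 3 groups
bias = rand(4*3,1);

Y = dlconv(X,weights,bias);

expectedSpatial = 100 - 8 + 1;   % 93: no padding, stride 1
expectedChannels = 4*3;          % 12: filters per group times number of groups
sz = size(Y)                     % 93 93 12 10
```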

Perform Channel-Wise Separable Convolution

Separate the input data into channels and perform convolution on each channel separately.

Create the input data as a single observation with a size of 64-by-64 and 10 channels. Create the data as an unformatted dlarray.

height = 64; width = 64; numChannels = 10;

X = rand(height,width,numChannels); X = dlarray(X);

Initialize the convolutional filters. Specify a grouped convolution with one channel and one filter per group, and set the number of groups equal to the number of input channels (10), so that each channel is convolved with its own filter.

filterHeight = 8; filterWidth = 8; numChannelsPerGroup = 1; numFiltersPerGroup = 1; numGroups = numChannels;

weights = rand(filterHeight,filterWidth,numChannelsPerGroup,numFiltersPerGroup,numGroups);

Initialize the bias term.

bias = rand(numFiltersPerGroup*numGroups,1);

Perform the convolution. Specify the dimension labels of the input data using the DataFormat option.

Y = dlconv(X,weights,bias,DataFormat="SSC"); size(Y)

Each channel is convolved separately, so there are 10 channels in the output.
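To illustrate what "convolved separately" means, the sketch below repeats the example and then recomputes one output channel from the matching input channel, filter, and bias element alone. It assumes that group k reads input channel k and writes output channel k when there is one channel and one filter per group; the index k = 4 is arbitrary.

```matlab
numChannels = 10;
X = dlarray(rand(64,64,numChannels));
weights = rand(8,8,1,1,numChannels);   % one 8-by-8 filter per channel
bias = rand(numChannels,1);

Y = dlconv(X,weights,bias,DataFormat="SSC");
sz = size(Y)                            % 57 57 10

% Recompute channel k on its own with an ordinary single-channel convolution.
k = 4;
Yk = dlconv(X(:,:,k),weights(:,:,1,1,k),bias(k),DataFormat="SSC");
maxDiff = max(abs(extractdata(Y(:,:,k)) - extractdata(Yk)),[],"all")
```

A maxDiff at the level of floating-point round-off indicates that no information is mixed across channels.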

Perform 1-D Convolution

Create a formatted dlarray object containing 128 sequences of length 512 with 5 features each. Specify the format "CBT" (channel, batch, time).

numChannels = 5; miniBatchSize = 128; sequenceLength = 512; X = rand(numChannels,miniBatchSize,sequenceLength); X = dlarray(X,"CBT");

Initialize the weights and bias for 1-D convolution. For the weights, specify 64 filters with a filter size of 3. For the bias, specify a vector of zeros.

filterSize = 3; numFilters = 64; weights = rand(filterSize,numChannels,numFilters); bias = zeros(1,numFilters);

Apply 1-D convolution using the dlconv function. To convolve over the "T" (time) dimension of the input data, specify the weights format "TCU" (time, channel, unspecified) using the WeightsFormat option.

Y = dlconv(X,weights,bias,WeightsFormat="TCU");

View the size and format of the output.
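The commands for this step do not appear on this page. The sketch below repeats the setup and inspects the result. The time dimension shrinks from 512 to 512 - 3 + 1 = 510, and the output keeps the "CBT" ordering of the input.

```matlab
X = dlarray(rand(5,128,512),"CBT");
weights = rand(3,5,64);        % filter size 3, 5 channels, 64 filters
bias = zeros(1,64);

Y = dlconv(X,weights,bias,WeightsFormat="TCU");

sz = size(Y)     % 64 128 510
fmt = dims(Y)    % 'CBT'
```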

Input Arguments

`X` — Input data

dlarray | numeric array

Input data, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

If X is an unformatted dlarray or a numeric array, then you must specify the format using the DataFormat option. If X is a numeric array, then either weights or bias must be a dlarray object.
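A minimal sketch of the numeric-input case follows. Here X is a plain numeric array, so the weights are wrapped in a dlarray and the layout is given with DataFormat. The sizes are illustrative assumptions.

```matlab
X = rand(28,28,3,16);                  % plain numeric array, no labels
weights = dlarray(rand(3,3,3,8));      % at least one of weights or bias is a dlarray
bias = zeros(1,8);

Y = dlconv(X,weights,bias,DataFormat="SSCB");

outClass = class(Y)                    % output is a dlarray
sz = size(Y)                           % 26 26 8 16
```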

`weights` — Convolutional filters

dlarray | numeric array

Convolutional filters, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

The size and format of the weights depend on the type of task. If weights is an unformatted dlarray or a numeric array, then the size and shape of weights depend on the WeightsFormat option.

The following table describes the size and format of the weights for various tasks. You can specify an array with the dimensions in any order using formatted dlarray objects or by using the WeightsFormat option. When the weights have multiple dimensions with the same label (for example, multiple dimensions labeled "S"), those dimensions must be ordered as described in this table.

Task	Required Dimensions	Size	Example Weights	Example Format
1-D convolution	"S" (spatial) or "T" (time)	Filter size	filterSize-by-numChannels-by-numFilters array, where filterSize is the size of the 1-D filters, numChannels is the number of channels of the input data, and numFilters is the number of filters.	"SCU" (spatial, channel, unspecified)
	"C" (channel)	Number of channels
	"U" (unspecified)	Number of filters
1-D grouped convolution	"S" (spatial) or "T" (time)	Filter size	filterSize-by-numChannelsPerGroup-by-numFiltersPerGroup-by-numGroups array, where filterSize is the size of the 1-D filters, numChannelsPerGroup is the number of channels per group of the input data, numFiltersPerGroup is the number of filters per group, and numGroups is the number of groups. numChannelsPerGroup must equal the number of channels of the input data divided by numGroups.	"SCUU" (spatial, channel, unspecified, unspecified)
	"C" (channel)	Number of channels per group
	First "U" (unspecified)	Number of filters per group
	Second "U" (unspecified)	Number of groups
2-D convolution	First "S" (spatial)	Filter height	filterSize(1)-by-filterSize(2)-by-numChannels-by-numFilters array, where filterSize(1) and filterSize(2) are the height and width of the 2-D filters, respectively, numChannels is the number of channels of the input data, and numFilters is the number of filters.	"SSCU" (spatial, spatial, channel, unspecified)
	Second "S" (spatial) or "T" (time)	Filter width
	"C" (channel)	Number of channels
	"U" (unspecified)	Number of filters
2-D grouped convolution	First "S" (spatial)	Filter height	filterSize(1)-by-filterSize(2)-by-numChannelsPerGroup-by-numFiltersPerGroup-by-numGroups array, where filterSize(1) and filterSize(2) are the height and width of the 2-D filters, respectively, numChannelsPerGroup is the number of channels per group of the input data, numFiltersPerGroup is the number of filters per group, and numGroups is the number of groups. numChannelsPerGroup must equal the number of channels of the input data divided by numGroups.	"SSCUU" (spatial, spatial, channel, unspecified, unspecified)
	Second "S" (spatial) or "T" (time)	Filter width
	"C" (channel)	Number of channels per group
	First "U" (unspecified)	Number of filters per group
	Second "U" (unspecified)	Number of groups
3-D convolution	First "S" (spatial)	Filter height	filterSize(1)-by-filterSize(2)-by-filterSize(3)-by-numChannels-by-numFilters array, where filterSize(1), filterSize(2), and filterSize(3) are the height, width, and depth of the 3-D filters, respectively, numChannels is the number of channels of the input data, and numFilters is the number of filters.	"SSSCU" (spatial, spatial, spatial, channel, unspecified)
	Second "S" (spatial)	Filter width
	Third "S" (spatial) or "T" (time)	Filter depth
	"C" (channel)	Number of channels
	"U" (unspecified)	Number of filters

For channel-wise separable (also known as depth-wise separable) convolution, use grouped convolution with the number of groups equal to the number of channels.
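The statement that weights can be stored in any dimension order can be sketched as follows. The same filters are passed once in the default "SSCU" layout and once permuted to filters-by-channels-by-height-by-width with WeightsFormat="UCSS". The sizes and the names W and Wp are illustrative assumptions.

```matlab
X = dlarray(rand(28,28,3,4),"SSCB");

W = rand(3,3,3,8);               % default layout: height, width, channels, filters
Wp = permute(W,[4 3 1 2]);       % reordered: filters, channels, height, width

Y1 = dlconv(X,W,0);                          % relies on the default weights format
Y2 = dlconv(X,Wp,0,WeightsFormat="UCSS");    % describes the reordered layout

% Both calls describe the same filters, so the outputs should agree.
maxDiff = max(abs(extractdata(Y1) - extractdata(Y2)),[],"all")
```

Note that the two "S" dimensions keep their relative order (height before width) in both layouts, as the table requires.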

Tip

The function, by default, convolves over up to three dimensions of X labeled "S" (spatial). To convolve over dimensions labeled "T" (time), specify weights with a "T" dimension using a formatted dlarray object or by using the WeightsFormat option.

`bias` — Bias constant

dlarray | numeric vector | numeric scalar

Bias constant, specified as a formatted dlarray, an unformatted dlarray, a numeric vector, or a numeric scalar.

If bias is a scalar, then the same bias is applied to each output.
If bias has a nonsingleton dimension, then each element of bias is the bias applied to the corresponding convolutional filter specified by weights. The number of elements of bias must match the number of filters specified by weights.
If bias is 0, then the bias term is disabled and no bias is added during the convolution operation.

If bias is a formatted dlarray, then the nonsingleton dimension must be a channel dimension with label "C" (channel).
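The three bias cases can be sketched with all-ones data, where every unbiased output element is simply the number of filter elements (9 for a 3-by-3 filter). The values 0.5 and [1 2] are arbitrary choices for illustration.

```matlab
X = dlarray(ones(5,5,1,1),"SSCB");
weights = ones(3,3,1,2);                  % two 3-by-3 all-ones filters, one input channel

Y0 = extractdata(dlconv(X,weights,0));        % no bias: every element is 9
Ys = extractdata(dlconv(X,weights,0.5));      % scalar bias: every element is 9.5
Yv = extractdata(dlconv(X,weights,[1 2]));    % per-filter bias: channel 1 is 10, channel 2 is 11
```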

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: DilationFactor=2 sets the dilation factor for each convolutional filter to 2.

`DataFormat` — Description of data dimensions

character vector | string scalar

Description of the data dimensions, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" once each, at most. The software ignores singleton trailing "U" dimensions after the second dimension.

If the input data is not a formatted dlarray object, then you must specify the DataFormat option.

For more information, see Deep Learning Data Formats.

Data Types: char | string

`WeightsFormat` — Description of weights dimensions

character vector | string scalar

Description of weights dimensions, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding dimension of the data.

The characters are:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

The default value of WeightsFormat depends on the task:

Task	Default
1-D convolution	"SCU" (spatial, channel, unspecified)
1-D grouped convolution	"SCUU" (spatial, channel, unspecified, unspecified)
2-D convolution	"SSCU" (spatial, spatial, channel, unspecified)
2-D grouped convolution	"SSCUU" (spatial, spatial, channel, unspecified, unspecified)
3-D convolution	"SSSCU" (spatial, spatial, spatial, channel, unspecified)

The supported combinations of dimension labels depend on the type of convolution. For more information, see the weights argument.

For more information, see Deep Learning Data Formats.

Data Types: char | string

`Stride` — Step size for traversing input data

1 (default) | numeric scalar | numeric vector

Step size for traversing the input data, specified as a numeric scalar or numeric vector.

To use the same step size for all convolution dimensions, specify the stride as a scalar. To specify a different value for each convolution dimension, specify the stride as a vector with elements ordered to match the dimension labels in the data format.
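A short sketch of scalar and vector strides follows. Without padding, the output size along a convolution dimension is floor((inputSize - filterSize)/stride) + 1, which is the standard relation for strided convolution rather than a formula quoted from this page. The sizes are illustrative assumptions.

```matlab
X = dlarray(rand(28,28,3,4),"SSCB");
weights = rand(3,3,3,8);

% Same stride in both spatial dimensions.
Y1 = dlconv(X,weights,0,Stride=2);
sz1 = size(Y1)                         % 13 13 8 4

% Stride 2 along the first "S" dimension, stride 1 along the second.
Y2 = dlconv(X,weights,0,Stride=[2 1]);
sz2 = size(Y2)                         % 13 26 8 4

% The same sizes from the formula.
expected = floor(([28 28] - [3 3])./[2 1]) + 1   % 13 26
```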

`DilationFactor` — Filter dilation factor

1 (default) | numeric scalar | numeric vector

Filter dilation factor, specified as a numeric scalar or numeric vector.

To use the same dilation factor for all convolution dimensions, specify the dilation factor as a scalar. To specify a different value for each convolution dimension, specify the dilation factor as a vector with elements ordered to match the dimension labels in the data format.

Use the dilation factor to increase the receptive field of the filter (the area of the input that the filter can see) on the input data. Using a dilation factor corresponds to an effective filter size of filterSize + (filterSize-1)*(dilationFactor-1).
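The effective filter size formula can be checked numerically. In this sketch, a 3-by-3 filter with dilation factors 2 and 3 behaves like a 5-by-7 filter, so a 28-by-28 input without padding yields a 24-by-22 output. The sizes and the variable name effectiveSize are illustrative assumptions.

```matlab
filterSize = [3 3];
dilationFactor = [2 3];

% Formula from the text above, applied per convolution dimension.
effectiveSize = filterSize + (filterSize-1).*(dilationFactor-1)   % 5 7

X = dlarray(rand(28,28,3,4),"SSCB");
weights = rand(filterSize(1),filterSize(2),3,8);
Y = dlconv(X,weights,0,DilationFactor=dilationFactor);

sz = size(Y)                            % 24 22 8 4
expected = [28 28] - effectiveSize + 1  % 24 22
```

Dilation enlarges the receptive field without adding weights: the filter still holds 3-by-3 learnable values per channel.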

`Padding` — Size of padding

Size of padding applied to the "S" and "T" dimensions given by the format of the weights, specified as one of the following:

"same" — Apply padding such that the output dimension sizes are ceil(inputSize/stride), where inputSize is the size of the corresponding input dimension. When Stride is 1, the output is the same size as the input.
"causal" – Apply left padding with size (FilterSize - 1) .* DilationFactor. This option supports convolving over a single time or spatial dimension only. When Stride is 1, the output is the same size as the input.
Nonnegative integer sz — Add padding of size sz to both ends of the "S" or "T" dimensions given by the format of the weights.
Vector of integers sz — Add padding of size sz(i) to both ends of the ith "S" or "T" dimensions given by the format of the weights. The number of elements of sz must match the number of "S" or "T" dimensions of the weights.
Matrix of integers sz — Add padding of size sz(1,i) and sz(2,i) to the start and end of the ith "S" or "T" dimensions given by the format of the weights. For example, for 2-D input, [t l; b r] applies padding of size t, b, l, and r to the top, bottom, left, and right of the input, respectively.
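The padding options listed above can be compared by looking at the output sizes they produce. This is a sketch with illustrative sizes; the 2-D cases use a 28-by-28 input with 3-by-3 filters, and the causal case uses a length-10 sequence with a filter of size 3.

```matlab
X = dlarray(rand(28,28,3,4),"SSCB");
weights = rand(3,3,3,8);

Ysame = dlconv(X,weights,0,Padding="same",Stride=2);
szSame = size(Ysame)       % 14 14 8 4, that is ceil(28/2) per spatial dimension

Yscalar = dlconv(X,weights,0,Padding=1);
szScalar = size(Yscalar)   % 28 28 8 4, one element added on every side

Yvector = dlconv(X,weights,0,Padding=[1 2]);
szVector = size(Yvector)   % 28 30 8 4, 1 on top and bottom, 2 on left and right

Ymatrix = dlconv(X,weights,0,Padding=[1 2; 3 4]);
szMatrix = size(Ymatrix)   % 30 32 8 4, top 1, bottom 3, left 2, right 4

% Causal padding applies to a single time or spatial dimension.
Xt = dlarray(rand(5,4,10),"CBT");
wt = rand(3,5,6);          % filter size 3, 5 channels, 6 filters
Ycausal = dlconv(Xt,wt,0,WeightsFormat="TCU",Padding="causal");
szCausal = size(Ycausal)   % 6 4 10, sequence length preserved
```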

`PaddingValue` — Value to pad data

0 (default) | scalar | "symmetric-include-edge" | "symmetric-exclude-edge" | "replicate"

Value to pad data, specified as one of these values:

PaddingValue	Description	Example
Scalar	Pad with the specified scalar value.	[3 1 4; 1 5 9; 2 6 5] → [0 0 0 0 0 0 0; 0 0 0 0 0 0 0; 0 0 3 1 4 0 0; 0 0 1 5 9 0 0; 0 0 2 6 5 0 0; 0 0 0 0 0 0 0; 0 0 0 0 0 0 0]
"symmetric-include-edge"	Pad using mirrored values of the input, including the edge values.	[3 1 4; 1 5 9; 2 6 5] → [5 1 1 5 9 9 5; 1 3 3 1 4 4 1; 1 3 3 1 4 4 1; 5 1 1 5 9 9 5; 6 2 2 6 5 5 6; 6 2 2 6 5 5 6; 5 1 1 5 9 9 5]
"symmetric-exclude-edge"	Pad using mirrored values of the input, excluding the edge values.	[3 1 4; 1 5 9; 2 6 5] → [5 6 2 6 5 6 2; 9 5 1 5 9 5 1; 4 1 3 1 4 1 3; 9 5 1 5 9 5 1; 5 6 2 6 5 6 2; 9 5 1 5 9 5 1; 4 1 3 1 4 1 3]
"replicate"	Pad using repeated border elements of the input.	[3 1 4; 1 5 9; 2 6 5] → [3 3 3 1 4 4 4; 3 3 3 1 4 4 4; 3 3 3 1 4 4 4; 1 1 1 5 9 9 9; 2 2 2 6 5 5 5; 2 2 2 6 5 5 5; 2 2 2 6 5 5 5]

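One way to see the padded data directly is to convolve with a single 1-by-1 filter of value 1 and no bias, so that the output is the padded input itself. This is a sketch that assumes dlconv accepts a scalar weight as a 1-by-1 filter together with a padding of 2; the variable names are illustrative.

```matlab
A = [3 1 4; 1 5 9; 2 6 5];
X = dlarray(A,"SSCB");     % 3-by-3 image, one channel, one observation
w = 1;                     % 1-by-1 identity filter (assumed to be accepted as "SSCU" weights)

% Pad by 2 on every side with each padding style, then strip the dlarray wrapper.
Yzero = extractdata(dlconv(X,w,0,Padding=2,PaddingValue=0))
Yincl = extractdata(dlconv(X,w,0,Padding=2,PaddingValue="symmetric-include-edge"))
Yexcl = extractdata(dlconv(X,w,0,Padding=2,PaddingValue="symmetric-exclude-edge"))
Yrep  = extractdata(dlconv(X,w,0,Padding=2,PaddingValue="replicate"))
```

Each displayed 7-by-7 matrix can be compared against the corresponding row of the table above.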
Output Arguments

`Y` — Convolved feature map

dlarray

Convolved feature map, returned as a dlarray with the same underlying data type as X.

If the input data X is a formatted dlarray, then Y has the same format as X. If the input data is not a formatted dlarray, then Y is an unformatted dlarray with the same dimension order as the input data.

The size of the "C" (channel) dimension of Y depends on the task.

Task	Size of "C" Dimension
Convolution	Number of filters
Grouped convolution	Number of filters per group multiplied by the number of groups

More About

Deep Learning Convolution

The dlconv function applies sliding convolution filters to the input data. The dlconv function supports convolution in one, two, or three spatial dimensions or one time dimension. To learn more about deep learning convolution, see the definition of convolutional layer on the convolution2dLayer reference page.

Deep Learning Array Formats

Most deep learning networks and functions operate on different dimensions of the input data in different ways.

For example, an LSTM operation iterates over the time dimension of the input data, and a batch normalization operation normalizes over the batch dimension of the input data.

To provide input data with labeled dimensions or input data with additional layout information, you can use data formats.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

To create formatted input data, create a dlarray object and specify the format using the second argument.

To provide additional layout information with unformatted data, specify the formats using the DataFormat and WeightsFormat arguments.

For more information, see Deep Learning Data Formats.

Extended Capabilities

C/C++ Code Generation

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

Code generation supports only 1-D and 2-D spatial and spatio-temporal data. Convolving over 3-D spatial and spatio-temporal data format such as "SSS" or "SST" is not supported.
Code generation supports only channel-wise (depth-wise) separable convolution and regular convolution. Both NumChannelsPerGroup and NumFiltersPerGroup must be equal to 1.
The input must have the single underlying data type.
The convolution dimensions must be fixed size.
The dimension that corresponds to the channel in the input must be fixed size.
The Stride, DilationFactor, Padding, and PaddingValue name-value arguments must be compile-time constants.
PaddingValue must be 0.

GPU Code Generation

Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Usage notes and limitations:

Code generation supports only 1-D and 2-D spatial and spatio-temporal data. Convolving over 3-D spatial and spatio-temporal data format such as "SSS" or "SST" is not supported.
Code generation supports only channel-wise (depth-wise) separable convolution and regular convolution. Both NumChannelsPerGroup and NumFiltersPerGroup must be equal to 1.
The input must have the single underlying data type.
The convolution dimensions must be fixed size.
The dimension that corresponds to the channel in the input must be fixed size.
The Stride, DilationFactor, Padding, and PaddingValue name-value arguments must be compile-time constants.
PaddingValue must be 0.

GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

The dlconv function supports GPU array input with these usage notes and limitations:

When at least one of the following input arguments is a gpuArray or a dlarray with underlying data of type gpuArray, this function runs on the GPU.
- X
- weights
- bias
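A minimal sketch of GPU execution follows. It moves only X to the GPU, which according to the list above is enough for the function to run there, and it falls back to the CPU when no supported GPU is available. The canUseGPU guard, the sizes, and the single precision choice are assumptions made for illustration.

```matlab
data = rand(28,28,3,16,"single");
if canUseGPU
    data = gpuArray(data);             % requires Parallel Computing Toolbox and a supported GPU
end
X = dlarray(data,"SSCB");

weights = rand(3,3,3,8,"single");      % left on the CPU; X alone triggers GPU execution
bias = zeros(1,8,"single");

Y = dlconv(X,weights,bias);

onGPU = isgpuarray(extractdata(Y))     % true only when the computation ran on the GPU
sz = size(Y)                           % 26 26 8 16
```

Use gather(extractdata(Y)) to bring the result back to host memory when needed.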

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Version History

Introduced in R2019b
