maxpool - Pool data to maximum value - MATLAB
Pool data to maximum value
Syntax
Description
The maximum pooling operation performs downsampling by dividing the input into pooling regions and computing the maximum value of each region.
The maxpool function applies the maximum pooling operation to dlarray data. Using dlarray objects makes working with high-dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.
Y = maxpool(X,poolsize)
applies the maximum pooling operation to the formatted dlarray object X. The function downsamples the input by dividing it into regions defined by poolsize and calculating the maximum value of the data in each region. The output Y is a formatted dlarray with the same dimension format as X.
The function, by default, pools over up to three dimensions of X labeled "S" (spatial). To pool over dimensions labeled "T" (time), specify a pooling region with a "T" dimension using the PoolFormat option.
For unformatted input data, use the 'DataFormat' option.
[Y,indx,inputSize] = maxpool(X,poolsize)
also returns the linear indices of the maximum value within each pooled region and the size of the input feature map X for use with the maxunpool function.
Y = maxpool(X,'global')
computes the global maximum over the spatial dimensions of the input X. This syntax is equivalent to setting poolsize in the previous syntaxes to the size of the 'S' dimensions of X.
___ = maxpool(___,'DataFormat',FMT)
applies the maximum pooling operation to the unformatted dlarray object X with the format specified by FMT, using any of the previous syntaxes. The output Y is an unformatted dlarray object with dimensions in the same order as X. For example, 'DataFormat','SSCB' specifies data for 2-D maximum pooling with format 'SSCB' (spatial, spatial, channel, batch).
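As a minimal sketch of the 'DataFormat' syntax, the snippet below pools an unformatted dlarray. The random data and variable names are illustrative assumptions, and the expected size relies on the default stride of 1 with no padding.

```matlab
% Unformatted dlarray: no dimension labels are attached to the data.
X = dlarray(rand(28,28,3,128));

% Describe the layout at call time: spatial, spatial, channel, batch.
Y = maxpool(X,[2 2],'DataFormat','SSCB');

disp(isempty(dims(Y)))   % the output carries no format either
disp(size(Y))            % 27 27 3 128 with 2-by-2 regions and stride 1
```

Because the output is unformatted, any later call that depends on dimension labels needs the 'DataFormat' option again.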
___ = maxpool(___,Name,Value)
specifies options using one or more name-value pair arguments. For example, 'PoolFormat','T' specifies a pooling region for 1-D pooling with format 'T' (time).
Examples
Perform 2-D Maximum Pooling
Create a formatted dlarray object containing a batch of 128 28-by-28 images with 3 channels. Specify the format 'SSCB' (spatial, spatial, channel, batch).
miniBatchSize = 128;
inputSize = [28 28];
numChannels = 3;
X = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
dlX = dlarray(X,'SSCB');
View the size and format of the input data.
Apply 2-D maximum pooling with 2-by-2 pooling windows using the maxpool function.
poolSize = [2 2];
dlY = maxpool(dlX,poolSize);
View the size and format of the output.
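The display commands for these viewing steps do not appear above. A minimal sketch, assuming the size and dims functions for dlarray objects and repeating the setup so that the snippet runs on its own:

```matlab
% Same setup as the example: 128 images of size 28-by-28 with 3 channels.
dlX = dlarray(rand(28,28,3,128),'SSCB');
dlY = maxpool(dlX,[2 2]);

% size reports the array size; dims reports the dimension labels.
size(dlX)   % 28 28 3 128
dims(dlX)   % 'SSCB'
size(dlY)   % 27 27 3 128 (2-by-2 regions, default stride of 1)
dims(dlY)   % 'SSCB'
```

The spatial size shrinks by only one element per dimension because the default stride is 1, so neighboring pooling regions overlap.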
Perform 2-D Global Maximum Pooling
Create a formatted dlarray object containing a batch of 128 28-by-28 images with 3 channels. Specify the format 'SSCB' (spatial, spatial, channel, batch).
miniBatchSize = 128;
inputSize = [28 28];
numChannels = 3;
X = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
dlX = dlarray(X,'SSCB');
View the size and format of the input data.
Apply 2-D global maximum pooling using the maxpool function by specifying the 'global' option.
dlY = maxpool(dlX,'global');
View the size and format of the output.
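One way to inspect the result, sketched here with illustrative variable names, is to display the size and format and then cross-check the values against a direct reduction over the two spatial dimensions using the built-in max function:

```matlab
dlX = dlarray(rand(28,28,3,128),'SSCB');
dlY = maxpool(dlX,'global');

size(dlY)   % 1 1 3 128: each spatial dimension collapses to one element
dims(dlY)   % 'SSCB'

% Reference result: maximum over dimensions 1 and 2 of the raw data.
ref = max(extractdata(dlX),[],[1 2]);
isequal(extractdata(dlY),ref)   % true when the two results agree
```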
Perform 1-D Maximum Pooling
Create a formatted dlarray object containing a batch of 128 sequences of length 100 with 12 channels. Specify the format 'CBT' (channel, batch, time).
miniBatchSize = 128;
sequenceLength = 100;
numChannels = 12;
X = rand(numChannels,miniBatchSize,sequenceLength);
dlX = dlarray(X,'CBT');
View the size and format of the input data.
Apply 1-D maximum pooling with pooling regions of size 2 and a stride of 2 using the maxpool function by specifying the 'PoolFormat' and 'Stride' options.
poolSize = 2;
dlY = maxpool(dlX,poolSize,'PoolFormat','T','Stride',2);
View the size and format of the output.
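A minimal sketch of this viewing step, with the setup repeated so that it runs on its own. The reference computation is an illustrative assumption about how to check the result by hand: with regions of size 2 and a stride of 2, each output time step is the larger of two consecutive input time steps.

```matlab
dlX = dlarray(rand(12,128,100),'CBT');
dlY = maxpool(dlX,2,'PoolFormat','T','Stride',2);

size(dlY)   % 12 128 50: the time dimension is halved
dims(dlY)   % 'CBT'

% Hand-computed reference: elementwise max of odd and even time steps.
Xd = extractdata(dlX);
ref = max(Xd(:,:,1:2:end),Xd(:,:,2:2:end));
isequal(extractdata(dlY),ref)   % true when the two results agree
```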
Unpool 2-D Maximum Pooled Data
Create a formatted dlarray object containing a batch of 128 28-by-28 images with 3 channels. Specify the format 'SSCB' (spatial, spatial, channel, batch).
miniBatchSize = 128;
inputSize = [28 28];
numChannels = 3;
X = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
dlX = dlarray(X,'SSCB');
View the size and format of the input data.
Pool the data to maximum values over pooling regions of size 2 using a stride of 2.
[dlY,indx,dataSize] = maxpool(dlX,2,'Stride',2);
View the size and format of the pooled data.
View the data size.
dataSize = 1×4
28 28 3 128
Unpool the data using the indices and data size from the maxpool operation.
dlY = maxunpool(dlY,indx,dataSize);
View the size and format of the unpooled data.
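The round trip can be sketched end to end as follows. The variable name dlZ is an illustrative choice, and the nonzero count check assumes that maxunpool fills every position other than the recorded maxima with zeros.

```matlab
dlX = dlarray(rand(28,28,3,128),'SSCB');

% Pool with 2-by-2 regions and a stride of 2, keeping the unpooling outputs.
[dlY,indx,dataSize] = maxpool(dlX,2,'Stride',2);
size(dlY)   % 14 14 3 128

% Unpool back to the original size.
dlZ = maxunpool(dlY,indx,dataSize);
size(dlZ)   % 28 28 3 128
dims(dlZ)   % 'SSCB'

% One nonzero entry per pooled value is expected (rand never returns 0).
nnz(extractdata(dlZ)) == numel(extractdata(dlY))
```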
Unpool 1-D Maximum Pooled Data
Create a formatted dlarray object containing a batch of 128 sequences of length 100 with 12 channels. Specify the format 'CBT' (channel, batch, time).
miniBatchSize = 128;
sequenceLength = 100;
numChannels = 12;
X = rand(numChannels,miniBatchSize,sequenceLength);
dlX = dlarray(X,'CBT');
View the size and format of the input data.
Apply 1-D maximum pooling with pooling regions of size 2 and a stride of 2 using the maxpool function by specifying the 'PoolFormat' and 'Stride' options.
poolSize = 2;
[dlY,indx,dataSize] = maxpool(dlX,poolSize,'PoolFormat','T','Stride',2);
View the size and format of the output.
Unpool the data using the indices and data size from the maxpool operation.
dlY = maxunpool(dlY,indx,dataSize);
View the size and format of the unpooled data.
Input Arguments
X - Input data
dlarray
Input data, specified as a formatted or unformatted dlarray object.
If X is an unformatted dlarray, then you must specify the format using the DataFormat option.
The function, by default, pools over up to three dimensions of X labeled "S" (spatial). To pool over dimensions labeled "T" (time), specify a pooling region with a "T" dimension using the PoolFormat option.
poolsize - Size of pooling regions
positive integer | vector of positive integers
Size of the pooling regions, specified as a numeric scalar or numeric vector.
To pool using a pooling region with edges of the same size, specify poolsize as a scalar. The pooling regions have the same size along all dimensions specified by 'PoolFormat'.
To pool using a pooling region with edges of different sizes, specify poolsize as a vector, where poolsize(i) is the size of the corresponding dimension in 'PoolFormat'.
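The scalar and vector forms can be compared with a short sketch. The data and variable names are illustrative, and the sizes in the comments assume the default stride of 1 with no padding.

```matlab
dlX = dlarray(rand(28,28,3,128),'SSCB');

% A scalar poolsize applies the same edge length to every pooling dimension.
dlA = maxpool(dlX,3);
dlB = maxpool(dlX,[3 3]);
isequal(extractdata(dlA),extractdata(dlB))   % true: both use 3-by-3 regions

% A vector poolsize sets one edge length per pooling dimension.
dlC = maxpool(dlX,[2 4]);
size(dlC)   % 27 25 3 128: 28-2+1 rows and 28-4+1 columns
```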
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: 'Stride',2 specifies the stride of the pooling regions as 2.
DataFormat - Description of data dimensions
character vector | string scalar
Description of the data dimensions, specified as a character vector or string scalar.
A data format is a string of characters, where each character describes the type of the corresponding data dimension.
The characters are:
- "S": Spatial
- "C": Channel
- "B": Batch
- "T": Time
- "U": Unspecified
For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).
You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" once each, at most. The software ignores singleton trailing "U" dimensions after the second dimension.
If the input data is not a formatted dlarray object, then you must specify the DataFormat option.
For more information, see Deep Learning Data Formats.
Data Types: char | string
PoolFormat - Description of pooling dimensions
character vector | string scalar
Description of pooling dimensions, specified as a character vector or string scalar that provides a label for each dimension of the pooling region.
The default value of PoolFormat depends on the task:
Task | Default |
---|---|
1-D pooling | "S" (spatial) |
2-D pooling | "SS" (spatial, spatial) |
3-D pooling | "SSS" (spatial, spatial, spatial) |
The format must have either no "S" (spatial) dimensions, or as many "S" (spatial) dimensions as the input data.
The function, by default, pools over up to three dimensions of X labeled "S" (spatial). To pool over dimensions labeled "T" (time), specify a pooling region with a "T" dimension using the PoolFormat option.
For more information, see Deep Learning Data Formats.
Stride - Step size for traversing input data
1 (default) | numeric scalar | numeric vector
Step size for traversing the input data, specified as the comma-separated pair consisting of 'Stride' and a numeric scalar or numeric vector. If you specify 'Stride' as a scalar, the same value is used for all spatial dimensions. If you specify 'Stride' as a vector of the same size as the number of spatial dimensions of the input data, the vector values are used for the corresponding spatial dimensions.
The default value of 'Stride' is 1. If 'Stride' is less than poolsize in any dimension, then the pooling regions overlap.
The Stride parameter is not supported for global pooling using the 'global' option.
Example: 'Stride',3
Data Types: single | double
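The effect of the stride on overlap and output size can be sketched as follows. The helper expected is a hypothetical name, and its formula floor((n - k)/s) + 1 is the usual unpadded pooling arithmetic, stated here as an assumption rather than taken from this page.

```matlab
dlX = dlarray(rand(28,28,3,128),'SSCB');
poolSize = 4;

% Stride smaller than the pool size: neighboring regions overlap.
dlOverlap = maxpool(dlX,poolSize,'Stride',2);

% Stride equal to the pool size: regions tile the input without overlap.
dlTiled = maxpool(dlX,poolSize,'Stride',4);

% Assumed output size per spatial dimension with no padding
% (n = input size, k = pool size, s = stride).
expected = @(n,k,s) floor((n - k)/s) + 1;

[size(dlOverlap,1) expected(28,4,2)]   % 13 13
[size(dlTiled,1)   expected(28,4,4)]   % 7 7
```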
Padding - Size of padding applied to edges of data
0 (default) | 'same' | numeric scalar | numeric vector | numeric matrix
Size of padding applied to edges of data, specified as the comma-separated pair consisting of 'Padding' and one of the following:
- 'same': Padding size is set so that the output size is the same as the input size when the stride is 1. More generally, the output size of each spatial dimension is ceil(inputSize/stride), where inputSize is the size of the input along a spatial dimension.
- Numeric scalar: The same amount of padding is applied to both ends of all spatial dimensions.
- Numeric vector: A different amount of padding is applied along each spatial dimension. Use a vector of size d, where d is the number of spatial dimensions of the input data. The ith element of the vector specifies the size of padding applied to the start and the end along the ith spatial dimension.
- Numeric matrix: A different amount of padding is applied to the start and end of each spatial dimension. Use a matrix of size 2-by-d, where d is the number of spatial dimensions of the input data. The element (1,d) specifies the size of padding applied to the start of spatial dimension d. The element (2,d) specifies the size of padding applied to the end of spatial dimension d. For example, in 2-D, the format is [top, left; bottom, right].
The 'Padding' parameter is not supported for global pooling using the 'global' option.
Example: 'Padding','same'
Data Types: single | double
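The padding forms described above can be sketched side by side. The data and variable names are illustrative. The 'same' size follows the ceil(inputSize/stride) rule stated above, while the sizes for the scalar and matrix cases assume the usual arithmetic in which padding adds to the input size before pooling.

```matlab
dlX = dlarray(rand(28,28,3,128),'SSCB');

% 'same' padding: each spatial size becomes ceil(28/3) = 10.
dlSame = maxpool(dlX,3,'Stride',3,'Padding','same');
size(dlSame)     % 10 10 3 128

% Scalar padding: one element on every edge, so 28 + 2 - 3 + 1 = 28.
dlScalar = maxpool(dlX,3,'Padding',1);
size(dlScalar)   % 28 28 3 128

% Matrix padding [top, left; bottom, right]: pad only the bottom and right.
dlMatrix = maxpool(dlX,2,'Padding',[0 0; 1 1]);
size(dlMatrix)   % 28 28 3 128
```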
Output Arguments
Y - Pooled data
dlarray
Pooled data, returned as a dlarray with the same underlying data type as X.
If the input data X is a formatted dlarray, then Y has the same format as X. If the input data is not a formatted dlarray, then Y is an unformatted dlarray with the same dimension order as the input data.
indx - Indices of maximum values
dlarray
Indices of maximum values in each pooled region, returned as a dlarray. Each value in indx represents the location of the corresponding maximum value in Y, given as a linear index of the values in X.
If X is a formatted dlarray, indx has the same size and format as the output Y.
If X is not a formatted dlarray, indx is an unformatted dlarray. In that case, indx is returned with the following dimension order: all 'S' dimensions, followed by 'C', 'B', and 'T' dimensions, then all 'U' dimensions. The size of indx matches the size of Y when Y is permuted to match the previously stated dimension order.
Use the indx output with the maxunpool function to unpool the output of maxpool.
The indx output is not supported when using the 'global' option.
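The linear-index description can be checked with a short sketch. It assumes a formatted 'SSCB' input, whose dimensions are already in the order listed above, and it assumes that the extracted indices can be used directly to index the raw data. The variable names are illustrative.

```matlab
dlX = dlarray(rand(28,28,3,128),'SSCB');
[dlY,indx] = maxpool(dlX,2,'Stride',2);

% Strip the dlarray wrappers to work with plain numeric arrays.
Xd  = extractdata(dlX);
idx = extractdata(indx);

% Looking up X at the recorded linear indices should reproduce Y.
isequal(Xd(idx),extractdata(dlY))

% For formatted input, indx has the same size and format as Y.
isequal(size(indx),size(dlY))
dims(indx)   % 'SSCB'
```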
inputSize - Size of input feature map
numeric vector
Size of the input feature map, returned as a numeric vector.
Use the inputSize output with the maxunpool function to unpool the output of maxpool.
The inputSize output is not supported when using the 'global' option.
Algorithms
Deep Learning Array Formats
Most deep learning networks and functions operate on different dimensions of the input data in different ways.
For example, an LSTM operation iterates over the time dimension of the input data, and a batch normalization operation normalizes over the batch dimension of the input data.
To provide input data with labeled dimensions or input data with additional layout information, you can use data formats.
A data format is a string of characters, where each character describes the type of the corresponding data dimension.
The characters are:
- "S": Spatial
- "C": Channel
- "B": Batch
- "T": Time
- "U": Unspecified
For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).
To create formatted input data, create a dlarray object and specify the format using the second argument.
To provide additional layout information with unformatted data, specify the formats using the DataFormat and PoolFormat arguments.
For more information, see Deep Learning Data Formats.
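The two ways of supplying layout information can be compared in a short sketch. The sequence data and variable names are illustrative assumptions. The first call uses a formatted dlarray and the second passes the same labels through 'DataFormat'.

```matlab
X = rand(12,128,100);   % channels-by-observations-by-time steps

% Option 1: attach the format when creating the dlarray.
dlX = dlarray(X,'CBT');
dims(dlX)               % 'CBT'
Y1 = maxpool(dlX,2,'PoolFormat','T','Stride',2);

% Option 2: keep the data unformatted and describe it at call time.
Y2 = maxpool(dlarray(X),2,'DataFormat','CBT','PoolFormat','T','Stride',2);

dims(Y1)                % 'CBT'
isempty(dims(Y2))       % true: unformatted input gives unformatted output
isequal(extractdata(Y1),extractdata(Y2))   % same pooled values
```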
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- Code generation supports only 1-D and 2-D spatial and spatio-temporal data. Convolving over 3-D spatial and spatio-temporal data formats such as "SSS" or "SST" is not supported.
- Code generation supports only channel-wise (depth-wise) separable convolution and regular convolution. Both NumChannelsPerGroup and NumFiltersPerGroup must be equal to 1.
- The input must have single underlying data type.
- The convolution dimensions must be fixed size.
- The dimension that corresponds to the channel in the input must be fixed size.
- The Stride, DilationFactor, Padding, and PaddingValue name-value pairs must be compile-time constants. PaddingValue must be 0.
GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.
Usage notes and limitations:
- Code generation supports only 1-D and 2-D spatial and spatio-temporal data. Convolving over 3-D spatial and spatio-temporal data formats such as "SSS" or "SST" is not supported.
- Code generation supports only channel-wise (depth-wise) separable convolution and regular convolution. Both NumChannelsPerGroup and NumFiltersPerGroup must be equal to 1.
- The input must have single underlying data type.
- The convolution dimensions must be fixed size.
- The dimension that corresponds to the channel in the input must be fixed size.
- The Stride, DilationFactor, Padding, and PaddingValue name-value pairs must be compile-time constants. PaddingValue must be 0.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
The maxpool function supports GPU array input with these usage notes and limitations:
- When the input argument X is a dlarray with underlying data of type gpuArray, this function runs on the GPU.
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
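A minimal sketch of GPU use, assuming that Parallel Computing Toolbox and the canUseGPU function are available. The snippet falls back to the CPU when no supported GPU is found, and the variable names are illustrative.

```matlab
X = rand(28,28,3,128,'single');

% Move the data to the GPU only when a supported device is available.
if canUseGPU
    X = gpuArray(X);
end

dlX = dlarray(X,'SSCB');
dlY = maxpool(dlX,2,'Stride',2);

% The underlying data stays where the input lives:
% 'gpuArray' when running on the GPU, 'single' otherwise.
disp(class(extractdata(dlY)))
```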
Version History
Introduced in R2019b
R2020a: maxpool indices output argument changes shape and data type
Starting in R2020a, the data type and shape of the indices output argument of the maxpool function are changed. The maxpool function outputs the indices of the maximum values as a dlarray with the same shape and format as the pooled data, instead of a numeric vector.
The indices output of maxpool remains compatible with the indices input of maxunpool. The maxunpool function accepts the indices of the maximum values as a dlarray with the same shape and format as the input data. To prevent errors, use only the indices output of the maxpool function as the indices input to the maxunpool function.
To reproduce the previous behavior and obtain the indices output as a numeric vector, use the following code:
[Y,indx,inputSize] = maxpool(X,poolsize);
indx = extractdata(indx);
indx = reshape(indx,[],1);