A feature input layer inputs feature data to a neural network and applies data normalization. Use this layer when you have a data set of numeric scalars representing features (data without spatial or time dimensions).
A point cloud input layer inputs 3-D point clouds to a network and applies data normalization. You can also input point cloud data such as 2-D lidar scans.
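For example, a minimal sketch of both input layers; the feature count and point cloud size are arbitrary placeholders, and pointCloudInputLayer requires Lidar Toolbox:

    % Input layer for 21 numeric features, with z-score normalization.
    featLayer = featureInputLayer(21,Normalization="zscore");

    % Input layer for a 64-by-64 organized point cloud with x, y, and z
    % coordinates (requires Lidar Toolbox).
    pcLayer = pointCloudInputLayer([64 64 3]);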
A 2-D grouped convolutional layer separates the input channels into groups and applies sliding convolutional filters. Use grouped convolutional layers for channel-wise separable (also known as depth-wise separable) convolution.
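For example, a channel-wise separable convolution is a grouped convolution with one group per input channel, followed by a 1-by-1 (pointwise) convolution; the filter size and output channel count here are arbitrary:

    % Depth-wise stage: one 3-by-3 filter per input channel.
    % Pointwise stage: a 1-by-1 convolution that mixes the channels.
    layers = [
        groupedConvolution2dLayer(3,1,"channel-wise",Padding="same")
        convolution2dLayer(1,64)
        ];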
An LSTM projected layer is an RNN layer that learns long-term dependencies between time steps in time-series and sequence data using projected learnable weights.
A bidirectional LSTM (BiLSTM) layer is an RNN layer that learns bidirectional long-term dependencies between time steps of time-series or sequence data. These dependencies can be useful when you want the RNN to learn from the complete time series at each time step.
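For example, a minimal sequence-classification sketch that stacks both layer types; the input, hidden, and projector sizes are arbitrary:

    layers = [
        sequenceInputLayer(12)
        lstmProjectedLayer(100,16,8)        % output and input projector sizes
        bilstmLayer(100,OutputMode="last")  % emit only the final time step
        fullyConnectedLayer(5)
        softmaxLayer
        ];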
A clipped ReLU layer performs a threshold operation, where any input value less than zero is set to zero and any value above the clipping ceiling is set to that clipping ceiling.
A PReLU layer performs a threshold operation, where for each channel, any input value less than zero is multiplied by a scalar learned at training time.
An SReLU layer performs a threshold operation, where for each channel, the layer scales values outside an interval. The interval thresholds and scaling factors are learnable parameters.
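For example, assuming a release that provides preluLayer and sreluLayer as built-in layers (otherwise, define them as custom layers):

    % Clipped ReLU with a ceiling of 10 (the ceiling value is arbitrary).
    layer1 = clippedReluLayer(10);

    % PReLU and SReLU layers learn their scaling parameters during training.
    layer2 = preluLayer;
    layer3 = sreluLayer;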
A batch normalization layer normalizes a mini-batch of data across all observations for each channel independently. To speed up training of the convolutional neural network and reduce the sensitivity to network initialization, use batch normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
A group normalization layer normalizes a mini-batch of data across grouped subsets of channels for each observation independently. To speed up training of the convolutional neural network and reduce the sensitivity to network initialization, use group normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
An instance normalization layer normalizes a mini-batch of data across each channel for each observation independently. To improve the convergence of training the convolutional neural network and reduce the sensitivity to network hyperparameters, use instance normalization layers between convolutional layers and nonlinearities, such as ReLU layers.
A layer normalization layer normalizes a mini-batch of data across all channels for each observation independently. To speed up training of recurrent and multilayer perceptron neural networks and reduce the sensitivity to network initialization, use layer normalization layers after the learnable layers, such as LSTM and fully connected layers.
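For example, a sketch of the typical placement, with normalization between each convolutional layer and its nonlinearity; the filter and group counts are arbitrary:

    layers = [
        convolution2dLayer(3,32,Padding="same")
        batchNormalizationLayer        % per channel, across the mini-batch
        reluLayer
        convolution2dLayer(3,32,Padding="same")
        groupNormalizationLayer(8)     % 8 channel groups per observation
        reluLayer
        ];

For recurrent or multilayer perceptron networks, use layerNormalizationLayer after the LSTM or fully connected layers instead; for per-observation, per-channel statistics, use instanceNormalizationLayer.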
An identity layer is a layer whose output is identical to its input. You can use an identity layer to create a skip connection, which allows the input to skip one or more layers in the main branch of a neural network. For more information about skip connections, see More About.
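For example, a minimal skip-connection sketch; it assumes a recent release that provides identityLayer and incremental dlnetwork editing:

    net = dlnetwork;
    net = addLayers(net,[
        imageInputLayer([28 28 1],Normalization="none",Name="in")
        convolution2dLayer(3,1,Padding="same",Name="conv")
        additionLayer(2,Name="add")
        ]);
    net = addLayers(net,identityLayer(Name="skip"));
    net = connectLayers(net,"in","skip");      % input bypasses the main branch
    net = connectLayers(net,"skip","add/in2");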
A 3-D average pooling layer performs downsampling by dividing three-dimensional input into cuboidal pooling regions, then computing the average value of each region.
A 2-D adaptive average pooling layer performs downsampling to give you the desired output size by dividing the input into rectangular pooling regions, then computing the average of each region.
A 3-D max pooling layer performs downsampling by dividing three-dimensional input into cuboidal pooling regions, then computing the maximum of each region.
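For example, a sketch of the pooling constructors; the pool sizes, strides, and target output size are arbitrary, and adaptiveAveragePooling2dLayer assumes a release that provides it:

    % Pool over 2-by-2-by-2 cuboidal regions with stride 2.
    avgPool3 = averagePooling3dLayer(2,Stride=2);
    maxPool3 = maxPooling3dLayer(2,Stride=2);

    % Adaptive pooling sizes its regions to produce a 7-by-7 output.
    adaptPool = adaptiveAveragePooling2dLayer(7);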
A concatenation layer takes inputs and concatenates them along a specified dimension. The inputs must have the same size in all dimensions except the concatenation dimension.
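For example, a layer that concatenates two inputs along the channel dimension (dimension 3 for 2-D image data):

    layer = concatenationLayer(3,2,Name="concat");

Connect each branch to the layer with connectLayers, using the input names "concat/in1" and "concat/in2".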
An ROI max pooling layer outputs fixed-size feature maps for every rectangular ROI within the input feature map. Use this layer to create a Fast or Faster R-CNN object detection network.
An ROI align layer outputs fixed-size feature maps for every rectangular ROI within an input feature map. Use this layer to create a Mask R-CNN network.
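For example, both layers come from Computer Vision Toolbox and take the fixed output size of the per-ROI feature map; the 7-by-7 size follows common R-CNN practice but is otherwise arbitrary:

    roiPool  = roiMaxPooling2dLayer([7 7],Name="roiPool");   % Fast/Faster R-CNN
    roiAlign = roiAlignLayer([7 7],Name="roiAlign");         % Mask R-CNN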
A transform layer of the you only look once version 2 (YOLO v2) network transforms the bounding box predictions of the last convolution layer in the network to fall within the bounds of the ground truth. Use the transform layer to improve the stability of the YOLO v2 network.
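For example, a transform layer for a detector with four anchor boxes (an arbitrary choice); the layer is in Computer Vision Toolbox:

    layer = yolov2TransformLayer(4,Name="transform");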
A space to depth layer permutes the spatial blocks of the input into the depth dimension. Use this layer when you need to combine feature maps of different size without discarding any feature data.
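For example, a layer that moves each 2-by-2 spatial block into the channel dimension, halving the height and width and quadrupling the channel count:

    layer = spaceToDepthLayer([2 2],Name="space2depth");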