gru - Gated recurrent unit - MATLAB
Gated recurrent unit
Since R2020a
Syntax
Description
The gated recurrent unit (GRU) operation allows a network to learn dependencies between time steps in time series and sequence data.
Note
This function applies the deep learning GRU operation to dlarray data. If you want to apply a GRU operation within a dlnetwork object, use gruLayer.
Y = gru(X,H0,weights,recurrentWeights,bias) applies a gated recurrent unit (GRU) calculation to input X using the initial hidden state H0, and parameters weights, recurrentWeights, and bias. The input X must be a formatted dlarray. The output Y is a formatted dlarray with the same dimension format as X, except for any "S" dimensions.
The gru function updates the hidden state using the hyperbolic tangent function (tanh) as the state activation function. The gru function uses the sigmoid function, given by σ(x) = (1 + e^−x)^−1, as the gate activation function.
[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias) also returns the hidden state after the GRU operation.
___ = gru(X,H0,weights,recurrentWeights,bias,DataFormat=FMT) also specifies the dimension format FMT when X is not a formatted dlarray. The output Y is an unformatted dlarray with the same dimension order as X, except for any "S" dimensions.
___ = gru(X,H0,weights,recurrentWeights,bias,Name=Value) specifies additional options using one or more name-value arguments.
Examples
Apply GRU Operation to Sequence Data
Perform a GRU operation using 100 hidden units.
Create the input sequence data as 32 observations with ten channels and a sequence length of 64.
numFeatures = 10; numObservations = 32; sequenceLength = 64;
X = randn(numFeatures,numObservations,sequenceLength); X = dlarray(X,"CBT");
Create the initial hidden state with 100 hidden units. Use the same initial hidden state for all observations.
numHiddenUnits = 100; H0 = zeros(numHiddenUnits,1);
Create the learnable parameters for the GRU operation.
weights = dlarray(randn(3*numHiddenUnits,numFeatures)); recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits)); bias = dlarray(randn(3*numHiddenUnits,1));
Perform the GRU calculation.
[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias);
View the size and dimension format of the output.
View the size of the hidden state.
You can use the hidden state to keep track of the state of the GRU operation and input further sequential data.
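The two "View the ..." steps above can be sketched in code. size and dims are standard functions for dlarray data, and the sizes in the comments follow from the dimensions chosen earlier in this example:

```matlab
% View the size and dimension format of the output.
size(Y)   % 100-by-32-by-64 (hidden units, observations, time steps)
dims(Y)   % 'CBT'

% View the size of the hidden state.
size(hiddenState)   % 100-by-32: one hidden state vector per observation
```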
Input Arguments
X
— Input data
dlarray | numeric array
Input data, specified as a formatted dlarray, an unformatted dlarray, or a numeric array. When X is not a formatted dlarray, you must specify the dimension label format using the DataFormat name-value argument. If X is a numeric array, at least one of H0, weights, recurrentWeights, or bias must be a dlarray.
X must contain a sequence dimension labeled "T". If X has any spatial dimensions labeled "S", they are flattened into the "C" (channel) dimension. If X does not have a channel dimension, then one is added. If X has any unspecified dimensions labeled "U", they must be singleton.
Data Types: single | double
H0
— Initial hidden state vector
dlarray | numeric array
Initial hidden state vector, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.
If H0 is a formatted dlarray, it must contain a channel dimension labeled "C" and, optionally, a batch dimension labeled "B" with the same size as the "B" dimension of X. If H0 does not have a "B" dimension, the function uses the same hidden state vector for each observation in X.
If H0 is a formatted dlarray, then the size of the "C" dimension determines the number of hidden units. Otherwise, the size of the first dimension determines the number of hidden units.
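For instance, to give each observation its own initial state, you can make H0 a formatted dlarray with a "B" dimension. The sizes here are illustrative, matching the example earlier on this page:

```matlab
numHiddenUnits = 100;
numObservations = 32;

% One initial hidden state vector per observation, labeled "CB".
H0 = dlarray(zeros(numHiddenUnits,numObservations),"CB");
```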
Data Types: single | double
weights
— Weights
dlarray | numeric array
Weights, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.
Specify weights as a matrix of size 3*NumHiddenUnits-by-InputSize, where NumHiddenUnits is the size of the "C" dimension of H0, and InputSize is the size of the "C" dimension of X multiplied by the size of each "S" dimension of X, where present.
If weights is a formatted dlarray, it must contain a "C" dimension of size 3*NumHiddenUnits and a "U" dimension of size InputSize.
Data Types: single | double
recurrentWeights
— Recurrent weights
dlarray | numeric array
Recurrent weights, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.
Specify recurrentWeights as a matrix of size 3*NumHiddenUnits-by-NumHiddenUnits, where NumHiddenUnits is the size of the "C" dimension of H0.
If recurrentWeights is a formatted dlarray, it must contain a "C" dimension of size 3*NumHiddenUnits and a "U" dimension of size NumHiddenUnits.
Data Types: single | double
bias
— Bias
dlarray vector | numeric vector
Bias, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.
Specify bias as a vector of length 3*NumHiddenUnits, where NumHiddenUnits is the size of the "C" dimension of H0.
If bias is a formatted dlarray, the nonsingleton dimension must be labeled "C".
Data Types: single | double
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: Y = gru(X,H0,weights,recurrentWeights,bias,DataFormat="CTB") applies the GRU operation and specifies that the data has format "CTB" (channel, time, batch).
DataFormat
— Description of data dimensions
character vector | string scalar
Description of the data dimensions, specified as a character vector or string scalar.
A data format is a string of characters, where each character describes the type of the corresponding data dimension.
The characters are:
"S"
— Spatial"C"
— Channel"B"
— Batch"T"
— Time"U"
— Unspecified
For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).
You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" at most once each. The software ignores singleton trailing "U" dimensions after the second dimension.
If the input data is not a formatted dlarray object, then you must specify the DataFormat option.
For more information, see Deep Learning Data Formats.
Data Types: char | string
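As a sketch of this option in use (the sizes here are illustrative assumptions, not part of the reference page), you can pass a plain numeric array for X and describe its layout with DataFormat instead of formatting it:

```matlab
numFeatures = 10; numObservations = 32; sequenceLength = 64;
numHiddenUnits = 100;

X = randn(numFeatures,numObservations,sequenceLength);  % plain numeric array
H0 = dlarray(zeros(numHiddenUnits,1));                  % at least one input must be a dlarray
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% Describe the layout of X (channel, batch, time) rather than labeling X itself.
Y = gru(X,H0,weights,recurrentWeights,bias,DataFormat="CBT");
```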
ResetGateMode
— Reset gate mode
"after-multiplication"
(default) | "before-multiplication"
| "recurrent-bias-after-multiplication"
Since R2023a
Reset gate mode, specified as one of these values:
"after-multiplication"
— Apply the reset gate after matrix multiplication. This option is cuDNN compatible."before-multiplication"
— Apply the reset gate before matrix multiplication."recurrent-bias-after-multiplication"
— Apply the reset gate after matrix multiplication and use an additional set of bias terms for the recurrent weights.
For more information about the reset gate calculations, see Gated Recurrent Unit Layer on the gruLayer reference page.
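For example, a minimal sketch of selecting a nondefault reset gate mode, using inputs shaped as in the example earlier on this page:

```matlab
% Apply the reset gate before the recurrent matrix multiplication.
[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias, ...
    ResetGateMode="before-multiplication");
```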
StateActivationFunction
— State activation function
"tanh"
(default) | "softsign"
| "relu"
Since R2024a
Activation function to update the hidden state, specified as one of these values:
"tanh"
— Use the hyperbolic tangent function (tanh)."softsign"
— Use the softsign function, softsign(x)=x1+|x|."relu"
(since R2024b) — Use the rectified linear unit (ReLU) function ReLU(x)={x,x>00,x≤0.
The software uses this option as the function σs in the calculations to update the hidden state.
For more information, see Gated Recurrent Unit Layer on the gruLayer reference page.
GateActivationFunction
— Gate activation function
"sigmoid"
(default) | "hard-sigmoid"
Since R2024a
Activation function to apply to the gates, specified as one of these values:
"sigmoid"
— Use the sigmoid function, σ(x)=(1+e−x)−1."hard-sigmoid"
— Use the hard sigmoid function,
The software uses this option as the function σg in the calculations for the layer gates.
For more information, see Gated Recurrent Unit Layer on the gruLayer reference page.
Output Arguments
Y
— GRU output
dlarray
GRU output, returned as a dlarray. The output Y has the same underlying data type as the input X.
If the input data X is a formatted dlarray, Y has the same dimension format as X, except for any "S" dimensions. If the input data is not a formatted dlarray, Y is an unformatted dlarray with the same dimension order as the input data.
The size of the "C" dimension of Y is the same as the number of hidden units, specified by the size of the "C" dimension of H0.
hiddenState
— Hidden state vector
dlarray | numeric array
Hidden state vector for each observation, returned as a dlarray or a numeric array with the same data type as H0.
If the input H0 is a formatted dlarray, then the output hiddenState is a formatted dlarray with the format "CB".
More About
Gated Recurrent Unit
The GRU operation allows a network to learn dependencies between time steps in time series and sequence data. For more information, see Gated Recurrent Unit Layer on the gruLayer reference page.
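As a sketch, one common formulation of the per-time-step GRU update (this matches the cuDNN-style convention used by the default "after-multiplication" reset gate mode; W, R, and b denote the weights, recurrentWeights, and bias arguments split into reset (r), update (z), and candidate (h) blocks, σ_g is the gate activation function, and σ_s is the state activation function):

```latex
\begin{align*}
r_t &= \sigma_g\!\left(W_r x_t + R_r h_{t-1} + b_r\right) \\
z_t &= \sigma_g\!\left(W_z x_t + R_z h_{t-1} + b_z\right) \\
\tilde{h}_t &= \sigma_s\!\left(W_h x_t + r_t \odot (R_h h_{t-1}) + b_h\right) \\
h_t &= (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1}
\end{align*}
```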
References
[1] Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).
Extended Capabilities
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
The gru function supports GPU array input with these usage notes and limitations:
- When at least one of the following input arguments is a gpuArray or a dlarray with underlying data of type gpuArray, this function runs on the GPU:
  - X
  - H0
  - weights
  - recurrentWeights
  - bias
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2020a
R2024b: Specify the ReLU state activation function
To specify the ReLU state activation function, set the StateActivationFunction name-value argument to "relu".
R2023a: Specify reset gate mode
Specify the reset gate mode using the ResetGateMode name-value argument.