huber - Huber loss for regression tasks - MATLAB

Huber loss for regression tasks

Since R2021a

Syntax

loss = huber(Y,targets)
loss = huber(Y,targets,weights)
loss = huber(___,'DataFormat',FMT)
loss = huber(___,Name,Value)

Description

The Huber operation computes the Huber loss between network predictions and target values for regression tasks. When the 'TransitionPoint' option is 1, this is also known as smooth L1 loss.

The huber function calculates the Huber loss using dlarray data. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.
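As a minimal sketch of dimension labeling, the following creates a formatted dlarray and inspects its labels. The variable names X, dlX, fmt, and sz are illustrative only.

```matlab
% Illustrative data: 10 channels by 12 observations.
X = rand(10,12);

% Label the first dimension as channel ("C") and the second as batch ("B").
dlX = dlarray(X,'CB');

% Query the dimension labels and the size of the formatted array.
fmt = dims(dlX);
sz = size(dlX);
```

Because the labels travel with the data, functions such as huber can identify the batch dimension without an extra format argument.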

loss = huber(Y,targets) returns the Huber loss between the formatted dlarray object Y containing the predictions and the target values targets for regression tasks. The input Y is a formatted dlarray. The output loss is an unformatted dlarray scalar.

For unformatted input data, use the 'DataFormat' option.
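A minimal sketch of the unformatted case, assuming random illustrative data: Y is an unformatted dlarray, so the dimension labels are passed through the 'DataFormat' option instead.

```matlab
% Unformatted predictions: 10 responses by 12 observations.
Y = dlarray(rand(10,12));
targets = rand(10,12);

% Supply the labels explicitly because Y carries no format.
loss = huber(Y,targets,'DataFormat','CB');
```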

loss = huber(Y,targets,weights) applies weights to the calculated loss values. Use this syntax to weight the contributions of classes, observations, or regions of the input to the calculated loss values.

loss = huber(___,'DataFormat',FMT) also specifies the dimension format FMT when Y is not a formatted dlarray.

loss = huber(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, 'NormalizationFactor','all-elements' specifies to normalize the loss by dividing the reduced loss by the number of input elements.

Examples

Huber Loss

Create an array of predictions for 12 observations over 10 responses.

numResponses = 10;
numObservations = 12;

Y = rand(numResponses,numObservations);
dlY = dlarray(Y,'CB');

View the size and format of the predictions.

size(dlY)
dims(dlY)

Create an array of random targets.

targets = rand(numResponses,numObservations);

View the size of the targets.

size(targets)

Compute the Huber loss between the predictions and the targets.

loss = huber(dlY,targets)

loss = 1x1 dlarray

0.7374

Masked Huber Loss for Padded Sequences

Create arrays of predictions and targets for 12 sequences of varying lengths over 10 responses.

numResponses = 10;
numObservations = 12;
maxSequenceLength = 15;

sequenceLengths = randi(maxSequenceLength,[1 numObservations]);

Y = cell(numObservations,1);
targets = cell(numObservations,1);

for i = 1:numObservations
    Y{i} = rand(numResponses,sequenceLengths(i));
    targets{i} = rand(numResponses,sequenceLengths(i));
end

View the cell arrays of predictions and targets.

Y = 12×1 cell array
    {10x13 double}
    {10x14 double}
    {10x2 double}
    {10x14 double}
    {10x10 double}
    {10x2 double}
    {10x5 double}
    {10x9 double}
    {10x15 double}
    {10x15 double}
    {10x3 double}
    {10x15 double}

targets = 12×1 cell array
    {10x13 double}
    {10x14 double}
    {10x2 double}
    {10x14 double}
    {10x10 double}
    {10x2 double}
    {10x5 double}
    {10x9 double}
    {10x15 double}
    {10x15 double}
    {10x3 double}
    {10x15 double}

Pad the prediction and target sequences in the second dimension using the padsequences function and also return the corresponding mask.

[Y,mask] = padsequences(Y,2);
targets = padsequences(targets,2);

Convert the padded sequences to dlarray with format 'CTB' (channel, time, batch). Because formatted dlarray objects automatically sort the dimensions, keep the dimensions of the targets and mask consistent by also converting them to formatted dlarray objects with the same format.

dlY = dlarray(Y,'CTB');
targets = dlarray(targets,'CTB');
mask = dlarray(mask,'CTB');

View the sizes of the prediction scores, targets, and the mask.

size(dlY)
size(targets)
size(mask)

Compute the Huber loss between the predictions and the targets. To prevent the loss values calculated from padding from contributing to the loss, set the 'Mask' option to the mask returned by the padsequences function.

loss = huber(dlY,targets,'Mask',mask)

loss = 1x1 dlarray

8.1834

Input Arguments

Y — Predictions

dlarray object | numeric array

Predictions, specified as a formatted or unformatted dlarray object, or a numeric array. When Y is not a formatted dlarray, you must specify the dimension format using the DataFormat argument.

If Y is a numeric array, targets must be a dlarray object.

targets — Target responses

dlarray | numeric array

Target responses, specified as a formatted or unformatted dlarray or a numeric array.

The size of each dimension of targets must match the size of the corresponding dimension of Y.

If targets is a formatted dlarray, then its format must be the same as the format of Y, or the same as DataFormat if Y is unformatted.

If targets is an unformatted dlarray or a numeric array, then the function applies the format of Y or the value of DataFormat to targets.

Tip

Formatted dlarray objects automatically permute the dimensions of the underlying data to have the order "S" (spatial), "C" (channel), "B" (batch), "T" (time), then "U" (unspecified). To ensure that the dimensions of Y and targets are consistent, when Y is a formatted dlarray, also specify targets as a formatted dlarray.

weights — Weights

dlarray | numeric array

Weights, specified as a dlarray or a numeric array.

To specify response weights, specify a vector with a "C" (channel) dimension with size matching the "C" (channel) dimension of Y. Specify the "C" (channel) dimension of the response weights by using a formatted dlarray object or by using the WeightsFormat argument.

To specify observation weights, specify a vector with a "B" (batch) dimension with size matching the "B" (batch) dimension of Y. Specify the "B" (batch) dimension of the observation weights by using a formatted dlarray object or by using the WeightsFormat argument.

To specify weights for each element of the input independently, specify the weights as an array of the same size as Y. In this case, if weights is not a formatted dlarray object, then the function uses the same format as Y. Alternatively, specify the weights format using the WeightsFormat argument.
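As a minimal sketch of element-wise weighting, the following passes a weights array of the same size as Y. The data is random and purely illustrative. Because weights is a plain numeric array here, the function is expected to apply the format of Y to it, as described above.

```matlab
% Formatted predictions and targets: 10 responses by 12 observations.
dlY = dlarray(rand(10,12),'CB');
targets = dlarray(rand(10,12),'CB');

% One weight per element of Y (illustrative values).
weights = rand(10,12);

% Weighted Huber loss; the result is a scalar dlarray.
loss = huber(dlY,targets,weights);
```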

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'NormalizationFactor','all-elements' specifies to normalize the loss by dividing the reduced loss by the number of input elements.

TransitionPoint — Point where Huber loss transitions to a linear function

1 (default) | positive scalar

Point where Huber loss transitions from a quadratic function to a linear function, specified as the comma-separated pair consisting of 'TransitionPoint' and a positive scalar.

When 'TransitionPoint' is 1, this is also known as smooth L1 loss.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
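To illustrate how the transition point changes the loss, here is a minimal sketch in plain MATLAB, assuming the standard element-wise Huber definition (quadratic for residuals up to the transition point, linear beyond it). The helper huberElement is hypothetical and not part of the toolbox.

```matlab
% Hypothetical helper: element-wise Huber loss for residual r and
% transition point delta (standard definition, assumed here).
huberElement = @(r,delta) (abs(r) <= delta).*(0.5*r.^2) + ...
    (abs(r) > delta).*(delta*abs(r) - 0.5*delta^2);

r = [0.5 1 3];                      % sample residuals

lossDelta1 = huberElement(r,1);     % residual 3 falls in the linear region
lossDelta2 = huberElement(r,2);     % residual 3 is penalized more strongly
```

With a transition point of 1 the three residuals give 0.125, 0.5, and 2.5. With a transition point of 2 the last value becomes 4, because the quadratic region extends further before the loss turns linear.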

Mask — Mask indicating which elements to include for loss computation

dlarray | logical array | numeric array

Mask indicating which elements to include for loss computation, specified as a dlarray object, a logical array, or a numeric array with the same size as Y.

The function includes and excludes elements of the input data for loss computation when the corresponding value in the mask is 1 and 0, respectively.

If Mask is a formatted dlarray object, then its format must match that of Y. If Mask is not a formatted dlarray object, then the function uses the same format as Y.

If you specify the DataFormat argument, then the function also uses the specified format for the mask.

The size of each dimension of Mask must match the size of the corresponding dimension in Y. The default value is a logical array of ones.

Tip

Formatted dlarray objects automatically permute the dimensions of the underlying data to have this order: "S" (spatial), "C" (channel), "B" (batch), "T" (time), and "U" (unspecified). For example, dlarray objects automatically permute the dimensions of data with format "TSCSBS" to have format "SSSCBT".

To ensure that the dimensions of Y and the mask are consistent, when Y is a formatted dlarray, also specify the mask as a formatted dlarray.

Reduction — Loss value array reduction mode

"sum" (default) | "none"

Loss value array reduction mode, specified as "sum" or "none".

If the Reduction argument is "sum", then the function sums all elements in the array of loss values. In this case, the output loss is a scalar.

If the Reduction argument is "none", then the function does not reduce the array of loss values. In this case, the output loss is an unformatted dlarray object of the same size as Y.
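A minimal sketch comparing the two reduction modes on random illustrative data:

```matlab
% Formatted predictions and targets: 10 responses by 12 observations.
dlY = dlarray(rand(10,12),'CB');
targets = dlarray(rand(10,12),'CB');

% Default reduction ("sum") returns a scalar loss.
lossSum = huber(dlY,targets);

% No reduction returns one loss value per element of Y.
lossNone = huber(dlY,targets,'Reduction','none');
```

The unreduced form is useful when you want to inspect or post-process the per-element losses before aggregating them yourself.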

NormalizationFactor — Divisor for normalizing reduced loss

"batch-size" (default) | "all-elements" | "mask-included" | "none"

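Divisor for normalizing the reduced loss when Reduction is "sum", specified as one of the following:

"batch-size": Normalize the loss by dividing it by the number of observations in Y.
"all-elements": Normalize the loss by dividing it by the number of elements of Y.
"mask-included": Normalize the loss by dividing the loss values by the number of included elements specified by the mask for each observation independently. To use this option, you must specify a mask using the Mask option.
"none": Do not normalize the loss.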
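As a minimal sketch of how the normalization factor scales the result, the following computes the loss with three settings on random illustrative data. It assumes that "batch-size" divides the summed loss by the number of observations (12 here) and that "all-elements" divides it by the number of elements (120 here).

```matlab
% Formatted predictions and targets: 10 responses by 12 observations.
dlY = dlarray(rand(10,12),'CB');
targets = dlarray(rand(10,12),'CB');

% Summed loss without normalization.
lossRaw = huber(dlY,targets,'NormalizationFactor','none');

% Default normalization by the number of observations.
lossBatch = huber(dlY,targets);

% Normalization by the total number of elements.
lossAll = huber(dlY,targets,'NormalizationFactor','all-elements');

% Under the assumptions above, these ratios should be close to 12 and 120.
ratioBatch = extractdata(lossRaw)/extractdata(lossBatch);
ratioAll = extractdata(lossRaw)/extractdata(lossAll);
```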

DataFormat — Description of data dimensions

character vector | string scalar

Description of the data dimensions, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

"S": Spatial
"C": Channel
"B": Batch
"T": Time
"U": Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" once each, at most. The software ignores singleton trailing "U" dimensions after the second dimension.

If the input data is not a formatted dlarray object, then you must specify the DataFormat option.

For more information, see Deep Learning Data Formats.

Data Types: char | string

WeightsFormat — Description of dimensions of weights

character vector | string scalar

Description of the dimensions of the weights, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

"S": Spatial
"C": Channel
"B": Batch
"T": Time
"U": Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" once each, at most. The software ignores singleton trailing "U" dimensions after the second dimension.

If weights is a numeric vector and Y has two or more nonsingleton dimensions, then you must specify the WeightsFormat option.

If weights is not a vector, or weights and Y are both vectors, then the default value of WeightsFormat is the same as the format of Y.

For more information, see Deep Learning Data Formats.

Data Types: char | string
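A minimal sketch of observation weights supplied as a numeric row vector. Because weights is a vector and Y has two nonsingleton dimensions, the format is given explicitly. The format 'UB' (unspecified, batch) is an assumption chosen so that the second dimension of the 1-by-12 vector lines up with the batch dimension of Y.

```matlab
% Formatted predictions and targets: 10 responses by 12 observations.
dlY = dlarray(rand(10,12),'CB');
targets = dlarray(rand(10,12),'CB');

% One weight per observation (illustrative values), stored as a row vector.
weights = rand(1,12);

% Label the second dimension of the weights as batch.
loss = huber(dlY,targets,weights,'WeightsFormat','UB');
```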

Output Arguments

loss — Huber loss

dlarray

Huber loss, returned as an unformatted dlarray with the same underlying data type as the input Y.

The size of loss depends on the Reduction option.

Algorithms

Huber Loss

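For each element Y_j of the input, the huber function computes the corresponding element-wise loss values using the formula

$$
\text{loss}_j =
\begin{cases}
\frac{1}{2}\,(Y_j - T_j)^2 & \text{if } |Y_j - T_j| \le \delta \\
\delta\,|Y_j - T_j| - \frac{1}{2}\,\delta^2 & \text{otherwise,}
\end{cases}
$$

where T_j is the target value corresponding to the prediction Y_j and δ is the transition point where the loss transitions from a quadratic function to a linear function.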

When the transition point is 1, this is also known as smooth L1 loss.

To reduce the loss values to a scalar, the function then reduces the element-wise loss using the formula

$$
\text{loss} = \frac{1}{N} \sum_j m_j\, w_j\, \text{loss}_j,
$$

where N is the normalization factor, m_j is the mask value for element j, and w_j is the weight value for element j.

If you do not opt to reduce the loss, then the function applies the mask and the weights to the loss values directly:

$$
\text{loss}^*_j = m_j\, w_j\, \text{loss}_j.
$$
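The steps in this section can be sketched in plain MATLAB without the toolbox. This is an illustration under stated assumptions, not the toolbox implementation: it uses the standard Huber definition, multiplies each element-wise loss by its mask and weight values, and divides the sum by the number of observations (the "batch-size" choice). All variable names and values are hypothetical.

```matlab
% Illustrative inputs: 2 responses by 2 observations.
Y = [0.2 1.5; 3.0 0.1];        % predictions
T = zeros(2,2);                % targets
m = [1 1; 0 1];                % mask: exclude element (2,1)
w = ones(2,2);                 % weights
delta = 1;                     % transition point
N = size(Y,2);                 % normalization by number of observations (assumed)

% Element-wise Huber loss: quadratic up to delta, linear beyond it.
r = abs(Y - T);
elementLoss = 0.5*r.^2;
isLinear = r > delta;
elementLoss(isLinear) = delta*r(isLinear) - 0.5*delta^2;

% Unreduced loss: apply mask and weights directly.
unreducedLoss = m.*w.*elementLoss;

% Reduced loss: sum and divide by the normalization factor.
reducedLoss = sum(unreducedLoss(:))/N;
```

Here the element-wise losses are 0.02, 1.0, 2.5, and 0.005. The mask removes the 2.5 term, so the reduced loss is (0.02 + 1.0 + 0.005)/2 = 0.5125.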

Deep Learning Array Formats

Most deep learning networks and functions operate on different dimensions of the input data in different ways.

For example, an LSTM operation iterates over the time dimension of the input data, and a batch normalization operation normalizes over the batch dimension of the input data.

To provide input data with labeled dimensions or input data with additional layout information, you can use data formats.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

"S": Spatial
"C": Channel
"B": Batch
"T": Time
"U": Unspecified

For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).

To create formatted input data, create a dlarray object and specify the format using the second argument.

To provide additional layout information with unformatted data, specify the formats using the DataFormat and WeightsFormat arguments.

For more information, see Deep Learning Data Formats.

Extended Capabilities

GPU Arrays

Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

The huber function supports GPU array input with these usage notes and limitations:

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
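A minimal sketch of GPU usage with random illustrative data. It moves the data to the GPU only when one is available, so the same script also runs on the CPU. It assumes that a dlarray whose underlying data is a gpuArray causes the computation to run on the GPU.

```matlab
Y = rand(10,12);
targets = rand(10,12);

% Move the data to the GPU when a supported device is available.
if canUseGPU
    Y = gpuArray(Y);
    targets = gpuArray(targets);
end

% Wrap the data in formatted dlarray objects and compute the loss.
dlY = dlarray(Y,'CB');
targets = dlarray(targets,'CB');
loss = huber(dlY,targets);
```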

Version History

Introduced in R2021a