Use Automatic Differentiation In Deep Learning Toolbox - MATLAB & Simulink

Custom Training and Calculations Using Automatic Differentiation

Automatic differentiation makes it easier to create custom training loops, custom layers, and other deep learning customizations.

Generally, the simplest way to customize deep learning training is to create a dlnetwork object that contains the layers you want. Then perform training in a custom loop by using a variant of gradient descent, where the gradient is the gradient of the objective function. The objective function can be classification error, cross-entropy, or any other relevant scalar function of the network weights. See List of Functions with dlarray Support.
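For example, a network for a custom loop might be assembled as follows (a minimal sketch; the architecture and layer sizes are hypothetical, and Normalization="none" leaves input normalization to the training loop):

% Sketch: assemble a small classification network as a dlnetwork.
layers = [
    imageInputLayer([28 28 1],Normalization="none")
    convolution2dLayer(3,16,Padding="same")
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer];
net = dlnetwork(layers);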

This example is a high-level version of a custom training loop. Here, f is the objective function, such as loss, and g is the gradient of the objective function with respect to the weights in the network net. The update function represents some type of gradient descent.

% High-level training loop
n = 1;
while (n < nmax)
    [f,g] = dlfeval(@model,net,X,T);
    net = update(net,g);
    n = n + 1;
end

You call dlfeval to compute the numeric value of the objective and gradient. To enable the automatic computation of the gradient, the data X must be a dlarray.
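For instance, a batch of image data might be wrapped as follows (a sketch; the array sizes are hypothetical). The "SSCB" format string labels the dimensions as spatial, spatial, channel, and batch:

% Sketch: wrap input data in a formatted dlarray so that operations
% on it can be traced for automatic differentiation.
X = rand(28,28,1,64);  % hypothetical batch of 64 grayscale images
X = dlarray(X,"SSCB");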

The objective function has a dlgradient call to calculate the gradient. The dlgradient call must be inside the function that dlfeval evaluates.

function [f,g] = model(net,X,T)
    % Calculate objective using supported functions for dlarray
    Y = forward(net,X);
    f = fcnvalue(Y,T); % crossentropy or similar
    g = dlgradient(f,net.Learnables); % Automatic gradient
end
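As one concrete choice for the update step, the loop can call a built-in solver such as sgdmupdate (a sketch; the learning rate and momentum values are hypothetical):

% Sketch: the high-level loop with sgdmupdate as the update rule.
learnRate = 0.01; % hypothetical hyperparameters
momentum = 0.9;
vel = [];         % solver state, empty on the first iteration
n = 1;
while (n < nmax)
    [f,g] = dlfeval(@model,net,X,T);
    [net,vel] = sgdmupdate(net,g,vel,learnRate,momentum);
    n = n + 1;
end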

For an example using a dlnetwork with adlfeval-dlgradient-dlarray syntax and a custom training loop, see Train Network Using Custom Training Loop. For further details on custom training using automatic differentiation, see Define Custom Training Loops, Loss Functions, and Networks.

Use dlgradient and dlfeval Together for Automatic Differentiation

To use automatic differentiation, you must call dlgradient inside a function and evaluate the function using dlfeval. Represent the point where you take a derivative as a dlarray object, which manages the data structures and enables tracing of evaluation. For example, the Rosenbrock function is a common test function for optimization.

function [f,grad] = rosenbrock(x)
    f = 100*(x(2) - x(1).^2).^2 + (1 - x(1)).^2;
    grad = dlgradient(f,x);
end

Calculate the value and gradient of the Rosenbrock function at the point x0 = [-1,2]. To enable automatic differentiation in the Rosenbrock function, pass x0 as a dlarray.

x0 = dlarray([-1,2]);
[fval,gradval] = dlfeval(@rosenbrock,x0)

fval =

1x1 dlarray

104

gradval =

1x2 dlarray

396 200
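You can confirm these values by hand. The partial derivatives of the Rosenbrock function are df/dx1 = -400*x1*(x2 - x1^2) - 2*(1 - x1) and df/dx2 = 200*(x2 - x1^2). At x0 = [-1,2], the term x2 - x1^2 equals 1, so f = 100 + 4 = 104 and the gradient is [396 200], matching the dlfeval output. To work with the results as ordinary numeric arrays, extract the underlying data with extractdata:

fval = extractdata(fval);       % 104 as a regular double
gradval = extractdata(gradval); % [396 200]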

For an example using automatic differentiation, see Train Network Using Custom Training Loop.

Derivative Trace

To evaluate a gradient numerically, a dlarray constructs a data structure for reverse mode differentiation, as described in Automatic Differentiation Background. This data structure is the trace of the derivative computation. Keep these guidelines in mind when using automatic differentiation and the derivative trace:

* Call dlgradient only inside a function that dlfeval evaluates; the trace exists only during such an evaluation.
* Use only functions that support dlarray, as listed in List of Functions with dlarray Support; operations that do not support dlarray are not traced.
* Do not call extractdata on a traced argument inside the objective function, because extracting the underlying data breaks the trace.
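One way to see the trace requirement in action (a sketch): calling a function that contains dlgradient directly, without dlfeval, raises an error, because no trace exists for the input.

% Sketch: dlgradient requires the trace that dlfeval creates.
try
    [fval,gradval] = rosenbrock(dlarray([-1,2]));
catch err
    disp(err.message) % no trace exists outside dlfeval
end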

Characteristics of Automatic Derivatives

See Also

dlarray | dlgradient | dlfeval | dlnetwork | dljacobian | dldivergence | dllaplacian