NeuralODELayer - Neural ODE layer - MATLAB

Neural ODE layer

Since R2023b

Description

A neural ODE layer outputs the solution of an ODE.

Creation

Syntax

Description

`layer` = neuralODELayer(`net`,`tspan`) creates a neural ODE layer and sets the Network and TimeInterval properties.


`layer` = neuralODELayer(`net`,`tspan`,`Name=Value`) specifies additional properties using one or more name-value arguments.
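As a minimal sketch of the second syntax, the following code creates a layer and sets several properties at construction. The specific values (the layer name "neuralode", the gradient mode, and the tolerance) are illustrative choices, not recommendations from this page.

```matlab
% ODE network whose input and output sizes match (14-by-14-by-8).
inputSize = [14 14 8];
layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];
netODE = dlnetwork(layersODE);

% Set properties with name-value arguments (illustrative values).
layer = neuralODELayer(netODE,[0 1], ...
    Name="neuralode", ...
    GradientMode="adjoint", ...
    RelativeTolerance=1e-4);
```

Properties that you do not specify keep their default values.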


Properties


Network

Neural network characterizing the neural ODE function, specified as a dlnetwork object.

If Network has one input, then predict(net,Y) defines the ODE system, where net is the network. If Network has two inputs, then predict(net,T,Y) defines the ODE system, where T is a time step repeated over the batch dimension.

The size and format of the network inputs and outputs must match.

When GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".

TimeInterval

Interval of integration, specified as a numeric vector with two or more elements. The elements in TimeInterval must be all increasing or all decreasing.

The solver imposes the initial conditions given by Y0 at the initial time TimeInterval(1), then integrates the ODE function from TimeInterval(1) to TimeInterval(end).

If TimeInterval has two elements, [t0 tf], then the solver returns the solution evaluated at point tf.
If TimeInterval has more than two elements, [t0 t1 ... tf], then the solver returns the solution evaluated at the given points [t1 ... tf]. The solver does not step precisely to each point specified in TimeInterval. Instead, the solver uses its own internal steps to compute the solution, then evaluates the solution at the points specified in TimeInterval. The solutions produced at the specified points are of the same order of accuracy as the solutions computed at each internal step.
Specifying several intermediate points has little effect on the efficiency of computation, but for large systems it can negatively affect memory management.
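The effect of a multi-element TimeInterval can be sketched as follows. This example assumes the layer is placed in a small dlnetwork and receives "SSCB" data; the mini-batch size of 4, the random input, and the variable names are illustrative. Per the description above, the output should contain one slice along the time dimension for each of the points [t1 ... tf].

```matlab
% ODE network whose input and output sizes match.
inputSize = [14 14 8];
layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];
netODE = dlnetwork(layersODE);

% Request the solution at t = 0.5 and t = 1 (t0 = 0 is the initial time).
tspan = [0 0.5 1];
layers = [
    imageInputLayer(inputSize)
    neuralODELayer(netODE,tspan)];
net = dlnetwork(layers);

% Illustrative mini-batch of 4 random observations.
X = dlarray(rand([inputSize 4],"single"),"SSCB");
Y = predict(net,X);

fmt = dims(Y);              % format of the layer output
numTimePoints = size(Y,5);  % size of the trailing (time) dimension
```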

Solver

Since R2025a

Solver for neural ODE operation, specified as one of these values:

"ode45" — Solver for nonstiff differential equations using explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. The "ode45" solver is well suited for most tasks and can be faster and more accurate than other solvers.
"ode1" — Solver for nonstiff differential equations using Euler method. The "ode1" solver uses a fixed step size and can be better suited for code generation and Simulink tasks.

If Solver is "ode1", then the RelativeTolerance and AbsoluteTolerance properties have no effect.

If you specify the SolverOptions property, then the default value of Solver matches the solver of that options object. Otherwise, the default value is "ode45".
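The fixed-step Euler method that "ode1" is based on can be illustrated in plain MATLAB. This sketch is not the toolbox implementation; the test problem y' = -y, the number of steps, and the variable names are assumptions chosen for illustration.

```matlab
% Forward Euler on y' = -y, y(0) = 1 over [0, 1] with a fixed step.
f = @(t,y) -y;            % ODE right-hand side (illustrative test problem)
t0 = 0;
tf = 1;
numSteps = 100;
h = (tf - t0)/numSteps;   % fixed step size

y = 1;                    % initial condition
t = t0;
for k = 1:numSteps
    y = y + h*f(t,y);     % Euler update
    t = t + h;
end

% Compare with the exact solution exp(-t) at t = 1.
err = abs(y - exp(-1));
```

Because the step size is fixed, the method forms no error estimate, which is consistent with the tolerance properties having no effect when Solver is "ode1".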

SolverOptions

Since R2025a

Solver options object, specified as a deep.ode.options.ODE45 or a deep.ode.options.ODE1 object.

To set the solver options of the layer, use dot notation. For example, to set the relative tolerance to 1e-4, use layer.SolverOptions.RelativeTolerance = 1e-4, where layer is an instance of the neural ODE layer.

In most cases, you do not need to create the options object directly. If you specify the SolverOptions property, then you must not specify the RelativeTolerance, AbsoluteTolerance, and GradientMode properties.

To see which options the "ode45" and "ode1" solvers support, see deep.ode.options.ODE45 and deep.ode.options.ODE1, respectively.

GradientMode

Method to compute gradients with respect to the initial conditions and parameters when using the dlgradient function, specified as one of these values:

"direct" — Compute gradients by backpropagating through the operations undertaken by the numerical solver. This option best suits large mini-batch sizes or when TimeInterval contains many values.
"adjoint" — Compute gradients by solving the associated adjoint ODE system. This option best suits small mini-batch sizes or when TimeInterval contains a small number of values.

Tip

To customize the neural ODE solver options, use the Solver and SolverOptions properties. These properties are recommended because they provide additional control over the neural ODE solver.

When GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".

The dlaccelerate function does not support accelerating forward passes of networks with neural ODE layers with the GradientMode property set to "direct". To accelerate forward passes of networks with neural ODE layers, set the GradientMode property to "adjoint" or "adjoint-seminorm", or accelerate parts of your code that do not perform forward passes of the network.

Warning

When GradientMode is "adjoint", all layers in Network must support acceleration. Otherwise, the software can return unexpected results.

When GradientMode is "adjoint", the software traces the ODE function input to determine the computation graph used for automatic differentiation. This tracing process can take some time and can end up recomputing the same trace. By optimizing, caching, and reusing the traces, the software can speed up the gradient computation.

For more information on deep learning function acceleration, see Deep Learning Function Acceleration for Custom Training Loops.
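The following sketch combines the points above: it sets GradientMode to "adjoint", accelerates a loss function with dlaccelerate, and computes gradients with dlgradient. The function name modelLoss, the toy loss, and the random input are hypothetical and serve only to make the example self-contained.

```matlab
% ODE network whose layers (convolution and tanh) have no state.
inputSize = [14 14 8];
layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];
netODE = dlnetwork(layersODE);

% Network containing a neural ODE layer that uses the adjoint method.
layers = [
    imageInputLayer(inputSize)
    neuralODELayer(netODE,[0 1],GradientMode="adjoint")];
net = dlnetwork(layers);

% Illustrative mini-batch of 4 random observations.
X = dlarray(rand([inputSize 4],"single"),"SSCB");

% Accelerate the loss function, then evaluate it with dlfeval so that
% dlgradient can use automatic differentiation.
accfun = dlaccelerate(@modelLoss);
[loss,gradients] = dlfeval(accfun,net,X);

function [loss,gradients] = modelLoss(net,X)
    % Toy loss (illustrative only): mean squared value of the output.
    Y = stripdims(forward(net,X));
    loss = sum(Y.^2,"all")/numel(Y);
    gradients = dlgradient(loss,net.Learnables);
end
```

The gradients table has one row per learnable parameter of the network, here the weights and bias of the convolution layer inside the ODE network.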

The NeuralODELayer object stores this property as a character vector.

RelativeTolerance

Relative error tolerance, specified as a positive scalar. This tolerance measures the error relative to the magnitude of each solution component. Roughly speaking, it controls the number of correct digits in all solution components, except those smaller than the absolute tolerance AbsoluteTolerance.

At each step, the ODE solver estimates the local error e in the ith component of the solution. To be successful, the step must have acceptable error, as determined by both the relative and absolute error tolerances:

|e(i)| <= max(RelativeTolerance*abs(y(i)),AbsoluteTolerance(i))
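The acceptance test can be evaluated directly in MATLAB. In this sketch the solution values y and the error estimates e are made-up numbers chosen to show one component failing the test; only the inequality itself comes from the description above.

```matlab
RelativeTolerance = 1e-3;
AbsoluteTolerance = 1e-6;      % scalar, so it applies to every component

y = [2; 1e-5; 1e-9];           % illustrative solution components
e = [1e-3; 5e-8; 5e-6];        % illustrative local error estimates

% Per-component error threshold: the relative term dominates for large
% components, the absolute term for components near zero.
threshold = max(RelativeTolerance*abs(y),AbsoluteTolerance);

accepted = abs(e) <= threshold;   % component-wise test
stepAccepted = all(accepted);     % every component must pass
```

Here the thresholds are 2e-3, 1e-6, and 1e-6, so the third component fails the test and the step as a whole is not acceptable.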

If Solver is "ode1", then the RelativeTolerance property has no effect.

Tip

To customize the neural ODE solver options, use the Solver and SolverOptions properties. These properties are recommended because they provide additional control over the neural ODE solver.

Data Types: single | double

AbsoluteTolerance

Absolute error tolerance, specified as a positive scalar or vector. This tolerance is a threshold below which the value of the solution becomes unimportant. If the solution |y| is smaller than AbsoluteTolerance, then the solver does not need to obtain any correct digits in |y|. For this reason, the value of AbsoluteTolerance should take into account the scale of the solution components.

If AbsoluteTolerance is a vector, then it must be the same length as the solution. If AbsoluteTolerance is a scalar, then the value applies to all solution components.

At each step, the estimated local error e in the ith component of the solution must satisfy:

|e(i)| <= max(RelativeTolerance*abs(y(i)),AbsoluteTolerance(i))

If Solver is "ode1", then the AbsoluteTolerance property has no effect.

Tip

To customize the neural ODE solver options, use the Solver and SolverOptions properties. These properties are recommended because they provide additional control over the neural ODE solver.

Data Types: single | double

Examples


Create a neural ODE layer. Specify an ODE network containing a convolution layer followed by a tanh layer. Specify a time interval of [0, 1].

inputSize = [14 14 8];

layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];

netODE = dlnetwork(layersODE);

tspan = [0 1];
layer = neuralODELayer(netODE,tspan)

layer = NeuralODELayer with properties:

             Name: ''
     TimeInterval: [0 1]

   Learnable Parameters
              Network: [1×1 dlnetwork]

   Solver properties
         GradientMode: 'direct'
    RelativeTolerance: 1.0000e-03
    AbsoluteTolerance: 1.0000e-06
               Solver: ode45

Create a neural network containing a neural ODE layer.

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer([3 3],8,Padding="same",Stride=2)
    reluLayer
    neuralODELayer(netODE,tspan)
    fullyConnectedLayer(10)
    softmaxLayer];

net = dlnetwork(layers)

net = dlnetwork with properties:

     Layers: [6×1 nnet.cnn.layer.Layer]
Connections: [5×2 table]
 Learnables: [6×3 table]
      State: [0×3 table]
 InputNames: {'imageinput'}
OutputNames: {'softmax'}
Initialized: 1

View summary with summary.

Since R2025a

Create a neural ODE layer. Specify an ODE network containing a convolution layer followed by a tanh layer. Specify a time interval of [0, 1].

inputSize = [14 14 8];

layersODE = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,Padding="same")
    tanhLayer];

netODE = dlnetwork(layersODE);

tspan = [0 1];
layer = neuralODELayer(netODE,tspan)

layer = NeuralODELayer with properties:

             Name: ''
     TimeInterval: [0 1]

   Learnable Parameters
              Network: [1×1 dlnetwork]

   Solver properties
         GradientMode: 'direct'
    RelativeTolerance: 1.0000e-03
    AbsoluteTolerance: 1.0000e-06
               Solver: ode45

Specify an initial step size of 1e-3 and a maximum step size of 1e-2.

layer.SolverOptions.InitialStep = 1e-3;
layer.SolverOptions.MaxStep = 1e-2;

View the solver options.

layer.SolverOptions

ans = ODE45 with properties:

      InitialStep: 1.0000e-03
          MaxStep: 0.0100
RelativeTolerance: 1.0000e-03
AbsoluteTolerance: 1.0000e-06
     GradientMode: 'direct'

Tips

To apply the neural ODE operation in deep learning models defined as functions or in custom layer functions, use dlode45.
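As a minimal sketch of the function-based workflow, the following code calls dlode45 with a function handle instead of a network. The ODE y' = -k*y, the parameter structure theta, and the initial condition are hypothetical and chosen because the exact solution is known.

```matlab
% Learnable parameter of a hypothetical ODE function y' = -k*y.
theta = struct;
theta.k = dlarray(single(1));

% ODE function with the signature odefun(t,y,theta).
odefun = @(t,y,theta) -theta.k .* y;

% Unformatted initial condition: 2 channels, 1 observation.
Y0 = dlarray(single([1; 2]));

% Integrate from t = 0 to t = 1 and return the solution at t = 1.
Y = dlode45(odefun,[0 1],Y0,theta,DataFormat="CB");

% For this linear test problem, the exact solution at t = 1 is Y0*exp(-1).
err = max(abs(extractdata(Y) - [1; 2]*exp(-1)));
```

In a custom training loop, a call like this would typically sit inside a model function that is evaluated with dlfeval.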

Algorithms


The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE. In particular, given an input, a neural ODE operation outputs the numerical solution of the ODE y′=f(t,y,θ) for the time horizon (t0,t1) and with the initial condition y(t0) = y0, where t and y denote the ODE function inputs and θ is a set of learnable parameters. Typically, the initial condition y0 is either the network input or the output of another deep learning operation.
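Equivalently, and under the usual assumption that f is regular enough for a unique solution to exist, the quantity that the operation approximates at the final time can be written in integral form, using the same symbols as above:

```latex
y(t_1) = y_0 + \int_{t_0}^{t_1} f\bigl(t, y(t), \theta\bigr)\, dt
```

The numerical solver approximates this integral step by step, and training adjusts θ so that the resulting output suits the task.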

When Solver is "ode45" (the default), NeuralODELayer applies the operation using the ode45 function, which is based on an explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. It is a single-step solver: in computing y(tn), it needs only the solution at the immediately preceding time point, y(tn-1) [2] [3].
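The single-step structure can be illustrated by writing out one step of the Dormand-Prince (4,5) pair in plain MATLAB. This is a sketch of the textbook formula only, not the ode45 implementation, which also adapts the step size and interpolates between steps. The test problem y' = -y and the step size h are assumptions chosen for illustration.

```matlab
% One Dormand-Prince (4,5) step for y' = f(t,y); illustrative only.
f = @(t,y) -y;      % test problem with exact solution exp(-t)
t = 0;
y = 1;              % solution at the preceding time point
h = 0.1;            % step size (fixed here for illustration)

% Stage derivatives from the Dormand-Prince Butcher tableau.
k1 = f(t,y);
k2 = f(t + h/5,    y + h*(k1/5));
k3 = f(t + 3*h/10, y + h*(3*k1/40 + 9*k2/40));
k4 = f(t + 4*h/5,  y + h*(44*k1/45 - 56*k2/15 + 32*k3/9));
k5 = f(t + 8*h/9,  y + h*(19372*k1/6561 - 25360*k2/2187 ...
                        + 64448*k3/6561 - 212*k4/729));
k6 = f(t + h,      y + h*(9017*k1/3168 - 355*k2/33 + 46732*k3/5247 ...
                        + 49*k4/176 - 5103*k5/18656));

% Fifth-order solution, used to advance the step.
y5 = y + h*(35*k1/384 + 500*k3/1113 + 125*k4/192 ...
          - 2187*k5/6784 + 11*k6/84);
k7 = f(t + h,y5);

% Embedded fourth-order solution, used only for the error estimate.
y4 = y + h*(5179*k1/57600 + 7571*k3/16695 + 393*k4/640 ...
          - 92097*k5/339200 + 187*k6/2100 + k7/40);

e = abs(y5 - y4);               % local error estimate for this step
trueErr = abs(y5 - exp(-h));    % actual error for this test problem
```

The step uses only the value y at the preceding time point, and the difference between the two solutions gives the local error estimate that an adaptive solver compares against the tolerances.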

Layers in a layer array or layer graph pass data to subsequent layers as formatted dlarray objects. The format of a dlarray object is a string of characters in which each character describes the corresponding dimension of the data. The format consists of one or more of these characters:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

For example, you can describe 2-D image data that is represented as a 4-D array, where the first two dimensions correspond to the spatial dimensions of the images, the third dimension corresponds to the channels of the images, and the fourth dimension corresponds to the batch dimension, as having the format "SSCB" (spatial, spatial, channel, batch).
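A short sketch of this labeling follows. The array sizes (16 RGB images of size 28-by-28) and the random data are illustrative.

```matlab
% 16 RGB images of size 28-by-28, labeled as spatial, spatial, channel, batch.
X = dlarray(rand(28,28,3,16,"single"),"SSCB");

fmt = dims(X);   % format characters, one per dimension
sz = size(X);    % size of the underlying array
```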

You can interact with these dlarray objects in automatic differentiation workflows, such as those for developing a custom layer, using a functionLayer object, or using the forward and predict functions with dlnetwork objects.

This table shows the supported input formats of NeuralODELayer objects and the corresponding output format. If the software passes the output of the layer to a custom layer that does not inherit from the nnet.layer.Formattable class, or a FunctionLayer object with the Formattable property set to 0 (false), then the layer receives an unformatted dlarray object with dimensions ordered according to the formats in this table. The formats listed here are only a subset. The layer may support additional formats such as formats with additional "S" (spatial) or "U" (unspecified) dimensions.

If TimeInterval contains more than two elements, then the layer outputs data with a "T" (time) dimension.

Input Format	TimeInterval	Output Format
"CB" (channel, batch)	[t0 tf]	"CB" (channel, batch)
"CB" (channel, batch)	[t0 t1 ... tf]	"CBT" (channel, batch, time)
"SCB" (spatial, channel, batch)	[t0 tf]	"SCB" (spatial, channel, batch)
"SCB" (spatial, channel, batch)	[t0 t1 ... tf]	"SCBT" (spatial, channel, batch, time)
"SSCB" (spatial, spatial, channel, batch)	[t0 tf]	"SSCB" (spatial, spatial, channel, batch)
"SSCB" (spatial, spatial, channel, batch)	[t0 t1 ... tf]	"SSCBT" (spatial, spatial, channel, batch, time)
"SSSCB" (spatial, spatial, spatial, channel, batch)	[t0 tf]	"SSSCB" (spatial, spatial, spatial, channel, batch)
"SSSCB" (spatial, spatial, spatial, channel, batch)	[t0 t1 ... tf]	"SSSCBT" (spatial, spatial, spatial, channel, batch, time)
"SC" (spatial, channel)	[t0 tf]	"SC" (spatial, channel)
"SC" (spatial, channel)	[t0 t1 ... tf]	"SCT" (spatial, channel, time)
"SSC" (spatial, spatial, channel)	[t0 tf]	"SSC" (spatial, spatial, channel)
"SSC" (spatial, spatial, channel)	[t0 t1 ... tf]	"SSCT" (spatial, spatial, channel, time)
"SSSC" (spatial, spatial, spatial, channel)	[t0 tf]	"SSSC" (spatial, spatial, spatial, channel)
"SSSC" (spatial, spatial, spatial, channel)	[t0 t1 ... tf]	"SSSCT" (spatial, spatial, spatial, channel, time)
"SB" (spatial, batch)	[t0 tf]	"SB" (spatial, batch)
"SB" (spatial, batch)	[t0 t1 ... tf]	"SBT" (spatial, batch, time)
"SSB" (spatial, spatial, batch)	[t0 tf]	"SSB" (spatial, spatial, batch)
"SSB" (spatial, spatial, batch)	[t0 t1 ... tf]	"SSBT" (spatial, spatial, batch, time)
"SSSB" (spatial, spatial, spatial, batch)	[t0 tf]	"SSSB" (spatial, spatial, spatial, batch)
"SSSB" (spatial, spatial, spatial, batch)	[t0 t1 ... tf]	"SSSBT" (spatial, spatial, spatial, batch, time)
"SS" (spatial, spatial)	[t0 tf]	"SS" (spatial, spatial)
"SS" (spatial, spatial)	[t0 t1 ... tf]	"SST" (spatial, spatial, time)
"SSS" (spatial, spatial, spatial)	[t0 tf]	"SSS" (spatial, spatial, spatial)
"SSS" (spatial, spatial, spatial)	[t0 t1 ... tf]	"SSST" (spatial, spatial, spatial, time)

Version History

Introduced in R2023b


Specify the solver using the Solver property. To further customize the "ode45" or "ode1" solvers, use the SolverOptions argument.